Parallel divergence and speciation provide evidence for the role of divergent selection in generating biological diversity. Recent studies indicate that parallel phenotypic divergence may not have the same genetic basis in different geographical locations: “outlier loci” (loci potentially affected by divergent selection) are often not shared among parallel instances of phenotypic divergence. However, limited sharing may be due, in part, to technical issues if false positive outliers occur. Here, we test this idea in the marine snail Littorina saxatilis, which has independently evolved two partly isolated ecotypes (adapted to crab predation vs. wave action) in multiple locations. We argue that if the low extent of sharing observed in earlier studies in this system is due to sampling effects, we expect outliers not to show elevated FST when sequenced in new samples from the original locations, and also not to follow predictable geographical patterns of elevated FST. Following a hierarchical sampling design (within vs. between country), we applied capture sequencing, targeting outliers from earlier studies and control loci. We found that outliers again showed elevated levels of FST in their original location, suggesting they were not generated by sampling effects. Outliers were also likely to show increased FST in geographically close locations, which may be explained by higher levels of gene flow or shared ancestral genetic variation compared to more distant locations. However, in contrast to earlier findings, we also found that some outlier types showed elevated FST in geographically distant locations. We discuss possible explanations for this unexpected result.
Raw capture sequencing reads; Location: ANG; Ecotype: C
Raw capture sequencing reads for Littorina saxatilis from the original Swedish location (ANG), "crab" ecotype. Each fastq file represents reads from one individual.
ANG_C_reads.tar.gz
Raw capture sequencing reads; Location: S; Ecotype: C
Raw capture sequencing reads for Littorina saxatilis from the original Spanish location (S), "crab" ecotype. Each fastq file represents reads from one individual.
S_C_reads.tar.gz
Raw capture sequencing reads; Location: S; Ecotype: W
Raw capture sequencing reads for Littorina saxatilis from the original Spanish location (S), "wave" ecotype. Each fastq file represents reads from one individual.
S_W_reads.tar.gz
Raw capture sequencing reads; Location: ANG; Ecotype: W
Raw capture sequencing reads for Littorina saxatilis from the original Swedish location (ANG), "wave" ecotype. Each fastq file represents reads from one individual.
ANG_W_reads.tar.gz
Raw capture sequencing reads; Location: B; Ecotype: C
Raw capture sequencing reads for Littorina saxatilis from the new Spanish location (B), "crab" ecotype. Each fastq file represents reads from one individual.
B_C_reads.tar.gz
Raw capture sequencing reads; Location: B; Ecotype: W
Raw capture sequencing reads for Littorina saxatilis from the new Spanish location (B), "wave" ecotype. Each fastq file represents reads from one individual.
B_W_reads.tar.gz
Raw capture sequencing reads; Location: OCK; Ecotype: C
Raw capture sequencing reads for Littorina saxatilis from the new Swedish location (OCK), "crab" ecotype. Each fastq file represents reads from one individual.
OCK_C_reads.tar.gz
Raw capture sequencing reads; Location: OCK; Ecotype: W
Raw capture sequencing reads for Littorina saxatilis from the new Swedish location (OCK), "wave" ecotype. Each fastq file represents reads from one individual.
OCK_W_reads.tar.gz
Raw capture sequencing reads; Location: T; Ecotype: C
Raw capture sequencing reads for Littorina saxatilis from the original United Kingdom location (T), "crab" ecotype. Each fastq file represents reads from one individual.
T_C_reads.tar.gz
Raw capture sequencing reads; Location: T; Ecotype: W
Raw capture sequencing reads for Littorina saxatilis from the original United Kingdom location (T), "wave" ecotype. Each fastq file represents reads from one individual.
T_W_reads.tar.gz
Raw capture sequencing reads; Location: W; Ecotype: C
Raw capture sequencing reads for Littorina saxatilis from the new United Kingdom location (W), "crab" ecotype. Each fastq file represents reads from one individual.
W_C_reads.tar.gz
Raw capture sequencing reads; Location: W; Ecotype: W
Raw capture sequencing reads for Littorina saxatilis from the new United Kingdom location (W), "wave" ecotype. Each fastq file represents reads from one individual.
W_W_reads.tar.gz
Per-locus alignments
Alignments for 253 loci targeted by capture sequencing. Each file represents one locus, and can consist of alignments for a single or for multiple concatenated reference genome contigs (see manuscript). The files are named by these reference genome contigs. More details about each locus (ID in the original study; type of original study; outlier status) can be found in probes.csv. Each individual in the alignment has two pseudo-haplotypes (H1 and H2). Naming of individuals is in accordance with read files and Figure 1 in the manuscript.
alignments.tar.gz
List of capture sequencing probes
Table of 120 bp probes used for capture sequencing. ref_contig = ID of targeted contig in the L. saxatilis reference genome. start, end = start and end of the probe sequence in the reference contig. study = study in which the targeted locus was first identified (EXP = gene expression study; RNA = RNAseq study; RAD = RADseq study; LSD = Littorina sequence database; for references please see the main text, e.g. Table 1). type = outlier vs. control locus. SP, SW, UK = locus was an outlier (1) or non-outlier (0) in earlier work in Spain / Sweden / UK, respectively. probe = probe ID. original_locus = ID of the targeted locus in the original work.
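As an illustration of how the probe table might be used, the following minimal Python sketch groups probes by original_locus and reports the countries in which a locus was an outlier in earlier work. The column names follow the description above; the comma delimiter, the column order, the "outlier" / "control" spellings and the example rows are assumptions made for illustration and are not taken from probes.csv.

```python
import csv
import io
from collections import defaultdict


def probes_per_locus(handle):
    """Group probe rows by the ID of the locus they target."""
    loci = defaultdict(list)
    for row in csv.DictReader(handle):
        loci[row["original_locus"]].append(row)
    return loci


def outlier_countries(rows):
    """Return the countries (SP, SW, UK) in which a locus is coded as outlier (1)."""
    first = rows[0]  # all probes of one locus are assumed to share these flags
    return [country for country in ("SP", "SW", "UK") if first[country] == "1"]


# Invented example rows (hypothetical contigs, probes and loci).
example = io.StringIO(
    "ref_contig,start,end,study,type,SP,SW,UK,probe,original_locus\n"
    "Contig1,100,219,RAD,outlier,1,0,0,p1,locA\n"
    "Contig1,300,419,RAD,outlier,1,0,0,p2,locA\n"
    "Contig7,50,169,RNA,control,0,0,0,p3,locB\n"
)

loci = probes_per_locus(example)
for locus, rows in loci.items():
    print(locus, len(rows), "probes; outlier in:", outlier_countries(rows))
```

For the real file, the StringIO object would be replaced by an open file handle, e.g. open("probes.csv", newline="").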
probes.csv
calculate_dxy
Python script using the EggLib library to calculate dxy between ecotypes for multiple loci and locations. Needs one fasta file per locus (containing aligned sequences from both ecotypes from multiple locations) and a text file with the names of the loci of interest.
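To make the quantity concrete, here is a minimal plain-Python sketch of dxy, the mean proportion of differing sites across all pairs of sequences taken one from each ecotype. It is not the deposited script and does not use EggLib; the treatment of missing data (sites with a gap, N or ? in either sequence are skipped per pair) and the example sequences are assumptions.

```python
from itertools import product

# ASSUMPTION: these characters mark missing data in the alignment.
MISSING = set("-N?")


def pairwise_diff(seq_a, seq_b):
    """Proportion of differing sites among sites present in both sequences."""
    diffs = 0
    compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in MISSING or b in MISSING:
            continue
        compared += 1
        if a != b:
            diffs += 1
    return diffs / compared if compared else None


def dxy(pop_a, pop_b):
    """Mean pairwise difference over all between-population sequence pairs."""
    values = [pairwise_diff(a, b) for a, b in product(pop_a, pop_b)]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None


# Invented pseudo-haplotypes for one locus in one location.
crab = ["ACGTACGT", "ACGTACGA"]
wave = ["ACGTTCGT", "ACGTTCGA"]
print(dxy(crab, wave))
```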
calculate_pi
Python script using the EggLib library to calculate Pi for multiple loci and populations (one population = one ecotype within a location). Needs one fasta file per locus (containing aligned sequences from both ecotypes from multiple locations) and a text file with the names of the loci of interest.
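The following plain-Python sketch shows one way Pi could be computed per population from a per-locus fasta alignment: sequences are grouped by population and the proportion of differing sites is averaged over all within-population pairs. It is not the deposited script and does not use EggLib. The sequence names (LOCATION_ECOTYPE_individual_haplotype, e.g. ANG_C_1_H1) are hypothetical; the real naming scheme should be checked against the alignment files.

```python
from collections import defaultdict
from itertools import combinations

# ASSUMPTION: these characters mark missing data in the alignment.
MISSING = set("-N?")


def read_fasta(text):
    """Parse fasta-formatted text into a {name: sequence} dictionary."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def prop_diff(seq_a, seq_b):
    """Proportion of differing sites among sites present in both sequences."""
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a not in MISSING and b not in MISSING]
    if not pairs:
        return None
    return sum(a != b for a, b in pairs) / len(pairs)


def pi_per_population(records):
    """Mean within-population pairwise difference, keyed by (location, ecotype)."""
    pops = defaultdict(list)
    for name, seq in records.items():
        # ASSUMPTION: names start with LOCATION_ECOTYPE (hypothetical scheme).
        location, ecotype = name.split("_")[:2]
        pops[(location, ecotype)].append(seq)
    result = {}
    for pop, seqs in pops.items():
        values = [prop_diff(a, b) for a, b in combinations(seqs, 2)]
        values = [v for v in values if v is not None]
        result[pop] = sum(values) / len(values) if values else None
    return result


# Invented alignment with two pseudo-haplotypes per individual.
example_fasta = """>ANG_C_1_H1
ACGTACGT
>ANG_C_1_H2
ACGTACGA
>ANG_W_1_H1
ACGTTCGT
>ANG_W_1_H2
ACGTTCGT
"""

pi_values = pi_per_population(read_fasta(example_fasta))
print(pi_values)
```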
calculate_Fst
R script to calculate Fst per locus and location from the output of calculate_pi.py and calculate_dxy.py.
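The description does not state which Fst estimator the R script applies. As an illustration only, the Python sketch below uses one common sequence-based form, Fst = (dxy - mean Pi) / dxy, where mean Pi is the average of the two ecotypes' Pi values within a location. The choice of estimator, the function name and the example numbers are assumptions and may differ from the deposited script.

```python
def fst_from_pi_dxy(pi_crab, pi_wave, dxy):
    """Fst as (dxy - mean within-ecotype Pi) / dxy.

    ASSUMPTION: this estimator is chosen for illustration; it is not
    confirmed to be the one used in the deposited R script.
    Returns None when dxy is missing or zero (Fst is undefined there).
    """
    if dxy is None or dxy == 0:
        return None
    pi_within = (pi_crab + pi_wave) / 2
    return (dxy - pi_within) / dxy


# Invented per-locus values for one location: (Pi crab, Pi wave, dxy).
example = {
    "locA": (0.01, 0.02, 0.03),
    "locB": (0.02, 0.02, 0.02),
    "locC": (0.0, 0.0, 0.0),
}

fst = {locus: fst_from_pi_dxy(*values) for locus, values in example.items()}
print(fst)
```

With this form, a locus with no between-ecotype divergence beyond within-ecotype diversity (locB) gives zero, and a locus without any variation (locC) is left undefined rather than set to zero.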
likelihood_analysis
R script to calculate probabilities that outliers fall above the 80% quantile of the control Fst distribution in different locations, and to obtain log likelihoods and AICs for different models. Needs "quantile_table.txt" (a table of all loci indicating for each location whether they fall above the 80% quantile or not) as input.
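To illustrate the kind of calculation described here, the Python sketch below compares two hypothetical binomial models for TRUE / FALSE flags such as those in quantile_table.txt: one with a single probability shared by all locations, and one with a separate probability per location. These two models, the function names and the example flags are assumptions for illustration; the models actually compared are those defined in the manuscript and the R script.

```python
import math


def binomial_loglik(flags, p):
    """Log likelihood of TRUE/FALSE flags given probability p of TRUE."""
    k = sum(flags)       # loci above the 80% quantile
    n = len(flags)
    loglik = 0.0
    if k:                # skip terms with zero count to avoid log(0)
        loglik += k * math.log(p)
    if n - k:
        loglik += (n - k) * math.log(1 - p)
    return loglik


def aic(loglik, n_params):
    """Akaike information criterion: 2 * parameters - 2 * log likelihood."""
    return 2 * n_params - 2 * loglik


def compare_models(flags_by_location):
    """Fit a shared-probability and a per-location model by maximum likelihood."""
    all_flags = [f for flags in flags_by_location.values() for f in flags]
    p_shared = sum(all_flags) / len(all_flags)
    ll_shared = sum(binomial_loglik(flags, p_shared)
                    for flags in flags_by_location.values())
    ll_separate = sum(binomial_loglik(flags, sum(flags) / len(flags))
                      for flags in flags_by_location.values())
    return {
        "shared": {"loglik": ll_shared, "AIC": aic(ll_shared, 1)},
        "per_location": {"loglik": ll_separate,
                         "AIC": aic(ll_separate, len(flags_by_location))},
    }


# Invented flags for outlier loci in two locations.
example_flags = {
    "S": [True, True, True, False],
    "T": [True, False, False, False],
}

models = compare_models(example_flags)
for name, fit in models.items():
    print(name, fit)
```

The model with the lower AIC is preferred; the per-location model can never have a lower log likelihood than the shared model, so the AIC penalty on its extra parameters is what makes the comparison informative.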
quantile_table
Table indicating whether targeted loci showed an Fst estimate above the 80% quantile of the control distribution in the current study. locus = ID of outlier / control locus in original study. contig = location in the current L. saxatilis reference genome. SP / SW / UK = indication whether the locus was an outlier in Spain / Sweden / UK in the original work (0 = non-outlier; 1 = outlier). type = control vs. outlier locus in the original study. S / B / ANG / OCK / T / W = indication whether the locus fell above the 80% quantile of the control Fst distribution in the current work in the six studied locations (TRUE = above, FALSE = below).
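The TRUE / FALSE columns can be thought of as the result of comparing each locus's Fst with the 80% quantile of the control loci's Fst in the same location. A minimal Python sketch of that comparison is given below; the quantile rule (linear interpolation between order statistics), the strict "greater than" comparison and the example values are assumptions and are not taken from the deposited scripts.

```python
import math


def quantile(values, q):
    """Quantile by linear interpolation between sorted values (0 <= q <= 1)."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q
    lower = math.floor(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def flag_above_quantile(fst_by_locus, control_fst, q=0.8):
    """Flag loci whose Fst exceeds the q quantile of the control Fst values."""
    threshold = quantile(control_fst, q)
    flags = {locus: fst > threshold for locus, fst in fst_by_locus.items()}
    return threshold, flags


# Invented Fst values for one location.
control_fst = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
outlier_fst = {"locA": 0.45, "locB": 0.10}

threshold, flags = flag_above_quantile(outlier_fst, control_fst)
print(threshold, flags)
```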