
Data from: Shared ancestral polymorphism and chromosomal rearrangements as potential drivers of local adaptation in a marine fish

Cite this dataset

Cayuela, Hugo; Rougemont, Quentin; Bernatchez, Louis (2020). Data from: Shared ancestral polymorphism and chromosomal rearrangements as potential drivers of local adaptation in a marine fish [Dataset]. Dryad.


Gene flow strongly affects local adaptation by influencing the fate of de novo mutations, maintaining standing genetic variation, and driving adaptive introgression. Furthermore, structural variation such as chromosomal rearrangements may facilitate adaptation despite high gene flow. However, our understanding of the evolutionary mechanisms impeding or favoring local adaptation in the presence of gene flow is still limited to a restricted number of study systems. In this study, we examined how demographic history, shared ancestral polymorphism, and gene flow among glacial lineages contribute to local adaptation to sea conditions in a marine fish, the capelin (Mallotus villosus). We first assembled a 490 Mbp draft genome of M. villosus to map our RAD sequence reads. Then, we used a large dataset of genome-wide single nucleotide polymorphisms (25,904 filtered SNPs) genotyped in 1,310 individuals collected from 31 spawning sites in the northwest Atlantic. We reconstructed the history of divergence among three glacial lineages and showed that they likely diverged between 3.8 and 1.8 MyA and experienced secondary contacts. Within each lineage, our analyses provided evidence for large effective population sizes (Ne) and high gene flow among spawning sites. Within the NWA lineage, we detected a polymorphic chromosomal rearrangement leading to the occurrence of three haplogroups. Genotype-environment associations revealed molecular signatures of local adaptation to environmental conditions prevailing at spawning sites. Our study also suggests that both shared polymorphism among lineages, resulting from standing genetic variation or introgression, and chromosomal rearrangements may contribute to local adaptation in the presence of high gene flow.


We provide the filtered VCF file of GBS data and the environmental variables (bottom temperature and chlorophyll) used in our study. Below, we describe the sampling area and the molecular analyses.
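As an illustration of how the filtered VCF might be read, the following minimal Python sketch counts alternate alleles per SNP using only the standard library. It is not part of the deposited material. It assumes biallelic records with a GT entry in the FORMAT column, and the file name in the usage note is hypothetical.

```python
def alt_allele_counts(vcf_lines):
    """Yield (chrom, pos, alt_count, called_alleles) for each VCF record.

    Assumes biallelic SNPs whose FORMAT column contains a GT entry.
    Missing alleles (".") are skipped and do not count as called.
    """
    for line in vcf_lines:
        if line.startswith("#"):  # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos = fields[0], int(fields[1])
        gt_index = fields[8].split(":").index("GT")  # column 9 is FORMAT
        alt = called = 0
        for sample in fields[9:]:
            genotype = sample.split(":")[gt_index].replace("|", "/")
            for allele in genotype.split("/"):
                if allele == ".":
                    continue
                called += 1
                alt += allele == "1"
        yield chrom, pos, alt, called
```

With an uncompressed file, usage could look like `with open("capelin_filtered.vcf") as fh: counts = list(alt_allele_counts(fh))`, where the file name is a placeholder for the deposited file. Dividing `alt_count` by `called_alleles` gives the alternate allele frequency at each site.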

A total of 1,359 capelin were sampled from 31 sites in the Northwest Atlantic, in both Canadian and Greenland waters, which were expected to include representatives of the three lineages according to a previous study using mitochondrial DNA and microsatellite genetic markers (Dodson et al. 2007). We sampled 25 spawning sites within the presumed NWA lineage including three demersal shallow-water sites, three demersal deep-water sites, and 18 beach-spawning sites. These sites were located in four geographic regions: Kuururjuaq, Labrador, Newfoundland, and St. Lawrence. In parallel, we sampled two sites (beach-spawning fish) within the presumed range of the ARC lineage and four offshore sampling sites in the range of the GRE lineage. The median sample size was 46.5 (range: 19 to 50) individuals per site. Fish were collected during the breeding period (captured by hand at beach-spawning sites and using nets at demersal spawning sites) and sexed, and a piece of fin was preserved in RNAlater.

The DNA extraction procedure follows protocols fully described elsewhere (Moore et al. 2017; Rougemont et al. 2019) and is described briefly below. DNA was extracted with a salt-based method, and an RNase A (Qiagen) treatment was applied following the manufacturer’s recommendation. DNA quality was assessed using gel electrophoresis. DNA was quantified using a NanoDrop spectrophotometer (Thermo Scientific) and then using the Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen). The concentration of DNA was normalized to 20 ng/μl. Libraries were constructed following a double-digest RAD (restriction site-associated DNA sequencing; Peterson et al. 2012) protocol modified from Mascher et al. (2013). Genomic DNA was digested with two restriction enzymes (PstI and MspI; Poland et al. 2013) by incubating at 37°C for two hours, followed by enzyme inactivation at 65°C for 20 min. Sequencing adaptors and a unique individual barcode were ligated to each sample using a ligation master mix including T4 ligase. The ligation reaction was completed at 22°C for 2 hours, followed by 65°C for 20 min to inactivate the enzymes. Samples were pooled in multiplexes of 48 individuals, ensuring that individuals from each sampling location were sequenced as part of at least six different multiplexes to avoid pool effects. Libraries were size-selected using a BluePippin prep (Sage Science), amplified by PCR and sequenced on the Ion Proton P1v2 chip (producing single-end reads with 80 million bp per chip). Eighty-two individuals were sequenced per chip.
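The pooling constraint described above (multiplexes of 48 individuals, with each sampling location spread over at least six multiplexes) can be illustrated with a simple round-robin assignment. This is only a sketch of one way to satisfy the constraint, not the procedure actually used in the study. The function names, individual IDs, and site labels are hypothetical.

```python
import math
from collections import defaultdict

def assign_multiplexes(site_of, pool_size=48):
    """Assign individuals to multiplexes in round-robin order.

    site_of maps an individual ID to its sampling site. Individuals are
    sorted by site, then dealt out one at a time across the pools, so
    neighbours from the same site land in different multiplexes.
    """
    ordered = sorted(site_of, key=lambda ind: (site_of[ind], ind))
    n_pools = math.ceil(len(ordered) / pool_size)
    return {ind: i % n_pools for i, ind in enumerate(ordered)}

def pools_per_site(site_of, pool_of):
    """Count how many distinct multiplexes contain each site."""
    seen = defaultdict(set)
    for ind, pool in pool_of.items():
        seen[site_of[ind]].add(pool)
    return {site: len(pools) for site, pools in seen.items()}

# Toy design (hypothetical): 10 sites with 30 individuals each.
toy_sites = {f"site{s:02d}_ind{k:02d}": f"site{s:02d}"
             for s in range(10) for k in range(30)}
toy_pools = assign_multiplexes(toy_sites)
print(pools_per_site(toy_sites, toy_pools))
```

Because the number of pools is the total number of individuals divided by the pool size, rounded up, no pool exceeds the pool size. Any site with at least six individuals is spread over at least six multiplexes whenever six or more pools exist.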