
Demographic history has shaped the strongly differentiated corkwing wrasse populations in Northern Europe

Cite this dataset

Mattingsdal, Morten et al. (2022). Demographic history has shaped the strongly differentiated corkwing wrasse populations in Northern Europe [Dataset]. Dryad.


Understanding the biological processes involved in genetic differentiation and divergence between populations within species is a pivotal aim in evolutionary biology. One particular phenomenon that requires clarification is the maintenance of genetic barriers despite the high potential for gene flow in the marine environment. Such patterns have been attributed to limited dispersal or local adaptation, and to a lesser extent to the demographic history of the species. The corkwing wrasse (Symphodus melops) is an example of a marine fish species in which regions of particularly strong divergence are observed. One such genetic break occurs at a surprisingly small spatial scale (FST ~0.1), over a short stretch of coastline (<60 km) in the North Sea-Skagerrak transition area in southwestern Norway. Here, we investigate the observed divergence and purported reproductive isolation using genome resequencing. Our results suggest that historical events along the post-glacial recolonization route can explain the present population structure of the corkwing wrasse in the northeast Atlantic. While the divergence across the break is strong, we detected ongoing gene flow between populations over the break, suggesting recent contact or negative selection against hybrids. Moreover, we found few outlier loci and no clear genomic regions potentially under selection. We conclude that neutral processes and random genetic drift, e.g., due to founder events during colonization, have shaped the population structure of this species in Northern Europe. Our findings underline the need to take demographic processes into account in studies of divergence.


Sixty-five corkwing wrasses were sampled from eight coastal locations in three regions: the British Isles, western Scandinavia, and southern Scandinavia (Table 1). Samples from southern Norway were collected by beach seine, while those from the west coast of Norway, Sweden and the British Isles were collected with fish pots, as described in Blanco Gonzalez et al. (2016). Muscle tissues were taken from fresh or frozen specimens and stored in 96% ethanol prior to DNA extraction. Total genomic DNA was extracted with the DNeasy kit (Qiagen) or the E.Z.N.A. Tissue DNA kit (Omega Bio-Tek) and resuspended in TE buffer. The extractions were analyzed with Qubit (Thermo Fisher Scientific) to assess DNA quality and concentration. After normalization to 1,200 ng with Qiagen EB buffer (10 mM Tris-Cl; pH = 8.0), the samples were fragmented to ~350 bp using a Covaris S220 (Life Technologies). Library construction was performed using the Illumina TruSeq DNA PCR Free protocol, and libraries were checked on a Bioanalyzer High Sensitivity chip and TapeStation (both Agilent), followed by quantification with the Kapa Biosystems qPCR assay for Illumina libraries.

Whole-genome resequencing was conducted on the Illumina HiSeq platform, generating 2 × 125 bp paired-end reads to an average depth of ~9.16× per sample (595× in total across the 65 sample libraries). The mean read insert size across samples was 347 bp (range: 246–404). Reads were mapped to the corkwing wrasse reference genome assembly (Mattingsdal et al., 2018) using bwa-mem (v0.7.5a; Li & Durbin, 2009), followed by duplicate removal with Picard. Single nucleotide polymorphisms (SNPs) were called across all samples with freebayes (v1.0.2-33; Garrison & Marth, 2012), using the following quality control criteria: (a) quality >40; (b) minimum and maximum read depth of 4× and 30×; (c) maximum 5% missing genotypes; (d) minimum minor allele count of 3 (MAF >2%). Two data sets were made: (a) all SNPs with ancestral states and (b) a thinned data set keeping random SNPs spaced at least 10,000 bp apart and excluding rare variants (MAF >2%; thinned with “--bp-space 10000”).
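The filtering criteria and the thinning step above can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, genotypes are assumed to be coded as alternate-allele counts (0, 1, 2, or None for missing), the depth criterion is assumed to apply to a per-site mean depth, and the thinning is a simple greedy pass along a single contig rather than the random selection described in the text.

```python
from typing import List, Optional, Sequence

N_SAMPLES = 65      # number of sequenced individuals (from the text)
MEAN_DEPTH = 9.16   # average depth per sample (from the text)


def total_depth(mean_depth: float = MEAN_DEPTH, n_samples: int = N_SAMPLES) -> float:
    """Combined depth across all sample libraries."""
    return mean_depth * n_samples


def minor_allele_count(genotypes: Sequence[Optional[int]]) -> int:
    """Count of the rarer allele among called diploid genotypes."""
    called = [g for g in genotypes if g is not None]
    alt = sum(called)
    return min(alt, 2 * len(called) - alt)


def passes_filters(qual: float, mean_depth: float,
                   genotypes: Sequence[Optional[int]]) -> bool:
    """Apply criteria (a)-(d); applying depth per site is an assumption."""
    missing = sum(g is None for g in genotypes)
    return (
        qual > 40                                  # (a) quality
        and 4 <= mean_depth <= 30                  # (b) read depth
        and missing / len(genotypes) <= 0.05       # (c) missing genotypes
        and minor_allele_count(genotypes) >= 3     # (d) minor allele count
    )


def thin_by_distance(positions: Sequence[int], min_gap: int = 10_000) -> List[int]:
    """Greedy thinning on one contig: keep SNPs at least min_gap bp apart."""
    kept: List[int] = []
    for pos in sorted(positions):
        if not kept or pos - kept[-1] >= min_gap:
            kept.append(pos)
    return kept
```

With 65 diploid individuals there are 130 allele copies per site, so a minor allele count of 3 corresponds to a frequency of 3/130, or about 2.3%, which is consistent with the stated MAF >2% threshold.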

The ancestral allele states were inferred using whole-contig alignments between the corkwing and ballan wrasse (Labrus bergylta) genome assemblies (Lie et al., 2018; Mattingsdal et al., 2018) constructed with LAST (v923; Frith, Hamada, & Horton, 2010); both species are members of the Labridae family. First, the genomes were indexed specifying the “YASS” and “R11” options, optimizing for long and weak similarities and masking low-complexity regions. Then, a pairwise genome-wide alignment between corkwing and ballan wrasse was made, setting the E-value threshold to 0.05 and maximum matches per query position = 100. The “last-split” function was run twice to ensure one-to-one alignments. The multiple alignments were converted to bam format, and SNP positions in the corkwing wrasse genome were used to extract “genotypes” from the corkwing and ballan wrasse alignment using samtools and bcftools (Li et al., 2009). The inferred ancestral states were manually checked, and plink v1.90b3.40 (Purcell et al., 2007) was used to annotate the ancestral state as the reference allele. Missing data were imputed and phased using beagle with default settings (Browning & Browning, 2013). To elucidate demographic relationships between the populations, we searched for identical-by-descent (IBD) haplotypes inferred by beagle (Browning & Browning, 2013), which accounts for haplotype phase uncertainty.
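Annotating the ancestral state as the reference allele amounts to swapping alleles, and recoding genotypes, wherever the outgroup allele matches the alternate allele. The following is a minimal sketch of that idea for a single biallelic SNP, not the plink procedure used by the authors; the function name `polarize` and the genotype coding (alternate-allele counts 0, 1, 2, or None for missing) are assumptions made for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def polarize(ref: str, alt: str, genotypes: Sequence[Optional[int]],
             ancestral: str) -> Optional[Tuple[str, str, List[Optional[int]]]]:
    """Return (ancestral, derived, genotypes) with genotypes recoded as
    counts of the derived allele, or None if the outgroup allele matches
    neither allele at the site."""
    if ancestral == ref:
        # Already polarized: the reference allele is the ancestral allele.
        return ref, alt, list(genotypes)
    if ancestral == alt:
        # Swap alleles and flip the diploid counts (0 <-> 2, 1 stays 1).
        flipped = [None if g is None else 2 - g for g in genotypes]
        return alt, ref, flipped
    # Ancestral state cannot be assigned for this site.
    return None
```

For example, `polarize("A", "G", [0, 1, 2, None], "G")` returns `("G", "A", [2, 1, 0, None])`, while a site whose outgroup allele is neither A nor G yields `None` and would be left without an ancestral state.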

Usage notes

freebayes - ALL.VAR - SNP dataset of the corkwing wrasse in Northern Europe





Southern population: EG*, AR*, TV*, GF*

Western population: SM*, NH*, ST*

British Isles population: ARD*
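The sample-name prefixes above can be used to assign individuals to populations. A minimal sketch follows, assuming sample IDs begin with one of the listed prefixes; the example IDs and the function name `population_of` are hypothetical. Because the southern prefix AR* is itself a prefix of the British Isles prefix ARD*, the sketch assumes the longer prefix takes precedence and checks it first.

```python
from typing import Optional

# Prefix-to-population mapping taken from the usage notes.
POPULATION_PREFIXES = {
    "EG": "Southern", "AR": "Southern", "TV": "Southern", "GF": "Southern",
    "SM": "Western", "NH": "Western", "ST": "Western",
    "ARD": "British Isles",
}


def population_of(sample_id: str) -> Optional[str]:
    """Return the population for a sample ID, or None if no prefix matches.

    Longer prefixes are tried first so that ARD* is not captured by AR*
    (an assumption about how the overlapping prefixes should resolve).
    """
    for prefix in sorted(POPULATION_PREFIXES, key=len, reverse=True):
        if sample_id.startswith(prefix):
            return POPULATION_PREFIXES[prefix]
    return None
```

Under these assumptions, a hypothetical ID such as "ARD01" resolves to the British Isles population and "AR12" to the southern population.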


Norges Forskningsråd, Award: 234328

Norges Forskningsråd, Award: 280453

Svenska Forskningsrådet Formas