Genomic survey of edible cockle (Cerastoderma edule) in the Northeast Atlantic: a baseline for sustainable management of its wild resources
Data files
Nov 30, 2021 version files 140.08 MB
- 213DivN.txt
- 263DivS.txt
- 554DivCE.txt
- 8021NeuCE.txt
- 8570NeuN.txt
- 8650NeuS.txt
- allN.txt
- allS.txt
- CeEdMac.txt
- EnvCE_Rep.txt
- EnvCE_Sum.txt
- EnvCE_Win.txt
- FreqCe554Mpob.txt
- FreqCe8021Mpob.txt
- FreqCeMpob.txt
- README_file_description_Dryad_Vera_et_al_21.txt
Abstract
Knowledge of how environmental factors shape the genome of marine species is crucial for sustainable management of fisheries and wild populations. The edible cockle (Cerastoderma edule) is a marine bivalve distributed along the Northeast Atlantic coast of Europe and is an important resource from both commercial and ecological perspectives. We carried out a population genomics screening of 536 specimens from 14 beds in the Northeast Atlantic Ocean, using 2b-RAD genotyping of 9,309 SNPs located in the cockle genome, to determine the genetic structure with regard to environmental variables. Larval dispersal modelling considering species behaviour and interannual / interseasonal variation in ocean conditions was carried out as essential background against which to compare the genetic information. Cockle populations in the Northeast Atlantic displayed low but significant geographical differentiation (FST = 0.0240; P < 0.001), but no significant differentiation across generations. We identified 742 and 36 outlier SNPs related to divergent and balancing selection, respectively, across all the geographical scenarios inspected, and sea temperature and salinity were the main environmental drivers suggested. Highly significant linkage disequilibrium was detected at specific genomic regions, in contrast to the very low values observed across the whole genome, which is suggestive of selective sweeps. Two main genetic groups were identified, northwards and southwards of French Brittany, in accordance with the larval dispersal modelling, which suggested a barrier to larval dispersal linked to the Ushant front. Further genetic subdivision was observed using outlier loci and considering larval behaviour. The northern group was divided into the Irish/Celtic Seas and the English Channel/North Sea, while the southern group was divided into three subgroups.
This information represents the baseline for managing cockles, designing conservation strategies, founding broodstock for depleted beds, and producing suitable seed for aquaculture production.
Methods
Single Nucleotide Polymorphism (SNP) genotyping
Total DNA was extracted from gill tissue samples using the E.Z.N.A. E-96 Mollusc DNA kit (Omega Bio-tek), following the manufacturer's recommendations. SNP identification and selection, as well as genotyping and validation protocols, followed those described by Maroso et al. (2019). Briefly, the type IIb restriction enzyme (RE) AlfI was used to construct the 2b-RAD libraries, which were evenly pooled for sequencing on an Illumina NextSeq, with 90 individuals per run. Reads from each individual were aligned to the recently assembled cockle genome (794 Mb; Bruzos et al., unpublished data) using Bowtie 1.1.2 (Langmead et al., 2009), allowing a maximum of three mismatches and a unique valid alignment (-v 3 -m 1). Individuals with < 250,000 reads were discarded. Stacks 2.0 (Catchen et al., 2013) was then used to call SNPs and genotype a common set of markers in the sample set, applying the marukilow model with default parameters in the gstacks module. This SNP panel was further filtered by applying the following criteria: i) genotyped in > 60% of individuals in the total sample; ii) minor allele count (MAC) ≥ 3 in the total sample; iii) conformance to Hardy-Weinberg equilibrium (HWE) across the whole collection, i.e. loci deviating significantly from HWE (P < 0.05) in more than 25% of samples were removed; and iv) selection of the most polymorphic SNP in each RAD-tag.
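Filtering criteria i) to iii) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the 0/1/2/None genotype coding and the chi-square approximation for the HWE test are assumptions made for illustration (the study may have used a different test), and MAC is interpreted here as the count of the rarer allele.

```python
import math


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Approximate HWE p-value (chi-square, 1 d.f.) for a biallelic SNP.

    Assumption: a chi-square goodness-of-fit test stands in for whatever
    HWE test was actually applied in the study.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0  # no genotypes called: nothing to test
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic in this sample: nothing to test
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_filters(genotypes_by_sample, min_call_rate=0.60, min_mac=3,
                   alpha=0.05, max_hwe_fail=0.25):
    """Apply criteria i)-iii) to one SNP.

    genotypes_by_sample maps a sample (bed) name to a list of genotypes,
    coded as 0, 1 or 2 copies of one allele, or None when missing
    (hypothetical coding chosen for this sketch).
    """
    all_calls = [g for gs in genotypes_by_sample.values() for g in gs]
    called = [g for g in all_calls if g is not None]

    # i) genotyped in more than 60% of individuals in the total sample
    if len(called) <= min_call_rate * len(all_calls):
        return False

    # ii) count of the rarer allele >= 3 in the total sample
    copies = sum(called)
    mac = min(copies, 2 * len(called) - copies)
    if mac < min_mac:
        return False

    # iii) remove the locus if it deviates from HWE in > 25% of samples
    n_fail = 0
    for gs in genotypes_by_sample.values():
        counts = [sum(1 for g in gs if g == k) for k in (0, 1, 2)]
        if hwe_chi2_pvalue(*counts) < alpha:
            n_fail += 1
    return n_fail <= max_hwe_fail * len(genotypes_by_sample)
```

Criterion iv) (keeping the most polymorphic SNP per RAD-tag) would be applied afterwards, across the loci that pass this per-SNP check.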
Usage notes
Vera et al. Genomic survey of edible cockle (Cerastoderma edule) in the Northeast Atlantic: a baseline for sustainable management of its wild resources
Description of files deposited in Dryad:
- 9 Genepop input files:
CeEdMac.txt: Genepop input file including all the beds and markers used in the study (9,309 markers). Bed codes are shown in Table 1 of the manuscript.
8021NeuCE.txt: Genepop input file including all the beds with the 8,021 neutral markers identified. Bed codes are shown in Table 1 of the manuscript.
554DivCE.txt: Genepop input file including all the beds with the 554 divergent outliers detected. Bed codes are shown in Table 1 of the manuscript.
allN.txt: Genepop input file including the beds from North group and all markers used in the study (9,309 markers). Bed codes are shown in Table 1 of the manuscript.
8570NeuN.txt: Genepop input file including the beds from North group with the 8,570 neutral markers identified. Bed codes are shown in Table 1 of the manuscript.
213DivN.txt: Genepop input file including the beds from North group with the 213 divergent outliers detected. Bed codes are shown in Table 1 of the manuscript.
allS.txt: Genepop input file including the beds from South group and all markers used in the study (9,309 markers). Bed codes are shown in Table 1 of the manuscript.
8650NeuS.txt: Genepop input file including the beds from South group with the 8,650 neutral markers identified. Bed codes are shown in Table 1 of the manuscript.
263DivS.txt: Genepop input file including the beds from South group with the 263 divergent outliers detected. Bed codes are shown in Table 1 of the manuscript.
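For readers who want to load the Genepop files outside Genepop itself, a minimal Python reader might look like the sketch below. It assumes the deposited files follow the standard Genepop layout (a title line, locus names one per line or comma-separated, "Pop" separators, then one line per individual with an identifier, a comma and 2- or 3-digit allele codes, with zeros for missing data); this has not been checked against the files, and the function name parse_genepop is hypothetical.

```python
def parse_genepop(lines):
    """Parse Genepop-format text into (loci, pops).

    pops is a list of populations (beds); each population is a list of
    (individual_id, genotypes) tuples, and genotypes is a list of
    (allele1, allele2) string pairs, with None for missing alleles.
    """
    rows = [ln.strip() for ln in lines if ln.strip()]
    body = rows[1:]  # rows[0] is the free-text title line
    loci, pops = [], []
    for row in body:
        if row.lower() == "pop":
            pops.append([])  # start a new population block
        elif not pops:
            # Still in the header: locus names, possibly comma-separated.
            loci.extend(name.strip() for name in row.split(",") if name.strip())
        else:
            ident, _, geno_text = row.partition(",")
            genotypes = []
            for code in geno_text.split():
                half = len(code) // 2
                a1, a2 = code[:half], code[half:]
                # An all-zero allele code means missing data.
                genotypes.append((None if int(a1) == 0 else a1,
                                  None if int(a2) == 0 else a2))
            pops[-1].append((ident.strip(), genotypes))
    return loci, pops
```

A call such as `loci, pops = parse_genepop(open("CeEdMac.txt"))` would then return one population block per bed, provided the layout assumption holds.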
- 6 Vegan input files (spatial/environmental information is also found in Supplementary Table 1):
EnvCE_Rep.txt: Vegan input file including spatial and environmental information for all the studied beds, with environmental values for the reproductive period (from April to August). Bed codes are shown in Table 1 of the manuscript.
EnvCE_Win.txt: Vegan input file including spatial and environmental information for all the studied beds, with environmental values for the winter season (from January to March). Bed codes are shown in Table 1 of the manuscript.
EnvCE_Sum.txt: Vegan input file including spatial and environmental information for all the studied beds, with environmental values for the summer season (from July to September). Bed codes are shown in Table 1 of the manuscript.
FreqCeMpob.txt: Vegan input file including genetic information (i.e. minor allele frequency within each bed) for all the beds and markers in the study (9,309 markers). Bed codes are shown in Table 1 of the manuscript.
FreqCe8021Mpob.txt: Vegan input file including genetic information (i.e. minor allele frequency within each bed) for all the beds and neutral markers in the study (8,021 markers). Bed codes are shown in Table 1 of the manuscript.
FreqCe554Mpob.txt: Vegan input file including genetic information (i.e. minor allele frequency within each bed) for all the beds and divergent outliers in the study (554 markers). Bed codes are shown in Table 1 of the manuscript.
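The kind of per-bed allele frequency stored in the FreqCe files can be illustrated for a single locus with the Python sketch below. It is an assumption, not a description of how the deposited files were built: here the rarer allele is defined over all beds pooled, ties are broken by allele label, and the function name and input structure are hypothetical.

```python
from collections import Counter


def allele_freq_by_bed(genotypes_by_bed):
    """Frequency, within each bed, of the allele that is rarer overall.

    genotypes_by_bed maps a bed code to a list of (allele1, allele2)
    pairs for one locus, with None for missing alleles. Beds with no
    called alleles get None.
    """
    per_bed = {}
    total = Counter()
    for bed, genotypes in genotypes_by_bed.items():
        counts = Counter(a for pair in genotypes for a in pair if a is not None)
        per_bed[bed] = counts
        total.update(counts)

    # Assumption: the rarer allele is chosen on the pooled sample;
    # ties are broken by allele label so the result is deterministic.
    minor = min(total, key=lambda a: (total[a], a)) if len(total) > 1 else None

    freqs = {}
    for bed, counts in per_bed.items():
        n = sum(counts.values())
        if n == 0:
            freqs[bed] = None  # no data for this bed
        elif minor is None:
            freqs[bed] = 0.0  # locus is monomorphic overall
        else:
            freqs[bed] = counts[minor] / n
    return freqs
```

Repeating this for every locus gives a beds-by-markers table of frequencies, which is the general shape of response matrix that ordination functions in vegan expect.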