Widespread admixture blurs population structure and confounds Lake Trout (Salvelinus namaycush) conservation even in the genomic era
Data files
Dec 05, 2024 version files 156.69 MB
Abstract
Intraspecific variation is important for species’ long-term persistence, and is a main conservation target. Units below the species level are often identified based on evidence for adaptive divergence and reproductive isolation. For complex taxa, population genomics has the potential to improve management strategies by facilitating the identification of genetic boundaries and adaptive variation between discrete units. This paper examines intraspecific divergence of Lake Trout (Salvelinus namaycush) in Great Slave Lake (GSL), Canada, using low-coverage whole-genome sequencing data. Specifically, we evaluate genetic differentiation and assess the relationship with morphological, mitochondrial, and putatively adaptive divergence. We show that at least three genetically distinct Lake Trout populations co-occur in GSL and exhibit differences in spatial distribution and body size, with signatures of selection. However, admixture was widespread (60% of the fish). These findings highlight that, even in the era of whole-genome sequencing, identifying discrete population units can remain challenging in systems where gene flow among genetically distinct populations is ubiquitous. To give more recognition to this complexity, shifting the focus of management efforts from discrete populations (i.e., intraspecific units) to the area where evolutionary processes are at play could be beneficial to protect species’ resilience and adaptive potential in some natural systems.
README: Widespread admixture blurs population structure and confounds Lake Trout (Salvelinus namaycush) conservation even in the genomic era
https://doi.org/10.5061/dryad.95x69p8tt
Description of the data and file structure
README for "Widespread admixture blurs population structure and confounds Lake Trout (Salvelinus namaycush) conservation even in the genomic era". The following files are included:
- Sample metadata (Individuallevel_metadata__GSL_LakeTrout_V2.csv)
- NGSadmix Q values (Individuallevel_NGSadmix__GSL_LakeTrout_V2.csv)
- Mitochondrial DNA sequences (mitochondrialDNA_sequences_GSL_LakeTrout.txt)
- Nucleotide diversity and Tajima’s D (Populationlevel_Diversity_Estimates_GSL_LakeTrout_V2.csv)
Files and variables
File: Individuallevel_metadata__GSL_LakeTrout_V2.csv
Description: Metadata
Variables
- sample: Sample ID
- area: Management area (IW, IE, II, III, IV, V, VI)
- sampled_day: Sampling date: Day
- sampled_month: Sampling date: Month
- sampled_year: Sampling date: Year
- sample_origin: Sample source (who collected the sample)
- water_depth: Depth of capture in meters (m)
- length: Fork length in millimetres (mm)
- weight: Weight in grams (g)
- sex: Sex (Male or Female, assessed in the field)
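Because the sampling date is stored in three separate columns, it may be convenient to combine them into a single date when working with the metadata. The sketch below is illustrative only: it assumes that sampled_day, sampled_month, and sampled_year hold numeric values, which should be checked against the file, and the helper name sampling_date and the example row are hypothetical.

```python
from datetime import date


def sampling_date(row):
    """Combine the three date columns of one metadata row into a date.

    Assumes all three columns contain numbers (as text or int).
    """
    return date(
        int(row["sampled_year"]),
        int(row["sampled_month"]),
        int(row["sampled_day"]),
    )


# Made-up row for illustration; not taken from the dataset.
example_row = {"sampled_day": "14", "sampled_month": "8", "sampled_year": "2019"}
print(sampling_date(example_row).isoformat())  # 2019-08-14
```

Rows read with csv.DictReader from the metadata file can be passed to the helper directly, provided the numeric assumption holds.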
File: Individuallevel_NGSadmix__GSL_LakeTrout_V2.csv
Description: NGSadmix output
Variables
- sample: Sample ID
- K2_Q90: Genetic cluster with 2 populations assumed (K2) - "GCx" for fish with Q>=0.90, where x is the cluster number, and "admix" for fish with Q<0.90
- K3_Q90: Genetic cluster with 3 populations assumed (K3) - "GCx" for fish with Q>=0.90, where x is the cluster number, and "admix" for fish with Q<0.90
- K4_Q90: Genetic cluster with 4 populations assumed (K4) - "GCx" for fish with Q>=0.90, where x is the cluster number, and "admix" for fish with Q<0.90
- K5_Q90: Genetic cluster with 5 populations assumed (K5) - "GCx" for fish with Q>=0.90, where x is the cluster number, and "admix" for fish with Q<0.90
- K6_Q90: Genetic cluster with 6 populations assumed (K6) - "GCx" for fish with Q>=0.90, where x is the cluster number, and "admix" for fish with Q<0.90
- K7_Q90: Genetic cluster with 7 populations assumed (K7) - "GCx" for fish with Q>=0.90, where x is the cluster number, and "admix" for fish with Q<0.90
- K8_Q90: Genetic cluster with 8 populations assumed (K8) - "GCx" for fish with Q>=0.90, where x is the cluster number, and "admix" for fish with Q<0.90
- K9_Q90: Genetic cluster with 9 populations assumed (K9) - "GCx" for fish with Q>=0.90, where x is the cluster number, and "admix" for fish with Q<0.90
- K2_cluster: Genetic cluster with 2 populations assumed (K2) - The maximum Q value determines cluster assignment
- K3_cluster: Genetic cluster with 3 populations assumed (K3) - The maximum Q value determines cluster assignment
- K4_cluster: Genetic cluster with 4 populations assumed (K4) - The maximum Q value determines cluster assignment
- K5_cluster: Genetic cluster with 5 populations assumed (K5) - The maximum Q value determines cluster assignment
- K6_cluster: Genetic cluster with 6 populations assumed (K6) - The maximum Q value determines cluster assignment
- K7_cluster: Genetic cluster with 7 populations assumed (K7) - The maximum Q value determines cluster assignment
- K8_cluster: Genetic cluster with 8 populations assumed (K8) - The maximum Q value determines cluster assignment
- K9_cluster: Genetic cluster with 9 populations assumed (K9) - The maximum Q value determines cluster assignment
- K2_1: Q values at K2 (cluster #1)
- K2_2: Q values at K2 (cluster #2)
- K3_2: Q values at K3 (cluster #2)
- K3_1: Q values at K3 (cluster #1)
- K3_3: Q values at K3 (cluster #3)
- K4_3: Q values at K4 (cluster #3)
- K4_1: Q values at K4 (cluster #1)
- K4_4: Q values at K4 (cluster #4)
- K4_2: Q values at K4 (cluster #2)
- K5_2: Q values at K5 (cluster #2)
- K5_1: Q values at K5 (cluster #1)
- K5_4: Q values at K5 (cluster #4)
- K5_5: Q values at K5 (cluster #5)
- K5_3: Q values at K5 (cluster #3)
- K6_3: Q values at K6 (cluster #3)
- K6_4: Q values at K6 (cluster #4)
- K6_5: Q values at K6 (cluster #5)
- K6_2: Q values at K6 (cluster #2)
- K6_1: Q values at K6 (cluster #1)
- K6_6: Q values at K6 (cluster #6)
- K7_5: Q values at K7 (cluster #5)
- K7_1: Q values at K7 (cluster #1)
- K7_2: Q values at K7 (cluster #2)
- K7_6: Q values at K7 (cluster #6)
- K7_4: Q values at K7 (cluster #4)
- K7_3: Q values at K7 (cluster #3)
- K7_7: Q values at K7 (cluster #7)
- K8_2: Q values at K8 (cluster #2)
- K8_1: Q values at K8 (cluster #1)
- K8_6: Q values at K8 (cluster #6)
- K8_8: Q values at K8 (cluster #8)
- K8_5: Q values at K8 (cluster #5)
- K8_7: Q values at K8 (cluster #7)
- K8_3: Q values at K8 (cluster #3)
- K8_4: Q values at K8 (cluster #4)
- K9_1: Q values at K9 (cluster #1)
- K9_9: Q values at K9 (cluster #9)
- K9_4: Q values at K9 (cluster #4)
- K9_5: Q values at K9 (cluster #5)
- K9_8: Q values at K9 (cluster #8)
- K9_2: Q values at K9 (cluster #2)
- K9_3: Q values at K9 (cluster #3)
- K9_6: Q values at K9 (cluster #6)
- K9_7: Q values at K9 (cluster #7)
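The two assignment rules described for the Kn_Q90 and Kn_cluster columns can be sketched from the per-cluster Q columns as follows. This is a minimal illustration, not the authors' script: the function names are hypothetical, the example Q values are made up, and the exact label format stored in the Kn_cluster columns is an assumption (a bare cluster number is returned here). Q columns are looked up by name, so their order in the file does not matter.

```python
def q_values_for_k(row, k):
    """Collect the Q columns for a given K from one row (column name -> value).

    Keeps only columns named K{k}_{cluster number}, e.g. K3_1, K3_2, K3_3,
    and skips K{k}_Q90 and K{k}_cluster.
    """
    prefix = f"K{k}_"
    return {
        int(name[len(prefix):]): float(value)
        for name, value in row.items()
        if name.startswith(prefix) and name[len(prefix):].isdigit()
    }


def assign_cluster(q_values, threshold=0.90):
    """Return (Q90 label, maximum-Q cluster number) for one fish."""
    # Cluster with the largest Q value (maximum-Q rule).
    best = max(q_values, key=q_values.get)
    # "GCx" only if that Q value reaches the threshold, otherwise "admix".
    q90_label = f"GC{best}" if q_values[best] >= threshold else "admix"
    return q90_label, best


# Made-up rows for illustration; not taken from the dataset.
pure_fish = {"sample": "fish_A", "K3_2": "0.03", "K3_1": "0.95", "K3_3": "0.02"}
mixed_fish = {"sample": "fish_B", "K3_2": "0.30", "K3_1": "0.50", "K3_3": "0.20"}

print(assign_cluster(q_values_for_k(pure_fish, 3)))   # ('GC1', 1)
print(assign_cluster(q_values_for_k(mixed_fish, 3)))  # ('admix', 1)
```

In the second example the fish is labelled "admix" under the Q>=0.90 rule but is still assigned to cluster 1 under the maximum-Q rule, which is why the two sets of columns can disagree for the same individual.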
File: Populationlevel_Diversity_Estimates_GSL_LakeTrout_V2.csv
Description: ANGSD output
Variables
- pos: (indexStart,indexStop)(firstPos_withData,lastPos_withData)(WinStart,WinStop)
- Chr: Chromosome number
- WinCenter: Central position of the window-based analysis
- tW: Theta estimate (Watterson's estimator)
- tP: Theta estimate (pairwise estimator)
- tF: Theta estimate (Fu and Li's estimator)
- Tajima: Tajima's D
- nSites: Number of sites
- GC: Genetic group at K3 (individuals with Q values>=0.90)
- tW_nsites: Watterson's theta corrected for the number of sites (tW/nSites)
- tP_nsites: Pairwise theta corrected for the number of sites (tP/nSites)
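The per-site columns are simple ratios of the window-level estimates to the number of sites in the window. A minimal sketch of that calculation is given below; the function name per_site is hypothetical, the example numbers are made up, and returning None for windows with no sites is an assumption about how such windows might be handled, not something stated for this dataset.

```python
def per_site(theta, n_sites):
    """Divide a window-level theta estimate by the number of sites in the window.

    Returns None when the window contains no sites, to avoid dividing by zero.
    """
    if n_sites == 0:
        return None
    return theta / n_sites


# Made-up window values for illustration; not taken from the dataset.
window = {"tW": 12.0, "tP": 9.0, "nSites": 10000}

tW_nsites = per_site(window["tW"], window["nSites"])  # tW/nSites
tP_nsites = per_site(window["tP"], window["nSites"])  # tP/nSites
print(tW_nsites, tP_nsites)
```

Dividing by nSites makes windows with different amounts of usable data comparable with one another.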
File: mitochondrialDNA_sequences_GSL_LakeTrout.txt
Description: Mitochondrial DNA sequences
Variables
- Sequences for each individual (n=190)
Access information
Other publicly accessible locations of the data:
- Raw genomic data available on NCBI (PRJNA1188166)