Data from: Extensive admixture among karst-obligate salamanders reveals evidence of recent divergence and gene exchange through aquifers
Data files
May 23, 2024 version files, 163.27 MB
- gprobmean.csv (159.80 MB)
- IBD_BLMM.RData (3.36 MB)
- locality_dat.csv (1.68 KB)
- predictors_dbmem.csv (101.94 KB)
- README.md (1.72 KB)
Nov 28, 2024 version files, 1 GB
- gprobmean.csv (159.80 MB)
- IBD_BLMM.RData (843.89 MB)
- locality_dat.csv (1.68 KB)
- predictors_dbmem.csv (101.94 KB)
- README.md (1.94 KB)
Dec 18, 2024 version files, 1 GB
- gprobmean.csv (159.80 MB)
- IBD_BLMM.RData (843.89 MB)
- locality_dat.csv (1.68 KB)
- predictors_dbmem.csv (101.94 KB)
- README.md (2.10 KB)
Abstract
Karst ecosystems often contain extraordinary biodiversity, but the complex underground aquifers of karst regions present challenges for assessing and conserving stygobiont diversity and investigating their evolutionary history. We examined the karst-obligate salamanders of the Eurycea neotenes species complex in the Edwards Plateau region of central Texas using population genomics data to address questions about population connectivity and the potential for gene exchange within the underlying aquifer system. The Eurycea neotenes species complex has historically been divided into three nominal species, but their status, and spatial extent of species ranges, have remained uncertain. We discovered evidence of extensive admixture within the species complex and with adjacent lineages. We observed relatively low levels of differentiation among all sampling localities which supports the hypothesis of recent divergence. Nominal taxonomy, aquifer region and geography accounted for a modest amount of the overall population genomic variation, but these predictors were largely confounded and difficult to disentangle. Importantly, the taxonomy of the three nominal species does not reflect the admixture apparent in clustering analyses. Inference of migration events revealed a complex pattern of gene exchange, suggesting that Eurycea salamanders have a dynamic history of dispersal through the aquifer system. These results highlight the need for greater understanding of how stygobiont populations are connected via dispersal and gene exchange through karst aquifers. These results also highlight the applicability of population genomics data as a powerful lever for investigating connectivity among populations in systems where direct detection of dispersal paths is difficult, as in underground, aquatic systems.
https://doi.org/10.5061/dryad.3r2280gpx
Files included here contain scripts and data for partitioning population genomic variation using RDA and a Bayesian Linear Mixed Model.
Description of the data and file structure
gprobmean.csv is the matrix of posterior genotype estimates for 607 salamanders at 16,094 SNP loci
locality_dat.csv contains information about sampling localities
predictors_dbmem.csv contains categorical predictors for nominal taxon, major aquifer, and eight Moran’s Eigenvector maps for all individuals
RDAscript.R is a script to perform Redundancy Analyses using the gprobmean.csv and predictors_dbmem.csv files described above
IBDanalyses_BLMM.R is the script to perform Bayesian Linear Mixed Model analyses using JAGS and includes code for a function for running these analyses. It is to be used with the RData object IBD_BLMM.RData
IBD_BLMM.RData includes the matrix of pairwise, linearized Fst values, the matrix of pairwise geographic distances between localities, and indexing matrices for the categorical predictors (nominal taxonomy and major aquifer). This object also includes the results of all analyses, though these can be re-run with the above script. It is intended to be loaded by IBDanalyses_BLMM.R
interactivePCA_1v2.html is an interactive version of the Principal Components Analysis that can be viewed in a web browser.
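Linearized Fst, as stored in IBD_BLMM.RData, is conventionally computed as Fst / (1 - Fst). The sketch below illustrates that transform on a small pairwise matrix. It is written in Python for illustration only and is not taken from IBDanalyses_BLMM.R (an R script); it assumes the standard definition applies here, and the function name `linearize_fst` and the example values are hypothetical.

```python
def linearize_fst(fst_matrix):
    """Apply Fst / (1 - Fst) to each off-diagonal entry of a square matrix.

    Assumes the conventional definition of linearized Fst; the diagonal
    (a locality compared with itself) is left at 0.
    """
    n = len(fst_matrix)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            fst = fst_matrix[i][j]
            if fst >= 1.0:
                # The transform is undefined (division by zero) at Fst = 1.
                raise ValueError(f"Fst must be below 1, got {fst}")
            out[i][j] = fst / (1.0 - fst)
    return out


# Hypothetical pairwise Fst values for three localities (not from the dataset).
example = [
    [0.0, 0.2, 0.5],
    [0.2, 0.0, 0.1],
    [0.5, 0.1, 0.0],
]
linearized = linearize_fst(example)
```

The transform grows faster than Fst itself as Fst approaches 1, which is why it is commonly paired with geographic distance in isolation-by-distance analyses.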
Sharing/Access information
The DNA sequence data used to generate the posterior genotype estimates have been archived on NCBI’s SRA (PRJNA1057889).
Change log:
Compared to the previous version, no data have changed.
IBDanalyses_BLMM.R has been updated to include calculation of wAIC and LOOIC (leave-one-out information criterion) for all of the Bayesian linear mixed models.
File added: interactivePCA_1v2.html
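As a rough illustration of what a wAIC calculation involves, the sketch below computes WAIC from a matrix of pointwise log-likelihoods (rows are posterior draws, columns are observations) using the common definition WAIC = -2 (lppd - p_WAIC), with p_WAIC taken as the summed posterior variance of the log-likelihood. This is a Python sketch under those assumptions, not the code in IBDanalyses_BLMM.R, which may use a different variant or an R package; the function name `waic` and the toy input are hypothetical.

```python
import math


def waic(log_lik):
    """Compute WAIC from pointwise log-likelihoods.

    log_lik[s][i] is the log-likelihood of observation i under posterior
    draw s. At least two draws are required for the variance term.
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    if n_draws < 2:
        raise ValueError("need at least two posterior draws")

    lppd = 0.0    # log pointwise predictive density
    p_waic = 0.0  # effective number of parameters (variance form)
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        # Numerically stable log of the mean likelihood across draws.
        peak = max(column)
        lppd += peak + math.log(sum(math.exp(v - peak) for v in column) / n_draws)
        # Sample variance of the log-likelihood across draws.
        mean = sum(column) / n_draws
        p_waic += sum((v - mean) ** 2 for v in column) / (n_draws - 1)

    return -2.0 * (lppd - p_waic)


# Toy input (not from the dataset): two draws, two observations,
# with no variation between draws so the penalty term is zero.
toy_log_lik = [[-1.0, -2.0], [-1.0, -2.0]]
toy_waic = waic(toy_log_lik)
```

Lower values indicate better expected out-of-sample fit, so models are compared by their differences in WAIC rather than by the absolute values.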
Matrix of posterior genotype probabilities for 607 salamanders at 16,094 SNP loci, generated by genotyping-by-sequencing methods (similar to ddRAD-seq methods). The hierarchical Bayesian clustering algorithm ENTROPY was used to generate these genotype probabilities. Also included are a locality file, a script and matrix of predictors for Redundancy Analysis, and a script and data object for the Bayesian Linear Mixed Model.
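Before running the analyses, it can be worth confirming that the downloaded genotype matrix has the stated dimensions. The following is a minimal Python sketch, assuming a comma-delimited file with one header row. Whether individuals are stored as rows or columns is not stated here, so the check accepts either orientation; the helper names `csv_shape` and `matches_expected` are hypothetical.

```python
import csv
import io


def csv_shape(handle, has_header=True):
    """Return (n_rows, n_cols) of a CSV, optionally skipping one header row."""
    reader = csv.reader(handle)
    if has_header:
        next(reader, None)
    n_rows = 0
    n_cols = 0
    for row in reader:
        if not row:
            continue  # ignore blank lines
        n_rows += 1
        n_cols = max(n_cols, len(row))
    return n_rows, n_cols


def matches_expected(shape, n_individuals=607, n_loci=16094):
    """True if the shape equals individuals x loci in either orientation."""
    return sorted(shape) == sorted((n_individuals, n_loci))


# Tiny in-memory stand-in for the real file (values are made up).
demo = io.StringIO("loc1,loc2,loc3\n0.1,1.9,1.0\n2.0,0.0,1.1\n")
demo_shape = csv_shape(demo)
```

For the real file, the same helper could be called as `csv_shape(open("gprobmean.csv", newline=""))`. If the file carries a leading identifier column, the column count will be one larger than the number of loci or individuals, and the comparison would need adjusting.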