A decision-making framework to maximize the evolutionary potential of populations: Genetic and genomic insights from the common midwife toad (Alytes obstetricans) at its range limits
Data files
Jul 27, 2024 version files (766.20 MB)
- Barratt_et_al_2024_Heredity_data.zip (766.19 MB)
- README.md (5.66 KB)
Abstract
Anthropogenic habitat modification and climate change are fundamental drivers of biodiversity declines, reducing the evolutionary potential of species, particularly at their distributional limits. Supportive breeding or reintroductions of individuals are often made to replenish declining populations, sometimes informed by genetic analysis. However, most approaches utilised (i.e. single locus markers) do not have the resolution to account for local adaptation to environmental conditions, a crucial aspect to consider when selecting donor and recipient populations. Here, we incorporate genetic (microsatellite) and genome-wide SNP (ddRAD-seq) markers, accounting for both neutral and putative adaptive genetic diversity, to inform the conservation management of the threatened common midwife toad, Alytes obstetricans at the northern and eastern edges of its range in Europe. We find geographically structured populations (n=4), weak genetic differentiation and fairly consistent levels of genetic diversity across localities (observed heterozygosity and allelic richness). Categorising individuals based on putatively adaptive regions of the genome showed that the majority of localities are not strongly locally adapted. However, several localities present high numbers of private alleles in tandem with local adaptation to warmer conditions and rough topography. Combining genetic diversity and local adaptations with estimates of migration rates, we develop a decision-making framework for selecting donor and recipient populations which maximises the geographic dispersal of neutral and putatively adaptive genetic diversity. Our framework is generally applicable to any species, but especially to amphibians, so armed with this information, conservationists may avoid the reintroduction of unsuitable/maladapted individuals to new sites and increase the evolutionary potential of populations within species.
https://doi.org/10.5061/dryad.x69p8czt6
This DRYAD data repository contains scripts and data to repeat the analyses in Barratt et al. (2024). We provide the scripts and raw genotypes to perform analyses for our Alytes obstetricans dataset.
Description of the data and file structure
DATA: Barratt_et_al_2024_Heredity_data.zip contains a Readme.txt file along with the following folders:
genotypes_SNP_microsatellites
Contains microsatellite and SNP genotypes in a variety of formats for downstream analysis (specified in each script in the accompanying Zenodo package). Files included in this directory are:
- Alytes_msats.gen, Alytes_SNPs.gen - Genepop format
- Alytes_SNPs.lfmm - LFMM object format
- Alytes_SNPs.ped, Alytes_SNPs.map, Alytes_SNPs.raw - Plink format (NAs in Alytes_SNPs.raw represent missing data)
- Alytes_SNPs.bed, Alytes_SNPs.bim, Alytes_SNPs.fam, Alytes_SNPs.gen - binary Plink format
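As an illustration of the missing-data coding noted above, the following minimal sketch counts NA genotypes per individual in PLINK .raw text. It assumes the standard additive .raw layout (six leading columns FID, IID, PAT, MAT, SEX, PHENOTYPE, followed by one column per SNP); the function name and the inline example data are hypothetical and are not taken from the dataset.

```python
def missing_rate_per_individual(raw_text):
    """Return {IID: fraction of SNP genotypes coded NA} for PLINK .raw text."""
    lines = [ln.split() for ln in raw_text.strip().splitlines()]
    rows = lines[1:]          # lines[0] is the header row
    n_meta = 6                # FID IID PAT MAT SEX PHENOTYPE (assumed layout)
    rates = {}
    for fields in rows:
        genotypes = fields[n_meta:]
        n_missing = sum(1 for g in genotypes if g == "NA")
        rates[fields[1]] = n_missing / len(genotypes)
    return rates


# Hypothetical two-individual example, not real data from the repository
example = """FID IID PAT MAT SEX PHENOTYPE snp1_A snp2_G snp3_T snp4_C
pop1 ind1 0 0 0 -9 0 1 NA 2
pop1 ind2 0 0 0 -9 NA NA 1 0
"""
rates = missing_rate_per_individual(example)
print(rates)
```

To apply the same function to the deposited file, the text could be read with `open("Alytes_SNPs.raw").read()`, assuming the file follows the layout described above.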
Admixture
Contains ADMIXTURE outputs for k=2-10 as well as all files for plotting. Files included in this directory are:
- Alytes.X.P, Alytes.X.Q - ADMIXTURE output P and Q matrices for each run of k between 2 and 10 (X in the file name is the value of k)
- pop_labels.csv - labels of all individual samples for plotting
- RESULTS_Alytes_admixture_cleaned.txt - a summary of all cross-validation (CV) error values from ADMIXTURE for k=2-10
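A Q matrix holds one row per individual and one column per cluster, with each row giving ancestry proportions that sum to 1. The sketch below shows one way such a file might be summarised, by assigning each individual to its majority cluster. The function name, the 0.5 threshold and the example values are assumptions made for illustration; this is not the plotting procedure used in plot_ADMIXTURE.R.

```python
def assign_clusters(q_text, threshold=0.5):
    """Assign each individual to its highest-ancestry cluster (1-based).

    Individuals whose largest ancestry proportion is below `threshold`
    (a hypothetical cut-off) are returned as None, i.e. admixed.
    """
    assignments = []
    for line in q_text.strip().splitlines():
        props = [float(x) for x in line.split()]
        best = max(range(len(props)), key=props.__getitem__)
        assignments.append(best + 1 if props[best] >= threshold else None)
    return assignments


# Hypothetical Q matrix for three individuals at k=3
example_q = """0.900000 0.050000 0.050000
0.100000 0.800000 0.100000
0.400000 0.350000 0.250000
"""
clusters = assign_clusters(example_q)
print(clusters)
```

Row order in a Q file follows the order of individuals in the input genotype file, so labels such as those in pop_labels.csv would need to be in the same order to be matched to these assignments.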
ADZE
Contains Alytes_paramfile.txt for running ADZE (to rarefy private allele calculations from SNP data). Files included in this directory are:
- Alytes_paramfile.txt - ADZE input parameter file
EEMS
Contains input files for running EEMS. Also included are the river shapefiles and German administrative boundaries shapefiles. Files included in this directory are:
- 500_Alytes_chain1r.ini, 500_Alytes_chain2r.ini - Initialisation files for EEMS (2 MCMC chains)
- Alytes.coord, Alytes.count, Alytes.diffs, Alytes.outer - EEMS files for running analyses (coordinates, SNP counts, average dissimilarity matrix and coordinates of study area respectively)
- DEU_adm0.shp, DEU_adm0.shx, DEU_adm0.prj, DEU_adm0.dbf, DEU_adm0.csv, cpg - Administrative boundary GIS shape files for Germany
- EU_Copernicus_rivers.shp, EU_Copernicus_rivers.shx, EU_Copernicus_rivers.prj, EU_Copernicus_rivers.dbf, EU_Copernicus_rivers.cpg, EU_Copernicus_rivers.qmd - GIS shape files for rivers in the study area
env_data
Environmental predictor data extracted for each sample, for use with GEA analyses, as well as rasters of the four environmental predictors used in the study. Files included in this directory are:
- cti.asc, tri.asc, prec_warmest_quarter.asc, temp_warmest_month.asc - raster files of four predictors used in GEA analyses
- predictor_sample_information_GEA.csv - extracted values from the four rasters for each sample stored in .csv format
- Alytes_samples.csv - geographic coordinates of all samples used for extraction of raster values
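Extracting a predictor value for a sample amounts to locating the raster cell that contains the sample's coordinates. The sketch below illustrates this for the ESRI ASCII grid (.asc) format using only the standard library. It assumes the header uses `xllcorner`/`yllcorner` (some files use `xllcenter`/`yllcenter` instead) and that data rows run from north to south; the function names and the tiny example grid are hypothetical. The study's own extraction is done in extract_env_predictor_data.R.

```python
import math


def read_asc(text):
    """Parse ESRI ASCII grid text into (header dict, list of rows)."""
    tokens = text.split()
    header = {}
    i = 0
    # Header entries are "key value" pairs; data tokens start with a digit or sign
    while tokens[i][0].isalpha():
        header[tokens[i].lower()] = float(tokens[i + 1])
        i += 2
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    values = [float(t) for t in tokens[i:]]
    grid = [values[r * ncols:(r + 1) * ncols] for r in range(nrows)]
    return header, grid


def value_at(header, grid, x, y):
    """Return the cell value at coordinate (x, y), or None if outside or NODATA."""
    ncols, nrows = int(header["ncols"]), int(header["nrows"])
    col = math.floor((x - header["xllcorner"]) / header["cellsize"])
    row_from_bottom = math.floor((y - header["yllcorner"]) / header["cellsize"])
    row = nrows - 1 - row_from_bottom  # first data row is the northernmost
    if not (0 <= row < nrows and 0 <= col < ncols):
        return None
    v = grid[row][col]
    return None if v == header.get("nodata_value") else v


# Hypothetical 3 x 2 grid, not one of the deposited rasters
example_asc = """ncols 3
nrows 2
xllcorner 10.0
yllcorner 50.0
cellsize 1.0
NODATA_value -9999
1 2 3
4 -9999 6
"""
header, grid = read_asc(example_asc)
print(value_at(header, grid, 10.5, 51.5))
```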
Genetic_diversity
Contains genetic diversity metrics per locality for microsatellites (GD_msats.csv) and SNPs (GD_all_sites.csv). Files included in this directory are:
- GD_microsatellites.csv - microsatellite derived genetic diversity metrics (population, site number, Long (y), lat (x), private alleles (PA), allelic richness (Ar), observed heterozygosity (Ho), expected heterozygosity (He))
- GD_SNPs.csv - SNP derived genetic diversity metrics (population, site number, Long (y), lat (x), private alleles (PA), allelic richness (Ar), observed heterozygosity (Ho), expected heterozygosity (He))
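For readers unfamiliar with the Ho and He columns, the following minimal sketch computes both for a single locus using the textbook definitions: Ho is the proportion of heterozygous genotypes, and He is one minus the sum of squared allele frequencies. It applies no sample-size correction and is not necessarily how the values in these files were derived; the function name and the example genotypes are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (Ho, He) for one locus.

    `genotypes` is a list of (allele1, allele2) tuples; None marks missing data.
    He is the uncorrected estimate 1 - sum(p_i^2).
    """
    typed = [g for g in genotypes if g is not None]
    n = len(typed)
    ho = sum(1 for a, b in typed if a != b) / n
    counts = Counter(allele for g in typed for allele in g)
    total = 2 * n  # diploid: two allele copies per typed individual
    he = 1 - sum((c / total) ** 2 for c in counts.values())
    return ho, he


# Hypothetical locus with four typed individuals and one missing genotype
locus = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "B"), None]
ho, he = heterozygosity(locus)
print(ho, he)
```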
IBD_test
Contains pairwise FST distances between localities (standard and Rousset’s FST), and geographic distances for testing for Isolation By Distance. Files included in this directory are:
- msat_distances.csv, msat_fst.csv, msat_fst_rousset.csv - pairwise distance matrices of geographic distance, standard FST and Rousset’s FST (using microsatellite data)
- SNP_distances.csv, SNPs_fst.csv, SNPs_fst_rousset.csv - pairwise distance matrices of geographic distance, standard FST and Rousset’s FST (using SNP data)
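The relationship between these matrices can be sketched as follows. Rousset's linearised FST is FST / (1 - FST), and a Mantel test correlates the off-diagonal entries of two distance matrices, obtaining a p-value by jointly permuting the rows and columns of one matrix. The code below is a plain-Python illustration under those definitions, with hypothetical function names, a one-tailed test and invented example matrices; the analysis itself is performed in IBD_mantel_test.R.

```python
import random
from math import sqrt


def rousset(fst):
    """Rousset's linearised FST: FST / (1 - FST)."""
    return fst / (1 - fst)


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def mantel(a, b, permutations=999, seed=1):
    """One-tailed Mantel test; returns (observed r, permutation p-value)."""
    n = len(a)
    idx = list(range(n))

    def flat(m, order):
        # Upper triangle of m after reordering rows and columns by `order`
        return [m[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

    x = flat(a, idx)
    r_obs = pearson(x, flat(b, idx))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        perm = idx[:]
        rng.shuffle(perm)
        if pearson(x, flat(b, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Hypothetical matrices for four localities (not values from the dataset)
geo = [[0, 10, 20, 30], [10, 0, 10, 20], [20, 10, 0, 10], [30, 20, 10, 0]]
fst = [[0, .02, .04, .06], [.02, 0, .02, .04], [.04, .02, 0, .02], [.06, .04, .02, 0]]
fst_lin = [[rousset(v) for v in row] for row in fst]
r_obs, p_value = mantel(geo, fst_lin, permutations=199)
print(r_obs, p_value)
```

With only four localities there are just 24 distinct permutations, so the p-value in this toy example is coarse; with 38 localities the permutation space is far larger.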
RDA
Contains imputed genotype SNP data. Files included in this directory are:
- Alytes_imp.geno - imputed data in geno format
- Alytes_imp.lfmm - imputed data in lfmm format
- Alytes_imp.csv - imputed data in csv format
Stacks
Data for exploration of parameter ranges when processing genomic data to create output files for downstream analysis. Files included in this directory are:
- denovo_map_test_parameter_ranges.csv - a .csv file containing the combinations of parameter ranges to test for Stacks (to optimise the dataset pre-analysis)
- denovo_full_popmap - the popmap file for 70 individuals (i.e. a full analysis)
- denovo_test_popmap_n=8 - the popmap file for a subset of 8 individuals for optimising parameters
SCRIPTS: Barratt_et_al_2024_Heredity_scripts.zip contains the following scripts in two directories, linked to each of the analyses in the data folders described above. All scripts are annotated step by step.
R_scripts
plot_GD.R
DAPC_microsatellites.R
DAPC_SNPs.R
extract_env_predictor_data.R
FST.R
GEA_adaptive_diversity.R
GEA_pop_structure_impute_missing_data.R
GEA_RDA_candidate_loci_categorising_indvs.R
GEA_RDA.R
IBD_mantel_test.R
plot_ADMIXTURE.R
plot_EEMS.R
bash_scripts
Stacks_02_make_EEMS_files_plink.sh
Stacks_01c_denovo_map_full.sh
Stacks_01b_extract_results.sh
Stacks_01a_denovo_map_test.parameters.sh
run_EEMS.sh
run_all_admixture.sh
admixture.sh
This is a novel dataset generated from sampling Alytes obstetricans across its northern and eastern range limits in Germany and Belgium. Our dataset represents 38 unique localities and contains 467 individuals genotyped at up to 9 microsatellite loci, and 70 individuals with ddRAD-seq data (our analyses are based on 8650 SNPs).