VCF files and regression analyses for: Assessing fine-scale pondscape connectivity with amphibian eyes: an integrative approach using genomic and capture-mark-recapture data
Data files
Nov 03, 2023 version files 9.68 GB
- EC.rar
- Epidalea_calamita.vcf
- LB.rar
- Lissotriton_boscai.vcf
- Pelophylax_perezi.vcf
- Pleurodeles_waltl.vcf
- PP.rar
- PW.rar
- README.md
- TP.rar
- Triturus_pygmaeus.vcf
Abstract
In the face of habitat loss, preserving functional connectivity is essential to maintain genetic diversity and the demographic dynamics required for the viability of biotic communities. This requires knowledge of the dispersal behavior of target species, which can be modeled as kernels, or probability density functions of dispersal distances at increasing geographic distances. We present an integrative approach to investigate the relationships between genetic connectivity and demographic parameters in organisms with low vagility focusing on five syntopic pond-breeding amphibians. We genotyped 1,056 individuals of two anuran and three urodele species (1,732–3,913 SNPs per species) from populations located in a landscape comprising 64 ponds to characterize fine-scale genetic structure in a comparative framework and combined this genetic data with information obtained in a previous two-year capture-mark-recapture (CMR) study. Specifically, we contrasted graphs reconstructed from genomic data with connectivity graphs based on dispersal kernels and demographic information obtained from CMR data from previous studies and assessed the effects of population size, population density, geographical distances, inverse movement probabilities and the presence of habitat patches potentially functioning as stepping stones on genetic differentiation. Our results suggest a significant influence of local population sizes on patterns of genetic connectivity at small spatial scales. In addition, movement records and cluster-derived kernels provide robust inferences on most likely dispersal paths that are consistent with genomic inferences on genetic connectivity. The integration of genetic and CMR data holds great potential for understanding genetic connectivity at spatial scales relevant to individual organisms, with applications for the implementation of management actions at the landscape level.
README: VCF files and regression analyses for "Assessing fine-scale pondscape connectivity with amphibian eyes: an integrative approach using genomic and capture-mark-recapture data"
https://doi.org/10.5061/dryad.gxd2547sf
Description of the data and file structure
This dataset contains both "raw", unfiltered VCF files obtained using the Stacks pipeline and R scripts for regression analyses using maximum-likelihood population effects models. In these models, genetic distance is the response variable and several demographic, geographic and movement-related variables are the explanatory variables. Data for the explanatory variables were obtained in a previous study. The study area is divided in half by a road; analyses were performed with and without cross-road connections between ponds. Scripts that exclude cross-road connections are identified by the suffix "_no_crossroad".
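The with/without cross-road distinction can be illustrated with a minimal Python sketch (the dataset's own scripts are in R and may handle this differently). It assumes each pond has a side label, as in the "SIDE" column described in this README; the pond IDs, side labels and the function name `pond_pairs` are hypothetical.

```python
from itertools import combinations


def pond_pairs(side_by_pond, allow_crossroad=True):
    """Return sorted pond pairs, optionally dropping pairs that span the road."""
    pairs = []
    for a, b in combinations(sorted(side_by_pond), 2):
        same_side = side_by_pond[a] == side_by_pond[b]
        if allow_crossroad or same_side:
            pairs.append((a, b))
    return pairs


# Hypothetical pond IDs and side labels, for illustration only.
sides = {"P1": "A", "P2": "A", "P3": "B"}

all_pairs = pond_pairs(sides)                               # every pairwise connection
same_side_pairs = pond_pairs(sides, allow_crossroad=False)  # cross-road pairs removed
```

Under these assumptions, a "_no_crossroad" style analysis would use only the pairs in `same_side_pairs`.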
Scripts and all necessary data are compressed in a .rar file for each species, so no file preparation or working-directory setup is needed. Just open the R script from its own folder and run it.
Five .rar files are provided, one for each species.
Inside each .rar file you may find, depending on the species:
- a .csv file with the suffix "*distances", which contains a vector of the movement distances, in meters, of all displacements recorded for each species. This information is used to generate dispersal kernels.
- an "extrainfo.txt" file, which contains information on the pond in which each individual was sampled ("POND" column) and the area of the study site where that pond is found ("SIDE" column). The column for individual IDs is titled "node_label".
- a .csv file with the prefix "nodes*", which contains information on each pond (pond IDs are listed in the "POOL" column): the area of the study site in which it is found ("SIDE" column), its geographic location (longitude and latitude in the "X" and "Y" columns, respectively), its water surface area in square meters ("area" column) and the estimated population size of the species of interest, in a column named with the species abbreviation (see abbreviations below). For example, in the folder that contains the Pelophylax perezi script and data, the column that contains population size is called "PP".
- two R scripts containing the analyses, one of them excluding cross-road connections and marked with the "_no_crossroad" suffix.
- a "squarematrix.txt" file, containing genetic distances between populations.
- a "squarematrix.ind.txt" file, containing the names of the pond IDs for the "squarematrix.txt" file.
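As an illustration of how the two "squarematrix" files fit together, the following Python sketch attaches pond IDs to a square distance matrix and flattens it into pairwise values. It assumes whitespace-delimited numbers, one pond ID per line, and the same pond order in both files; these layout details are not stated in this README and should be checked against the actual files. The function name and example values are hypothetical.

```python
def read_labelled_matrix(matrix_text, labels_text):
    """Map each unordered pair of labels to its value in a square matrix."""
    labels = labels_text.split()
    rows = [
        [float(value) for value in line.split()]
        for line in matrix_text.strip().splitlines()
    ]
    # Guard against a mismatch between the matrix and its label file.
    if len(rows) != len(labels) or any(len(row) != len(labels) for row in rows):
        raise ValueError("matrix shape does not match number of labels")
    return {
        (labels[i], labels[j]): rows[i][j]
        for i in range(len(labels))
        for j in range(i + 1, len(labels))
    }


# Made-up values standing in for the contents of the two files.
matrix_text = """0.0 0.12 0.30
0.12 0.0 0.25
0.30 0.25 0.0"""
labels_text = "P1\nP2\nP3"

pairwise = read_labelled_matrix(matrix_text, labels_text)
```

With real data, `matrix_text` and `labels_text` would be the text read from "squarematrix.txt" and "squarematrix.ind.txt".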
Some abbreviations used:
EC or IEC: Epidalea calamita
PP or IPP: Pelophylax perezi
PW or IPW: Pleurodeles waltl
TP or ITP: Triturus pygmaeus
LB or ILB: Lissotriton boscai
Abbreviations used in the
Sharing/Access information
Demographic, geographic and movement data were obtained in a previous study:
https://link.springer.com/article/10.1007/s10980-022-01520-x
Methods
The VCF files available in this dataset were obtained from tissue samples from five syntopic pond-breeding amphibian species: Epidalea calamita, Triturus pygmaeus, Pelophylax perezi, Lissotriton boscai and Pleurodeles waltl. All species were sampled in the same study area, a dehesa pondscape located in the municipality of Alpedrete, Madrid, Spain. Tissue samples consisted of finger clips, plus fin clips for Lissotriton boscai. Tissue was digested with proteinase K and DNA was extracted with the Promega Wizard Kit. Libraries were prepared using 3RAD, and size selection was performed with Pippin Prep prior to sequencing. Sequenced reads were processed with Stacks to obtain the VCF files. Geographic, demographic and movement-related information was obtained in a previous study performed within the same spatial and temporal frame. Genetic distances, available as genetic matrices, were obtained with EDENetworks using filtered .vcf files from ponds with at least 4 individuals.
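The "at least 4 individuals" criterion can be sketched in Python as a count of sampled individuals per pond. This is only an illustration, not the procedure used by the authors: it assumes "extrainfo.txt" is tab-delimited with a header row containing the "node_label", "POND" and "SIDE" columns described in the README, and the individual and pond IDs below are made up.

```python
import csv
import io
from collections import Counter


def ponds_with_min_samples(extrainfo_text, min_individuals=4, delimiter="\t"):
    """Return the sorted pond IDs that have at least `min_individuals` rows."""
    reader = csv.DictReader(io.StringIO(extrainfo_text), delimiter=delimiter)
    counts = Counter(row["POND"] for row in reader)
    return sorted(pond for pond, n in counts.items() if n >= min_individuals)


# Hypothetical table: four individuals from pond P1, two from pond P2.
example = (
    "node_label\tPOND\tSIDE\n"
    + "".join(f"ind{i}\tP1\tA\n" for i in range(4))
    + "ind4\tP2\tB\nind5\tP2\tB\n"
)

kept = ponds_with_min_samples(example)  # only P1 meets the threshold
```

If the real file uses a different delimiter, pass it through the `delimiter` argument.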