Data from: Molecular evidence of introgression between water frog species (Anura: Telmatobiidae) in the high Andes

Fibla, Pablo; Rojas-Hernández, Noemí; Méndez, Marco A.; Véliz, David

Published Oct 09, 2025 on Dryad. https://doi.org/10.5061/dryad.2jm63xt33

Data files

Oct 09, 2025 version files 6.65 MB

Abstract

Contact zones are classical natural laboratories to study speciation. Several evolutionary outcomes are expected from hybridization depending on the degree of reproductive isolation, ranging from the formation of hybrid swarms in early stages of speciation to the reinforcement of species differences when speciation is complete. The genus Telmatobius is a high Andean, diverse, but poorly known group of neotropical frogs. Although hybridization between diverged species is frequent in amphibians, it has not been observed in this group. Here, we studied hybridization processes among three neighboring Telmatobius species that inhabit different altitudes in the same desert basin using nuDNA SNP and mitochondrial DNA data. The results suggest that the Chilean population of Telmatobius peruvianus has hybridized to different degrees with the species T. pefauri and T. marmoratus. We detected mitochondrial and nuclear introgression from T. peruvianus to T. pefauri and what appears to be historical introgression between T. marmoratus and T. peruvianus. Neither first-generation hybrids nor parental genotypes were detected in admixed localities, suggesting that the inferred hybridization processes did not occur recently and that several generations of backcrossing have passed in geographical isolation. Instead of representing stable hybrid zones, hybrid populations show a degree of genetic differentiation from parental populations. In this group of amphibians, nearly 2 million years of allopatric divergence have not been enough to develop reproductive isolation between diverged species. The data used in the genetic analyses of Fibla et al. (2025) are provided here. They include a genlight file (nuDNA SNP matrix) with all samples, two separate genlight files (nuDNA SNP matrices), one for each parental scheme, and a FASTA file containing the D-loop DNA sequence matrix.

Dataset DOI: 10.5061/dryad.2jm63xt33

Description of the data and file structure

Raw SNP data from DArT (50,688 SNPs) were filtered using the dartR library (Mijangos et al., 2022) implemented in R (R Core Team, 2024). To improve the quality of our dataset and reduce genotyping errors, we retained only one SNP in reads containing two or more SNPs (39,390 SNPs retained). We eliminated: i) loci with a read depth below five or above 200 (31,234 SNPs retained), ii) loci with <99% reproducibility (23,926 SNPs retained), iii) monomorphic loci (23,926 SNPs retained), iv) loci with >10% missing data (12,758 SNPs retained), v) individuals with >5% missing data (93 individuals retained), and vi) all SNPs with a minor allele frequency (MAF) <1% (11,550 SNPs retained). We also eliminated all loci identified as under selection by any of three different approaches (10,516 SNPs retained): i) a likelihood-based method implemented in the outflank function of the dartR library, ii) a Bayesian method implemented in the BayeScan program (Foll and Gaggiotti, 2008), and iii) a method based on the relationship between F_ST and heterozygosity, implemented in the fsthet library (Flanagan and Jones, 2017) in R. Loci with significant departures from Hardy–Weinberg equilibrium in all sampling sites were also removed (8,898 SNPs retained) using the dartR library. Finally, loci showing linkage disequilibrium >0.5 in all sampling sites were removed with PLINK 2.0 (Chang et al., 2015). The final SNP dataset (all localities) consisted of 6,328 unlinked SNPs and 93 individuals. The SNP datasets for the two parental schemes were each filtered separately using the same process.
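The quality filters listed above correspond to filtering functions available in dartR v2. The following is a minimal sketch rather than the authors' script: the source does not state which calls or arguments were used, so the choice of functions, the `method = "random"` option, and the wrapper name `filter_snps_sketch` are assumptions. It covers only the steps up to the MAF filter and omits the outlier, Hardy–Weinberg, and linkage-disequilibrium steps.

```r
# Hypothetical wrapper (not from the source). `gl_raw` is assumed to be an
# unfiltered genlight object that still carries the DArT locus metrics
# that the dartR filters rely on.
filter_snps_sketch <- function(gl_raw) {
  # Keep one SNP per read (assumed here: a randomly chosen one)
  gl <- dartR::gl.filter.secondaries(gl_raw, method = "random")
  # Drop loci with read depth below 5 or above 200
  gl <- dartR::gl.filter.rdepth(gl, lower = 5, upper = 200)
  # Drop loci with < 99% reproducibility
  gl <- dartR::gl.filter.reproducibility(gl, threshold = 0.99)
  # Drop monomorphic loci
  gl <- dartR::gl.filter.monomorphs(gl)
  # Drop loci with > 10% missing data (call rate below 0.90)
  gl <- dartR::gl.filter.callrate(gl, method = "loc", threshold = 0.90)
  # Drop individuals with > 5% missing data (call rate below 0.95)
  gl <- dartR::gl.filter.callrate(gl, method = "ind", threshold = 0.95)
  # Drop SNPs with minor allele frequency below 1%
  gl <- dartR::gl.filter.maf(gl, threshold = 0.01)
  gl
}
```

The genlight files in this dataset are already filtered, so a function like this would only be useful on raw DArT data, which are not part of this deposit.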

The DNA sequence matrix (FASTA file) consists of a fragment of the mitochondrial control region (CR; approx. 945 bp) that was amplified using the primers and conditions specified in Fibla et al. (2023). PCR products were sequenced using the Sanger method by Macrogen Inc. (Korea). The sequences obtained were aligned and manually edited in BioEdit v.7.2.0 (Hall, 1999). The DNA sequence matrix used in the analyses was constructed using the ClustalW algorithm for multiple alignments integrated within BioEdit v.7.2.0.

References

Chang, C. C., C. C. Chow, L. C. A. M. Tellier, S. Vattikuti, S. M. Purcell, and J. J. Lee. 2015. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience 4: s13742-015-0047-8.

Fibla, P., N. Rojas-Hernández, M. A. Méndez, and D. Véliz. 2025. Molecular evidence of introgression between water frog species (Anura: Telmatobiidae) in the high Andes. Molecular Ecology: e70122.

Fibla, P., P. A. Sáez, F. Cruz-Jofré, and M. A. Méndez. 2023. Drainage network morphology influences population structure and gene flow of the Andean water frog Telmatobius pefauri (Anura: Telmatobiidae) of the Atacama Desert, Northern Chile. Zoological Studies 62: 44.

Flanagan, S. P., and A. G. Jones. 2017. Constraints on the Fst-heterozygosity outlier approach. Journal of Heredity 108: 561–573.

Foll, M., and O. E. Gaggiotti. 2008. A genome scan method to identify selected loci appropriate for both dominant and codominant markers: a Bayesian perspective. Genetics 180: 977–993.

Hall, T. A. 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symposium Series 41: 95–98.

Mijangos, J. L., B. Gruber, O. Berry, C. Pacioni, and A. Georges. 2022. dartR v2: an accessible genetic analysis platform for conservation, ecology and agriculture. Methods in Ecology and Evolution 13: 2150–2158.

R Core Team. 2024. R: a language and environment for statistical computing. R Foundation for Statistical Computing.

Files and variables

File: Telmatobius_Fibla_et_al.fas

Description: DNA sequence (D-loop) matrix, FASTA file.

File: gl_all_locs

Description: SNP matrix, genlight file, all localities.

File: gl_marmoratus_x_peruvianus

Description: SNP matrix, genlight file, parental scheme: T. marmoratus x T. peruvianus.

File: gl_peruvianus_x_pefauri

Description: SNP matrix, genlight file, parental scheme: T. peruvianus x T. pefauri.

Code/software

The DNA sequence matrix (FASTA file) was constructed using the ClustalW algorithm for multiple alignments integrated within BioEdit v.7.2.0.
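Because the FASTA file is an alignment matrix, a quick sanity check after download is that every sequence has the same length. The sketch below does this in base R only. The helper names `read_fasta` and `alignment_summary` are hypothetical and not part of the dataset or of any package.

```r
# Minimal base-R FASTA reader: returns a named character vector,
# one element per sequence, with wrapped lines joined together.
read_fasta <- function(path) {
  lines <- readLines(path, warn = FALSE)
  lines <- lines[nzchar(trimws(lines))]        # drop blank lines
  is_header <- startsWith(lines, ">")
  stopifnot(length(lines) > 0, is_header[1])   # file must start with a header
  record <- cumsum(is_header)                  # record index of every line
  groups <- split(
    lines[!is_header],
    factor(record[!is_header], levels = seq_len(sum(is_header)))
  )
  seqs <- vapply(groups, paste, character(1), collapse = "")
  names(seqs) <- sub("^>", "", lines[is_header])
  seqs
}

# Report how many sequences there are and whether they share one length,
# as expected for an aligned matrix.
alignment_summary <- function(seqs) {
  widths <- unname(nchar(seqs))
  list(
    n_sequences = length(seqs),
    aligned     = length(unique(widths)) == 1L,
    width_range = range(widths)
  )
}

# Example call, assuming the file is in the working directory:
# alignment_summary(read_fasta("Telmatobius_Fibla_et_al.fas"))
```

For downstream analyses, a dedicated reader such as `ape::read.dna(path, format = "fasta")` is probably a better choice than this hand-rolled parser, which is intended only for checking the file.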

The SNP matrices were obtained using the dartR package in R 4.2.2. The genlight files can be read in R with the following commands:

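install.packages("dartR")  # only needed once

library(dartR)

gl_object <- readRDS(file.choose())  # choose one of the gl_* files interactively

gl_object  # print a summary of the genlight object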
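Once a file loads, it can be checked against the numbers reported in the description: the all-localities dataset should contain 93 individuals and 6,328 SNPs. The following is a minimal non-interactive sketch. The helper name `summarise_genlight` is hypothetical, and it assumes the files are in the working directory and that population labels are stored in the object.

```r
# Hypothetical helper: read a genlight file by path and return its dimensions.
summarise_genlight <- function(path) {
  # The class definitions must be available before the object is inspected
  if (!requireNamespace("dartR", quietly = TRUE)) {
    stop("Package 'dartR' is required; install it first.")
  }
  gl <- readRDS(path)
  if (!methods::is(gl, "genlight")) {
    stop("Not a genlight object: ", path)
  }
  list(
    individuals    = adegenet::nInd(gl),       # number of individuals
    snps           = adegenet::nLoc(gl),       # number of SNP loci
    per_population = table(adegenet::pop(gl))  # sample size per population label
  )
}

# Example call, assuming the file is in the working directory:
# summarise_genlight("gl_all_locs")
```

The same call should work for `gl_marmoratus_x_peruvianus` and `gl_peruvianus_x_pefauri`, although the description does not state the expected counts for those two files.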

Access information

Other publicly accessible locations of the data:

DNA sequences were also deposited in GenBank (accession numbers PX380050-PX380096).