Hybrid zone analysis using coalescent-based estimates of introgression and migration in plateau fence lizards (Sceloporus tristichus)
Data files
Jul 04, 2025 version; files total 49.52 MB
- BPP.zip (950.95 KB)
- hybridzone.vcf (37.46 MB)
- hzar.csv (763 B)
- mtDNA_775.phy (767.61 KB)
- phylogeography_concatenated.phy (4.37 MB)
- phylogeography.vcf (5.97 MB)
- README.md (3.27 KB)
Abstract
Coalescent modeling of hybrid zones can provide novel insights into the historical demography of populations, including divergence times, population sizes, introgression proportions, migration rates, and the timing of hybrid zone formation. We used coalescent analysis to determine whether the hybrid zone between phylogeographic lineages of the Plateau Fence Lizard (Sceloporus tristichus) in Arizona formed recently due to human-induced landscape changes or originated during Pleistocene climatic shifts. Given the presence of mitochondrial DNA from another species in the hybrid zone (Southwestern Fence Lizard, S. cowlesi), we tested for the presence of S. cowlesi nuclear DNA in the hybrid zone and reassessed the species boundary between S. tristichus and S. cowlesi. The data files supplied here include mtDNA sequences and nuclear DNA collected using ddRADseq for both study species. Files for phylogeographic and hybrid zone analysis are provided.
Dataset DOI: 10.5061/dryad.47d7wm3rz
Description of the data and file structure
The data include mtDNA sequences and ddRADseq data.
Files and variables
File: hzar.csv
Description: Hybrid zone data for conducting spatial cline analysis in HZAR.
Variables
- locationID: Population
- distance: Distance from northern population in km
- snp2002.Q: 2002 SNP admixture Q value (under K=2)
- snp2002.N: 2002 sample size
- snp2012.Q: 2012 SNP admixture Q value (under K=2)
- snp2012.N: 2012 sample size
- snps2022.Q: 2022 SNP admixture Q value (under K=2)
- snps2022.N: 2022 sample size
- snpsAll.Q: all years combined SNP admixture Q value (under K=2)
- snpsAll.N: all years combined sample size
- mtdna2022.N: 2022 sample size
- mtdna2022.F: 2022 mtDNA frequency of northern haplotype (vs. southern)
- mtdna2002.F: 2002 mtDNA frequency of northern haplotype (vs. southern)
- mtdna2002.N: 2002 sample size
- mtdna2012.F: 2012 mtDNA frequency of northern haplotype (vs. southern)
- mtdna2012.N: 2012 sample size
- mtdnaALL.N: all years combined sample size
- mtdnaALL.F: all years combined mtDNA frequency of northern haplotype (vs. southern)
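As a quick check before loading the table into HZAR, individual columns can be pulled out by header name. The following is a minimal shell sketch, assuming hzar.csv is a plain comma-separated file with a single header row whose names match the variables listed above, and with no quoted fields. The helper name csv_cols is hypothetical and not part of the dataset.

```shell
#!/usr/bin/env bash
# Print selected columns of a comma-separated file, looked up by header name.
# Assumption: one header row, no quoted fields containing commas.
# Usage: csv_cols FILE COL1 [COL2 ...]
csv_cols() {
  local file=$1; shift
  local wanted
  wanted=$(IFS=,; echo "$*")              # join requested names with commas
  awk -F',' -v wanted="$wanted" '
    { sub(/\r$/, "") }                     # tolerate Windows line endings
    NR == 1 {
      n = split(wanted, names, ",")
      for (i = 1; i <= NF; i++) idx[$i] = i   # header name -> column number
      for (j = 1; j <= n; j++) {
        if (!(names[j] in idx)) {
          printf "column not found: %s\n", names[j] > "/dev/stderr"
          exit 1
        }
        cols[j] = idx[names[j]]
      }
    }
    {
      line = $(cols[1])
      for (j = 2; j <= n; j++) line = line "," $(cols[j])
      print line
    }
  ' "$file"
}
```

For example, `csv_cols hzar.csv locationID distance snpsAll.Q snpsAll.N` would print the combined-year SNP cline inputs, header row included. A misspelled column name makes the function fail with an error rather than silently printing an empty column.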
File: mtDNA_775.phy
Description: mtDNA sequence alignment (ND1 gene).
File: phylogeography.vcf
Description: SNP data for the phylogeographic analysis of S. tristichus and S. cowlesi.
File: phylogeography_concatenated.phy
Description: Concatenated ddRADseq data for the phylogenetic analysis of S. tristichus and S. cowlesi. Sites with >10% missing data were removed.
File: hybridzone.vcf
Description: SNP data for the hybrid zone analysis of S. tristichus, all years combined.
File: BPP.zip
Description: ddRADseq data, imap files, and control files for BPP analyses of introgression and migration, organized into seven folders (six MSC-I; one MSC-M). The six MSC-I folders correspond to the six hybrid populations, and each contains the files necessary to run the MSC-I model in BPP. The MSC-M folder corresponds to the migration analysis of the parental populations.
Code/software
Phylogenetic analysis with iqtree
*.phy files are ready to run in iqtree:
iqtree -s mtDNA_775.phy -alrt 1000 -bb 1000
iqtree -s phylogeography_concatenated.phy -alrt 1000 -bb 1000
Admixture analysis of population structure
*.vcf files are used to run admixture. Filtering is required (we used vcftools); then convert to bed format using plink2. In the commands below, *.vcf stands for a single input VCF file:
vcftools --vcf *.vcf --max-missing 0.5 --max-alleles 2 --maf 0.01 --thin 50 --recode --out filtered
plink2 --vcf filtered.recode.vcf --allow-extra-chr --make-bed --out final
admixture run:
for k in {1..8}; do
  admixture final.bed --seed=$RANDOM --cv=50 -C 0.0001 $k > results$k.txt
done
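Because the loop runs admixture with cross-validation enabled, each results file should contain a line of the form `CV error (K=3): 0.52305`, which is how ADMIXTURE reports cross-validation error. The sketch below collects those lines and sorts them so that the K with the lowest error comes first. It assumes the logs are named results<K>.txt as in the loop; the function name cv_table is hypothetical.

```shell
#!/usr/bin/env bash
# List ADMIXTURE cross-validation errors as "K error", lowest error first.
# Assumption: each log holds a line like "CV error (K=3): 0.52305".
# Usage: cv_table results*.txt
cv_table() {
  grep -h 'CV error' "$@" \
    | sed -E 's/.*\(K=([0-9]+)\): *([0-9.]+).*/\1 \2/' \
    | LC_ALL=C sort -k2,2n
}
```

Running `cv_table results*.txt | head -n 1` prints the K with the lowest cross-validation error. That is one common criterion for choosing K and is not necessarily how K=2 was selected for the Q values in hzar.csv.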
BPP analysis of introgression and migration
bpp --cfile bpp.ctl
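Since BPP.zip holds seven analysis folders, the command above has to be run once per folder. The sketch below finds every folder that contains a control file and either prints or runs the command there. It assumes that each control file is named bpp.ctl and that the paths inside it are relative to its own folder. The function name run_bpp_all, the DRY_RUN switch, and the bpp.log output name are hypothetical.

```shell
#!/usr/bin/env bash
# Run BPP in every folder under ROOT that contains a bpp.ctl control file.
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 executes them.
# Usage: run_bpp_all [ROOT]
run_bpp_all() {
  local root=${1:-.}
  find "$root" -type f -name 'bpp.ctl' | LC_ALL=C sort | while IFS= read -r ctl; do
    dir=$(dirname "$ctl")
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "(cd $dir && bpp --cfile bpp.ctl)"
    else
      # Run inside the folder so relative paths in the control file resolve.
      (cd "$dir" && bpp --cfile bpp.ctl > bpp.log 2>&1)
    fi
  done
}
```

After unpacking the archive, `run_bpp_all BPP` lists what would be run, and `DRY_RUN=0 run_bpp_all BPP` runs the analyses one after another. The directory name BPP is an assumption about how the archive unpacks.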
Access information
Other publicly accessible locations of the data:
- Mitochondrial DNA sequences: GenBank accessions PQ901620–PQ901697 and PV462017–PV462129.
- Nuclear ddRADseq data: NCBI SRA Accession PRJNA1211730
Mitochondrial DNA: The mitochondrial ND1 protein-coding gene (969 bp) was amplified and sequenced using standard PCR methods. New sequences were combined with previously published data, which included additional hybrid zone samples and broad-scale sampling of other species. GenBank numbers for each sample are listed in the supplemental table associated with the manuscript.
Nuclear DNA: The nuclear data were collected using double-digest restriction site-associated DNA sequencing (ddRADseq). All samples collected in 2002, 2012, and 2022 were assembled together (335 hybrid zone samples). For the phylogeographic analysis, a total of 56 samples were used, with species assignments determined by mtDNA-based phylogenetic placement.