
Contrasting signatures of introgression in North American box turtle (Terrapene spp.) contact zones

Cite this dataset

Martin, Bradley T. et al. (2020). Contrasting signatures of introgression in North American box turtle (Terrapene spp.) contact zones [Dataset]. Dryad.


Hybridization occurs differentially across the genome in a balancing act between selection and migration. With the unprecedented resolution of contemporary sequencing technologies, selection and migration can now be effectively quantified such that researchers can identify genetic elements involved in introgression. Furthermore, genomic patterns can now be associated with ecologically relevant phenotypes, given availability of annotated reference genomes. We do so in North American box turtles (Terrapene) by deciphering how selection affects hybrid zones at the interface of species-boundaries and identifying genetic regions potentially under selection that may relate to thermal adaptations. Such genes may impact physiological pathways involved in temperature-dependent sex determination, immune system functioning, and hypoxia tolerance. We contrasted these patterns across inter- and intra-specific hybrid zones that differ temporally and biogeographically. We demonstrate hybridization is broadly apparent in Terrapene, but with observed genomic cline patterns corresponding to species boundaries at loci potentially associated with thermal adaptation. These loci display signatures of directional introgression within intra-specific boundaries, despite a genome-wide selective trend against intergrades. In contrast, outlier loci for inter-specific comparisons exhibited evidence of being under selection against hybrids. Importantly, adaptations coinciding with species-boundaries in Terrapene overlap with climatic boundaries and highlight the vulnerability of these terrestrial ectotherms to anthropogenic pressures.


  • Samples were sequenced on an Illumina HiSeq 4000 at 1x100 bp.
  • Reads were demultiplexed and aligned using ipyrad.
  • The "scaffold alignment" (AKA "fulldataset") was mapped to the Terrapene mexicana triunguis reference genome (scaffold-level; GenBank Accession: GCA_002925995.2).
  • The "transcriptome alignment" (AKA "genes") was mapped to the T. m. triunguis reference transcriptome to obtain annotation information.

Usage notes

See the readme.txt file for comments on each of the directories and files.

The Terrapene population codes are as follows:

  • Terrapene carolina
    • T. carolina carolina - Woodland box turtle (EA)
    • T. carolina major - Gulf Coast box turtle (GU)
  • Terrapene mexicana
    • T. mexicana triunguis - Three-toed box turtle (TT)
  • Terrapene ornata
    • T. ornata ornata - Ornate box turtle (ON)

Input files are included for the following analyses:

  • ADMIXTURE (via the AdmixPipe pipeline)
  • NewHybrids (via HybridDetective and parallelnewhybrid)
  • TESS3 (via tess3r)
  • Genomic clines
    • BGC (Bayesian Genomic Clines)
  • RDA - Redundancy Analysis

Input files are in either VCF or GENEPOP format. The VCF files have been filtered for missing data (50%), limited to 1 SNP per ddRAD locus, and restricted to bi-allelic sites.
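The three VCF filters described above can be illustrated with a minimal Python sketch. This is not the pipeline used to produce the deposited files. The function name `filter_sites`, the simplified record layout (a dict per site), the choice to apply the 50% threshold per site, and the choice to keep the first passing SNP per locus are all assumptions made for illustration.

```python
def filter_sites(sites, max_missing=0.5):
    """Apply three illustrative filters to simplified VCF-like records.

    Each site is a dict with hypothetical keys:
      'locus'     - identifier of the ddRAD locus the SNP belongs to
      'pos'       - position of the SNP within the locus
      'alt'       - list of alternate alleles
      'genotypes' - list of genotype strings, e.g. '0/1', with './.' as missing
    """
    kept = {}
    for site in sites:
        # Bi-allelic sites only: exactly one alternate allele.
        if len(site["alt"]) != 1:
            continue
        gts = site["genotypes"]
        if not gts:
            continue
        # Assumption: the missing-data threshold is applied per site.
        missing = sum(1 for g in gts if "." in g) / len(gts)
        if missing > max_missing:
            continue
        # Assumption: keep the first passing SNP encountered for each locus.
        kept.setdefault(site["locus"], site)
    # Dicts preserve insertion order, so loci come back in input order.
    return list(kept.values())
```

A site with exactly 50% missing genotypes is retained under this sketch, since only sites above the threshold are dropped; a different convention would change that boundary case.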

For the RDA, missing data were imputed per population in one of the included R scripts, because RDA cannot accommodate missing data.
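As a rough illustration of per-population imputation, the Python sketch below fills each missing genotype with the most common genotype observed at that locus within the same population. The included R scripts are the authoritative implementation; the imputation rule used here (the within-population mode), the function name `impute_by_population`, and the genotype encoding (allele counts 0/1/2, with `None` for missing) are assumptions, not details taken from the dataset.

```python
from collections import Counter


def impute_by_population(genotypes, populations):
    """Fill missing genotypes with the within-population mode at each locus.

    genotypes   - list of rows, one per individual; each row is a list of
                  allele counts (0, 1, or 2) with None marking missing data
    populations - list of population codes (e.g. 'EA', 'TT'), one per row
    """
    n_loci = len(genotypes[0]) if genotypes else 0
    imputed = [list(row) for row in genotypes]  # copy; input is not modified
    for pop in set(populations):
        rows = [i for i, p in enumerate(populations) if p == pop]
        for j in range(n_loci):
            observed = [genotypes[i][j] for i in rows
                        if genotypes[i][j] is not None]
            if not observed:
                # No data for this locus in this population: leave as missing.
                continue
            mode = Counter(observed).most_common(1)[0][0]
            for i in rows:
                if imputed[i][j] is None:
                    imputed[i][j] = mode
    return imputed
```

Loci with no observed genotypes in a population stay missing under this sketch, so they would still need to be dropped or handled separately before running an analysis that requires a complete matrix.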

NewHybrids was run via the hybridDetective and parallelnewhybrid pipelines in R. The scripts to do so are located at: