Contrasting signatures of introgression in North American box turtle (Terrapene spp.) contact zones
Martin, Bradley T. et al. (2020), Contrasting signatures of introgression in North American box turtle (Terrapene spp.) contact zones, Dryad, Dataset, https://doi.org/10.5061/dryad.brv15dv7k
- Samples were sequenced on an Illumina HiSeq 4000 with 1x100 bp (single-end) reads.
- Reads were demultiplexed and aligned using ipyrad.
- The "scaffold alignment" (AKA "fulldataset") was generated by mapping reads to the Terrapene mexicana triunguis reference genome (scaffold-level; GenBank Accession: GCA_002925995.2).
- The "transcriptome alignment" (AKA "genes") was generated by mapping reads to the T. m. triunguis reference transcriptome to obtain annotation information.
See the readme.txt file for comments on each of the directories and files.
The Terrapene population codes are as follows:
- T. carolina carolina - Woodland box turtle (EA)
- T. carolina major - Gulf Coast box turtle (GU)
- T. mexicana triunguis - Three-toed box turtle (TT)
- T. ornata ornata - Ornate box turtle (ON)
Input files are included for the following analyses:
- ADMIXTURE (via the AdmixPipe pipeline)
- NewHybrids (via HybridDetective and parallelnewhybrid)
- TESS3 (via tess3r)
- Genomic clines
- BGC (Bayesian Genomic Clines)
- RDA - Redundancy Analysis
Input files are in either VCF or GENEPOP format. Additionally, the VCF files have been filtered for missing data (50%), thinned to 1 SNP per ddRAD locus, and restricted to bi-allelic sites.
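The three VCF filters described above can be illustrated with a minimal Python sketch. It is not the pipeline used to produce these files, and it makes several assumptions that the description does not confirm: that the 50% threshold is applied per site (a site is dropped when more than half of the genotypes are missing), that the first passing SNP is the one kept for each locus, and that a locus can be identified from the CHROM column. The function names and the `locus_key` parameter are hypothetical.

```python
def is_missing(sample_field):
    """True if the GT subfield (first colon-separated entry) has a missing allele."""
    gt = sample_field.split(":", 1)[0]
    return "." in gt


def filter_vcf_lines(lines, max_missing=0.5, locus_key=lambda fields: fields[0]):
    """Return VCF lines that pass three illustrative filters.

    Assumptions (not confirmed by the dataset description):
      * max_missing is a per-site fraction of missing genotypes;
      * locus_key maps a record to its ddRAD locus (default: CHROM column);
      * the first passing SNP per locus is the one retained.
    """
    seen_loci = set()
    kept = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            kept.append(line)  # header lines pass through unchanged
            continue
        fields = line.split("\t")
        alt = fields[4]
        samples = fields[9:]
        # Bi-allelic only: exactly one ALT allele.
        if alt == "." or "," in alt:
            continue
        if not samples:
            continue
        # Missing-data filter on the fraction of missing genotypes at this site.
        missing_fraction = sum(is_missing(s) for s in samples) / len(samples)
        if missing_fraction > max_missing:
            continue
        # One SNP per locus: skip loci that already contributed a SNP.
        locus = locus_key(fields)
        if locus in seen_loci:
            continue
        seen_loci.add(locus)
        kept.append(line)
    return kept
```

If the loci in a given file are not identified by the CHROM column (for example, when CHROM holds a scaffold name), a different `locus_key` function would be needed.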
For the RDA, missing data were imputed per population in one of the included R scripts, as RDA requires a complete data set (no missing values).
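As an illustration of per-population imputation, the Python sketch below fills each missing genotype with the most common genotype observed at that site in the individual's own population. This is not a port of the included R script; the imputation rule, the 0/1/2 genotype coding, the fallback to the overall most common genotype when a population has no observed data at a site, and the function name are all assumptions made for the example.

```python
from collections import Counter


def impute_by_population(genotypes, populations):
    """Fill missing genotypes (None) with the modal genotype of each population.

    genotypes:   list of per-individual lists, coded 0/1/2 with None for missing
                 (an assumed coding, not taken from the dataset).
    populations: list of population codes, one per individual (e.g. "EA", "TT").
    Returns a new list of lists; the input is not modified.
    """
    def modal(counts):
        # Highest count wins; ties go to the smaller genotype code.
        return max(counts, key=lambda g: (counts[g], -g))

    n_sites = len(genotypes[0]) if genotypes else 0
    imputed = [row[:] for row in genotypes]
    for site in range(n_sites):
        # Counts across all individuals, used only as a fallback.
        global_counts = Counter(
            row[site] for row in genotypes if row[site] is not None
        )
        # Counts within each population.
        pop_counts = {}
        for row, pop in zip(genotypes, populations):
            if row[site] is not None:
                pop_counts.setdefault(pop, Counter())[row[site]] += 1
        for i, pop in enumerate(populations):
            if imputed[i][site] is None:
                counts = pop_counts.get(pop) or global_counts
                if counts:  # leave None if the site is missing everywhere
                    imputed[i][site] = modal(counts)
    return imputed
```

Imputing within populations rather than across the whole data set avoids pulling a population's genotypes toward those of the other taxa, which matters when the analysis is meant to contrast them.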
NewHybrids was run via the HybridDetective and parallelnewhybrid pipelines in R. The scripts used to do so are available at: https://github.com/btmartin721/R_scripts.