Multi-locus nuclear DNA data were used to delimit species of fringe-toed lizards of the Uma notata complex, which are specialized for living in wind-blown sand habitats in the deserts of southwestern North America, and to infer whether Quaternary glacial cycles or Tertiary geological events were important in shaping the historical biogeography of this group. We analyzed ten nuclear loci collected using Sanger sequencing and genome-wide sequence and single-nucleotide polymorphism (SNP) data collected using restriction-associated DNA (RAD) sequencing. A combination of species discovery methods (concatenated phylogenies, parametric and non-parametric clustering algorithms) and species validation approaches (coalescent-based species tree/isolation-with-migration models) was used to delimit species, infer phylogenetic relationships, and estimate effective population sizes, migration rates, and speciation times. Uma notata, U. inornata, U. cowlesi, and an undescribed species from Mohawk Dunes, Arizona (U. sp.) were supported as distinct in the concatenated analyses and by clustering algorithms, and all operational taxonomic units were decisively supported as distinct species by ranking hierarchical nested speciation models with Bayes factors based on coalescent-based species tree methods. However, significant unidirectional gene flow (2NM > 1) from U. cowlesi and U. notata into U. rufopunctata was detected under the isolation-with-migration model. Therefore, we conservatively delimit four species-level lineages within this complex (U. inornata, U. notata, U. cowlesi, and U. sp.), treating U. rufopunctata as a hybrid population (U. notata x cowlesi). Both concatenated and coalescent-based estimates of speciation times support the hypotheses that speciation within the complex occurred during the late Pleistocene, and that the geological evolution of the Colorado River delta during this period was an important process shaping the observed phylogeographic patterns.
Sanger_sequence_data
Sequencher v4.7 (Gene Codes Corp., Ann Arbor, MI) was used to assess data quality, trim primer sequences, align sequences, and call heterozygous sites. The resulting FASTA files are provided here and include alignments from Gottscho et al. (2014). See Appendix A for specimen information.
PHASE
PHASE v2.1 (Stephens et al. 2001) and seqPHASE (Flot 2010) were used to determine haplotypes. Input and output files are provided.
Geneland
Included are input files, R scripts and output files for an analysis of 10 Sanger loci in the R package Geneland v4.0.3 (Guillot 2008; Guillot et al. 2005, 2008).
starBEAST
Included are input xml files and a summary of results for Bayes Factor Delimitation (BFD, Grummer et al. 2014) and *BEAST (Heled and Drummond 2010) analysis of species trees in BEAST v1.8.1 (Drummond et al. 2012).
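Bayes factor delimitation ranks competing species models by their marginal likelihood estimates. The ranking step can be sketched in Python as follows, assuming the log marginal likelihoods have already been read out of the output logs by hand. The function name, model labels, and numbers are hypothetical placeholders, not results from this study, and the sketch uses the common convention 2lnBF = 2 x (lnMLE of best model - lnMLE of competing model).

```python
def rank_models(log_mle):
    """Rank models by log marginal likelihood and compute 2lnBF vs. the best.

    log_mle: dict mapping a model label to its log marginal likelihood.
    Returns a list of (label, log_mle, two_ln_bf) tuples, best model first.
    """
    if not log_mle:
        raise ValueError("at least one model is required")
    best = max(log_mle.values())
    ranked = sorted(log_mle.items(), key=lambda item: item[1], reverse=True)
    # 2lnBF is zero for the best model and positive for every other model.
    return [(name, value, 2.0 * (best - value)) for name, value in ranked]


if __name__ == "__main__":
    # Placeholder values for illustration only (NOT from the study).
    example = {"model_A": -10250.0, "model_B": -10262.5, "model_C": -10301.0}
    for name, value, two_ln_bf in rank_models(example):
        print(f"{name}\tlnMLE={value:.1f}\t2lnBF={two_ln_bf:.1f}")
```

Under this convention, larger 2lnBF values indicate stronger support for the best-ranked model over the model in that row; values above 10 are often read as decisive support.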
raw_HiSeq_data1
These data were collected using ddRADseq (Peterson et al. 2012) and sequenced on a single lane of an Illumina HiSeq 2500 at UC Riverside. They are already demultiplexed by Illumina index and adapter index. See Appendix A for specimen information.
raw_HiSeq_data2
These data were collected using ddRADseq (Peterson et al. 2012) and sequenced on a single lane of an Illumina HiSeq 2500 at UC Riverside. They are already demultiplexed by Illumina index and adapter index. See Appendix A for specimen information.
raw_HiSeq_data3
These data were collected using ddRADseq (Peterson et al. 2012) and sequenced on a single lane of an Illumina HiSeq 2500 at UC Riverside. They are already demultiplexed by Illumina index and adapter index. See Appendix A for specimen information.
raw_HiSeq_data4
These data were collected using ddRADseq (Peterson et al. 2012) and sequenced on a single lane of an Illumina HiSeq 2500 at UC Riverside. They are already demultiplexed by Illumina index and adapter index. See Appendix A for specimen information.
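Because the reads are already demultiplexed, a quick sanity check after download is to count the reads in each file. The following is a minimal Python sketch, assuming standard four-line FASTQ records and that files may be either plain text or gzip-compressed. The function name and command-line usage are illustrative and not part of the archived scripts.

```python
import gzip
import sys
from pathlib import Path


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming four lines per record."""
    path = Path(path)
    # Assumption: a ".gz" suffix means the file is gzip-compressed.
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


if __name__ == "__main__":
    # Usage: pass one or more downloaded FASTQ files on the command line.
    for name in sys.argv[1:]:
        print(f"{name}\t{count_fastq_reads(name)}")
```

A file whose line count is not a multiple of four raises an error, which usually points to a truncated download.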
pyRAD
We used pyRAD v2.1.2 (Eaton 2014) with MUSCLE v3.8.31 and USEARCH v7.0.1090 to filter and process the raw data files. Two example parameter files are provided. Please see “raw_HiSeq_data1” through “raw_HiSeq_data4” to download the .fastq files.
raxml
We used RAxML v8.1.1 (Stamatakis 2014) to create a maximum-likelihood phylogeny from our concatenated data. Input and output files are provided.
beast2
The folder BFD* contains .xml input files and output logs for Bayes Factor Delimitation with genomic data (Leaché et al. 2014) implemented in SNAPP/BEAST 2.3.1 (Bryant et al. 2012, Bouckaert et al. 2014). The concatenated folder includes input and output for a concatenated analysis of RAD data in BEAST 2.1.2. The SNAPP folder contains species tree input and output for SNAPP 1.1.5 implemented in BEAST 2.1.2.
smartPCA_Admixture
These two programs are grouped together because they share a common data format. Admixture was described by Alexander et al. (2009), smartPCA by Patterson et al. (2006). There is a folder with the R script used to create the input files, and separate folders for each analysis; R scripts for visualizing results are also provided.
DAPC
We ran Discriminant Analysis of Principal Components (DAPC) in the R package adegenet 2.0.0 (Jombart et al. 2010), with the structure (.str) file output by pyRAD as input. The input file and an annotated R script are provided.
gphocs
Included are data files, control files, log files, and output files for G-PhoCS v1.2.3 (Gronau et al. 2011). Results are summarized in G-PhoCs_results_120715.xlsx.
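Coalescent programs of this kind typically report parameters scaled by the mutation rate, so a conversion step is needed before interpreting them as population sizes, divergence times, or migrants per generation. The following is a minimal Python sketch of that arithmetic, assuming the usual definitions theta = 4 N mu, tau = T mu, and m = M / mu, where mu is the per-site, per-generation mutation rate and M is the per-generation migration rate. The function names and the choice to use the recipient population's theta for 2NM are assumptions for illustration; any rescaling applied to the printed output would need to be undone first.

```python
def effective_size(theta, mu):
    """Effective population size N from theta = 4 * N * mu."""
    return theta / (4.0 * mu)


def divergence_generations(tau, mu):
    """Divergence time in generations from tau = T * mu."""
    return tau / mu


def migrants_per_generation(theta_recipient, m_scaled):
    """2NM from scaled parameters.

    With N = theta / (4 * mu) and M = m_scaled * mu, the mutation rate
    cancels: 2 * N * M = theta * m_scaled / 2.
    """
    return theta_recipient * m_scaled / 2.0


if __name__ == "__main__":
    # Placeholder inputs for illustration only (NOT estimates from the study).
    mu = 1e-9
    print(effective_size(0.004, mu))
    print(divergence_generations(0.0005, mu))
    print(migrants_per_generation(0.004, 1000.0))
```

Note that 2NM does not depend on the assumed mutation rate, whereas N and T do, so the 2NM > 1 criterion can be checked directly from the scaled estimates.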