Dryad

Data from: Parallel tagged amplicon sequencing reveals major lineages and phylogenetic structure in the North American tiger salamander (Ambystoma tigrinum) species complex

Data files

Aug 30, 2012 version files 430.91 MB

Abstract

Modern analytical methods for population genetics and phylogenetics are expected to provide more accurate results when data from multiple genome-wide loci are analyzed. We present the results of an initial application of parallel tagged sequencing (PTS) on a next-generation platform to sequence thousands of barcoded PCR amplicons generated from 95 nuclear loci and 93 individuals sampled across the range of the tiger salamander (Ambystoma tigrinum) species complex. To manage the bioinformatic processing of this large data set (344,330 reads), we developed a pipeline that sorts PTS data by barcode and locus, identifies high-quality variable nucleotides, and yields phased haplotype sequences for each individual at each locus. Our sequencing and bioinformatic strategy resulted in a genome-wide data set with relatively low levels of missing data and a wide range of nucleotide variation. STRUCTURE analyses of these data in a genotypic format resulted in strongly supported assignments for the majority of individuals into nine geographically defined genetic clusters. Species tree analyses of the most variable loci using a multi-species coalescent model resulted in strong support for most branches in the species tree; however, analyses including more than 50 loci produced parameter sampling trends that indicated a lack of convergence on the posterior distribution. Overall, these results demonstrate the potential for amplicon-based PTS to rapidly generate large-scale data for population genetic and phylogenetic research.
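The first pipeline step described above, sorting reads by barcode and locus, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `demultiplex`, the read layout (barcode, then locus-specific primer, then insert), and the exact-match rule are all assumptions made for illustration, and a real workflow would likely tolerate sequencing errors and use quality scores.

```python
from collections import defaultdict


def demultiplex(reads, barcodes, primers):
    """Group reads by (individual, locus).

    Hypothetical helper, not part of the published pipeline.
    Assumes each read is laid out as barcode + primer + insert and
    that barcodes and primers must match exactly.

    reads    -- iterable of read sequences (str)
    barcodes -- dict mapping barcode sequence -> individual ID
    primers  -- dict mapping primer sequence -> locus name
    """
    bins = defaultdict(list)
    unassigned = []
    for read in reads:
        key = None
        for barcode, individual in barcodes.items():
            if not read.startswith(barcode):
                continue
            rest = read[len(barcode):]
            for primer, locus in primers.items():
                if rest.startswith(primer):
                    key = (individual, locus)
                    # Keep only the insert, with barcode and primer trimmed.
                    bins[key].append(rest[len(primer):])
                    break
            if key is not None:
                break
        if key is None:
            # No barcode/primer pair matched; set the read aside.
            unassigned.append(read)
    return dict(bins), unassigned


# Toy usage with made-up barcodes, primers, and reads.
example_barcodes = {"ACGT": "ind01", "TGCA": "ind02"}
example_primers = {"GGA": "locusA", "CCT": "locusB"}
example_reads = ["ACGTGGATTTT", "TGCACCTAAAA", "NNNNGGATTTT"]
binned, leftover = demultiplex(example_reads, example_barcodes, example_primers)
```

Reads that match no barcode and primer pair are returned separately rather than dropped, so the share of unassignable reads can be checked before downstream variant calling and haplotype phasing.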