In resolving the vertebrate tree of life, two fundamental questions remain: 1) what is the phylogenetic position of turtles within amniotes, and 2) what are the relationships between the three major lissamphibian (extant amphibian) groups? These relationships have historically been difficult to resolve, with five different hypotheses proposed for turtle placement, and four proposed branching patterns within Lissamphibia. We compiled a large cDNA/EST dataset for vertebrates (75 genes for 129 taxa) to address these outstanding questions. Gene-specific phylogenetic analyses revealed a great deal of variation in preferred topology, resulting in topologically ambiguous conclusions from the combined dataset. Due to consistent preferences for the same divergent topologies across genes, we suspected systematic phylogenetic error as a cause of some variation. Accordingly, we developed and tested a novel statistical method that identifies sites that have a high probability of containing biased signal for a specific phylogenetic relationship. After removing putatively biased sites, support emerged for a sister relationship between turtles and either crocodilians or archosaurs, as well as for a caecilian-salamander sister relationship within Lissamphibia, with Lissamphibia potentially paraphyletic.
129 taxa
Zip file of the full datasets. It contains 8 files corresponding to 4 datasets: 1) NUCL (nucleotide) dataset and partition file (partitioned by codon position, n12 and n3); 2) N12 (excluding 3rd codon positions) dataset and partition file (by gene); 3) AA (amino acid) dataset and partition file (by gene); 4) DEGEN1 (nucleotide-based, accounting for codon degeneracy) dataset and partition file. Note: no taxa are excluded from these files, although rogue taxa were removed in the final analyses.
129tax.zip
16 taxa
Zip file of the reduced-taxon datasets (16 exemplar taxa). It contains 8 files corresponding to 4 datasets: 1) NUCL (nucleotide) dataset and partition file (partitioned by codon position, n12 and n3); 2) N12 (excluding 3rd codon positions) dataset and partition file (by gene); 3) AA (amino acid) dataset and partition file (by gene); 4) DEGEN1 (nucleotide-based, accounting for codon degeneracy) dataset and partition file.
16tax.zip
Individual Genes
75 individual gene datasets. All datasets start with the first codon position.
IndividualGenes.zip
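Because each gene alignment starts at the first codon position, codon positions can be recovered by simple index arithmetic. The following is a minimal sketch, not the authors' pipeline, of how an N12-style alignment (first and second positions) and the complementary n3 set (third positions) might be derived from a nucleotide alignment. The function names and the in-memory dictionary format are assumptions made for illustration; the file formats used in the archives are not specified here.

```python
from typing import Dict, Tuple


def split_codon_positions(seq: str) -> Tuple[str, str]:
    """Split an in-frame sequence into (n12, n3).

    Assumes the first character of `seq` is codon position 1,
    as stated for the individual gene datasets.
    """
    # Positions 1 and 2 of every codon: 0-based indices where i % 3 != 2.
    n12 = "".join(base for i, base in enumerate(seq) if i % 3 != 2)
    # Position 3 of every codon: every third character starting at index 2.
    n3 = seq[2::3]
    return n12, n3


def n12_alignment(alignment: Dict[str, str]) -> Dict[str, str]:
    """Drop third codon positions from every sequence (hypothetical helper)."""
    return {taxon: split_codon_positions(seq)[0] for taxon, seq in alignment.items()}


# Toy alignment with made-up taxon names and sequences (illustration only).
example = {"taxon_A": "ATGGCCTTA", "taxon_B": "ATGGCTTTG"}
reduced = n12_alignment(example)
```

Gap characters and ambiguity codes pass through unchanged, so the reduced sequences stay aligned as long as the input alignment is in frame.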
31 genes
Zip file of the reduced-gene datasets (corresponding to the turtle question). It contains 8 files corresponding to 4 datasets: 1) NUCL (nucleotide) dataset and partition file (partitioned by codon position, n12 and n3); 2) N12 (excluding 3rd codon positions) dataset and partition file (by gene); 3) AA (amino acid) dataset and partition file (by gene); 4) DEGEN1 (nucleotide-based, accounting for codon degeneracy) dataset and partition file. Note: taxon sampling has been reduced by removing rogue taxa.
31genes.zip
26 genes
Zip file of the reduced-gene datasets (corresponding to the Lissamphibia question). It contains 8 files corresponding to 4 datasets: 1) NUCL (nucleotide) dataset and partition file (partitioned by codon position, n12 and n3); 2) N12 (excluding 3rd codon positions) dataset and partition file (by gene); 3) AA (amino acid) dataset and partition file (by gene); 4) DEGEN1 (nucleotide-based, accounting for codon degeneracy) dataset and partition file. Note: taxon sampling has been reduced by removing rogue taxa.
26genes.zip
16 taxa, 31 genes
Zip file of the reduced-gene, 16-taxon datasets (corresponding to the turtle question). It contains 8 files corresponding to 4 datasets: 1) NUCL (nucleotide) dataset and partition file (partitioned by codon position, n12 and n3); 2) N12 (excluding 3rd codon positions) dataset and partition file (by gene); 3) AA (amino acid) dataset and partition file (by gene); 4) DEGEN1 (nucleotide-based, accounting for codon degeneracy) dataset and partition file.
16tax-31genes.zip
16 taxa, 26 genes
Zip file of the reduced-gene, 16-taxon datasets (corresponding to the Lissamphibia question). It contains 8 files corresponding to 4 datasets: 1) NUCL (nucleotide) dataset and partition file (partitioned by codon position, n12 and n3); 2) N12 (excluding 3rd codon positions) dataset and partition file (by gene); 3) AA (amino acid) dataset and partition file (by gene); 4) DEGEN1 (nucleotide-based, accounting for codon degeneracy) dataset and partition file.
16tax-26genes.zip
Simulated Data
The two simulated datasets used in the study.
SimulatedData.zip
Slow Genes
Dataset and partition file for the reduced-gene dataset, which excludes the 19 fastest-evolving genes. The partition is by gene.
SlowGenes.zip