Data from: Anchored hybrid enrichment for massively high-throughput phylogenomics
Data files
May 08, 2012 version files 301.05 MB
- Lemmon-etal2012_Alignments.zip
- Lemmon-etal2012_Contigs.zip
- Lemmon-etal2012_Phylogenetics_BEST_Amniote_treeDist.zip
- Lemmon-etal2012_Phylogenetics_BEST_Amniote.zip
- Lemmon-etal2012_Phylogenetics_BEST_Tetrapod_treeDist.zip
- Lemmon-etal2012_Phylogenetics_BEST_Tetrapod.zip
- Lemmon-etal2012_Phylogenetics_BUCKy.zip
- Lemmon-etal2012_Probes.zip
- Lemmon-etal2012_Scripts.zip
- Lemmon-etal2012_SupplementalMaterials.pdf
- Lemmon-etal2012_SupplementalTables.xls
- README_for_Lemmon-etal2012_Alignments.txt
- README_for_Lemmon-etal2012_Probes.txt
- README_for_Lemmon-etal2012_Scripts.txt
Abstract
The field of phylogenetics is on the cusp of a major revolution, enabled by new methods of data collection that leverage both genomic resources and recent advances in DNA sequencing. Previous phylogenetic work has required labor-intensive marker development coupled with single-locus PCR and DNA sequencing on a clade-by-clade and marker-by-marker basis. Here, we present a new, cost-efficient, and rapid approach to obtaining data from hundreds of genes for potentially hundreds of individuals for deep and shallow phylogenetic studies. Specifically, we designed probes for target enrichment of >500 loci in highly conserved anchor regions of vertebrate genomes (flanked by less conserved regions) from five model species and tested enrichment efficiency in non-model species up to 254 million years divergent from the nearest model. We found that hybrid enrichment using conserved probes (anchored enrichment) can recover a large number of unlinked loci that are useful at a diversity of phylogenetic timescales. This new approach has the potential not only to expedite resolution of deep-scale portions of the Tree of Life but also to greatly accelerate resolution of the large number of shallow clades that remain unresolved. The combination of low cost (~1% of the cost of traditional Sanger sequencing and ~3.5% of the cost of high-throughput amplicon sequencing for projects on the scale of 500 loci x 100 individuals) and rapid data collection (~2 weeks of laboratory time) is expected to make this approach tractable even for researchers working on systems with limited or non-existent genomic resources.
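The cost comparison in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only: it treats the approximate percentages quoted above (~1% and ~3.5%) as exact, and the constant and function names are hypothetical, not taken from the deposited scripts. It expresses each method's cost in units of the Sanger cost for a project of 500 loci x 100 individuals.

```python
# Figures quoted in the abstract; both percentages are approximate ("~") there
# and are treated as exact here purely for illustration.
LOCI = 500
INDIVIDUALS = 100
FRACTION_OF_SANGER = 0.01     # anchored enrichment cost / Sanger cost
FRACTION_OF_AMPLICON = 0.035  # anchored enrichment cost / amplicon cost


def relative_costs(sanger_cost: float = 1.0) -> dict[str, float]:
    """Express each method's project cost in units of the Sanger cost."""
    anchored = FRACTION_OF_SANGER * sanger_cost
    # If anchored = 3.5% of amplicon, then amplicon = anchored / 0.035.
    amplicon = anchored / FRACTION_OF_AMPLICON
    return {"sanger": sanger_cost, "amplicon": amplicon, "anchored": anchored}


# Number of locus-by-individual combinations at the quoted project scale.
samples = LOCI * INDIVIDUALS
costs = relative_costs()
```

Under these assumptions the project covers 50,000 locus-by-individual combinations, and the two quoted ratios together imply that amplicon sequencing would cost roughly 29% of the Sanger figure. That last number is derived here, not stated in the abstract.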