Data from: Enriching the ant tree of life: enhanced UCE bait set for genome-scale phylogenetics of ants and other Hymenoptera
Branstetter, Michael G., University of Utah, Smithsonian Institution
Longino, John T., University of Utah
Ward, Philip S., University of California, Davis
Faircloth, Brant C., Louisiana State University of Alexandria
Published Feb 14, 2017 on Dryad.
Cite this dataset
Branstetter, Michael G.; Longino, John T.; Ward, Philip S.; Faircloth, Brant C. (2017). Data from: Enriching the ant tree of life: enhanced UCE bait set for genome-scale phylogenetics of ants and other Hymenoptera [Dataset]. Dryad. https://doi.org/10.5061/dryad.89n87
1. Targeted enrichment of conserved genomic regions (e.g., ultraconserved elements or UCEs) has emerged as a promising tool for inferring evolutionary history in many organismal groups. Because the UCE approach is still relatively new, much remains to be learned about how best to identify UCE loci and design baits to enrich them.
2. We test an updated UCE identification and bait design workflow for the insect order Hymenoptera, with a particular focus on ants. The new strategy augments a previous bait design for Hymenoptera by (a) changing the parameters by which conserved genomic regions are identified and retained, and (b) increasing the number of genomes used for locus identification and bait design. We perform in vitro validation of the approach in ants by synthesizing an ant-specific bait set that targets UCE loci and a set of “legacy” phylogenetic markers. Using this bait set, we generate new data for 84 taxa (16/17 ant subfamilies) and extract loci from an additional 17 genome-enabled taxa. We then use these data to examine UCE capture success and phylogenetic performance across ants. We also test the workability of extracting legacy markers from enriched samples and combining the data with published data sets.
3. The updated bait design (hym-v2) contained a total of 2,590 targeted UCE loci for Hymenoptera, significantly increasing the number of loci relative to the original bait set (hym-v1; 1,510 loci). Across 38 genome-enabled Hymenoptera and 84 enriched samples, experiments demonstrated a high and unbiased capture success rate, with a mean of 2,214 loci enriched per sample. Phylogenomic analyses of ants produced a robust tree that included strong support for previously uncertain relationships. Complementing the UCE results, we successfully enriched legacy markers, combined the data with published Sanger data sets, and generated a comprehensive ant phylogeny containing 1,060 terminals.
4. Overall, the new UCE bait design strategy resulted in an enhanced bait set for genome-scale phylogenetics in ants and likely all of Hymenoptera. Our in vitro tests demonstrate the utility of the updated design workflow, providing evidence that this approach could be applied to any organismal group with available genomic information.
Raw Illumina reads (SRA PRJNA360290).
Raw Illumina reads for all enriched samples included in this study. Data available from the NCBI Sequence Read Archive using BioProject accession PRJNA360290.
Alignment supermatrices and data partitioning files
Alignment supermatrices and associated data partitioning files for all analyses performed in this study. Includes 101-sample and 1060-sample data sets. Includes PartitionFinder configuration file for hcluster analysis of UCE loci.
Tables and table captions - Excel
All tables and table captions from the main text and supporting information of this study. Provided in Excel format.
Appendix 2 from study.
Includes supplementary methods, table captions, figures, and references.
Hymenoptera v2 bait set files. Includes the principal and ant-specific UCE bait set files, and the ant-exon bait set files.
All phylogenetic trees from study in PDF and either Newick or Nexus formats. Includes 101-terminal and 1060-terminal trees.
R script to calculate locus stats.
R script used to calculate mean gene-tree bootstrap and mean GC-variance among taxa for a given set of gene trees and associated alignment files.
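For readers who want a sense of the two statistics named above before opening the R script, the following Python sketch shows one plausible way to compute them. It is not the deposited script. It assumes that bootstrap values are stored as internal node labels in Newick strings, that GC content is measured over unambiguous A/C/G/T characters only, and that "variance" means the sample variance across taxa. The function names (`mean_bootstrap`, `gc_content`, `gc_variance`) are hypothetical.

```python
import re
from statistics import mean, variance


def mean_bootstrap(newick):
    """Mean of the internal-node support values in one Newick gene tree.

    Assumption: support values appear as numeric labels directly after a
    closing parenthesis, e.g. "(A:0.1,B:0.1)90:0.05".
    """
    supports = [float(x) for x in re.findall(r"\)(\d+(?:\.\d+)?)", newick)]
    return mean(supports) if supports else None


def gc_content(seq):
    """GC fraction of one aligned sequence, ignoring gaps and ambiguity codes."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    return gc / acgt if acgt else None


def gc_variance(alignment):
    """Sample variance of GC content among the taxa in one alignment.

    `alignment` maps taxon name -> aligned sequence. Taxa with no A/C/G/T
    characters are skipped; fewer than two usable taxa returns None.
    """
    values = [v for v in (gc_content(s) for s in alignment.values()) if v is not None]
    return variance(values) if len(values) > 1 else None


if __name__ == "__main__":
    # Toy inputs, not taken from the data set.
    tree = "((A:0.1,B:0.1)90:0.05,(C:0.1,D:0.1)70:0.05);"
    aln = {"A": "GGCC--", "B": "AATT--"}
    print(mean_bootstrap(tree), gc_variance(aln))
```

Averaging `mean_bootstrap` over a set of gene trees, and pairing each tree with the `gc_variance` of its alignment, would give per-locus summaries of the kind the description refers to.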
UCE contigs - aligned, trimmed, and filtered (Ants101T-F90).
All UCE contigs (aligned and trimmed) for the Ants101T-F90 data set. Contigs aligned with MAFFT and trimmed with Gblocks. Alignments filtered to include only those alignments with at least 90% taxon occupancy.
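The taxon-occupancy filter described above can be illustrated with a short Python sketch. This is not the pipeline used in the study. It assumes each locus is a FASTA alignment, that a taxon counts as present when its sequence contains at least one character other than a gap or missing-data symbol, and that occupancy is the fraction of all taxa in the data set that are present. All function names are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {taxon_name: sequence} dict."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def taxon_occupancy(alignment, n_taxa):
    """Fraction of the n_taxa taxa that have real sequence data in this alignment."""
    missing = set("-?Nn")
    present = sum(
        1 for seq in alignment.values() if any(c not in missing for c in seq)
    )
    return present / n_taxa


def filter_by_occupancy(alignments, n_taxa, threshold=0.9):
    """Keep only loci whose taxon occupancy is at least `threshold`."""
    return {
        locus: aln
        for locus, aln in alignments.items()
        if taxon_occupancy(aln, n_taxa) >= threshold
    }


if __name__ == "__main__":
    # Toy example with 10 taxa: locus1 has 9 taxa with data, locus2 has 8.
    locus1 = {f"t{i}": "ACGT" for i in range(9)}
    locus2 = {f"t{i}": "ACGT" for i in range(8)}
    kept = filter_by_occupancy({"locus1": locus1, "locus2": locus2}, n_taxa=10)
    print(sorted(kept))
```

Under these assumptions, a 90% threshold applied to a 101-taxon data set would retain alignments containing at least 91 taxa, since 0.9 × 101 = 90.9.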