Data from: SuperFine: fast and accurate supertree estimation

Swenson, M. Shel; Suri, Rahul; Linder, C. Randal; Warnow, Tandy

Published May 16, 2011 on Dryad. https://doi.org/10.5061/dryad.879st

Data files

May 16, 2011 version files (1.85 MB)

README_for_SuperFineSource.txt (2.67 KB)
SuperFineSource.zip (1.85 MB)

Abstract

Many research groups are estimating trees containing anywhere from a few thousand to hundreds of thousands of species, towards the eventual goal of the estimation of a Tree of Life, containing perhaps as many as several million leaves. These phylogenetic estimations present enormous computational challenges, and current computational methods are likely to fail to run even on datasets in the low end of this range. One approach to estimate a large species tree is to use phylogenetic estimation methods (such as maximum likelihood) on a supermatrix produced by concatenating multiple sequence alignments for a collection of markers; however, the most accurate of these phylogenetic estimation methods are extremely computationally intensive for datasets with more than a few thousand sequences. Supertree methods, which assemble phylogenetic trees from a collection of trees on subsets of the taxa, are important tools for phylogeny estimation where phylogenetic analyses based upon maximum likelihood are infeasible. In this paper, we introduce SuperFine, a meta-method that utilizes a novel two-step procedure in order to improve the accuracy and scalability of supertree methods. Our study, using both simulated and empirical data, shows that SuperFine-boosted supertree methods produce more accurate trees than standard supertree methods, and run quickly on very large datasets with thousands of sequences. Furthermore, SuperFine-boosted MRP (Matrix Representation with Parsimony, the best-known supertree method) approaches the accuracy of maximum likelihood methods on supermatrix datasets under realistic conditions.
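For readers unfamiliar with MRP, the matrix encoding it starts from can be sketched in a few lines. This is an illustrative sketch of the standard rooted MRP encoding only, not code from SuperFineSource.zip and not the SuperFine procedure itself. The nested-tuple tree representation and the function names (leaf_set, internal_clades, mrp_matrix) are assumptions made for the example. Each non-root internal clade of each source tree becomes one column: taxa inside the clade are scored 1, other taxa in that tree are scored 0, and taxa missing from that tree are scored ?. A parsimony search on the resulting matrix (not shown) would then produce the supertree.

```python
def leaf_set(tree):
    """Return the set of leaf labels under a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return {tree}
    leaves = set()
    for child in tree:
        leaves |= leaf_set(child)
    return leaves


def internal_clades(tree, is_root=True):
    """Yield the leaf set of every non-root internal node, in preorder."""
    if isinstance(tree, str):
        return
    if not is_root:
        yield leaf_set(tree)
    for child in tree:
        yield from internal_clades(child, is_root=False)


def mrp_matrix(source_trees):
    """Build the MRP character matrix as a dict mapping taxon -> string of 0/1/?."""
    all_taxa = sorted(set().union(*(leaf_set(t) for t in source_trees)))
    rows = {taxon: [] for taxon in all_taxa}
    for tree in source_trees:
        present = leaf_set(tree)
        for clade in internal_clades(tree):
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon].append("?")  # taxon absent from this source tree
                elif taxon in clade:
                    rows[taxon].append("1")  # taxon inside the clade
                else:
                    rows[taxon].append("0")  # taxon in the tree but outside the clade
    return {taxon: "".join(chars) for taxon, chars in rows.items()}


if __name__ == "__main__":
    # Two hypothetical source trees on overlapping taxon sets.
    trees = [(("A", "B"), ("C", "D")), (("B", "C"), "E")]
    for taxon, row in mrp_matrix(trees).items():
        print(taxon, row)
    # A 10?
    # B 101
    # C 011
    # D 01?
    # E ??0
```

The two source trees conflict on where B belongs, which is visible in the matrix as columns 1 and 3 both scoring B as 1 while disagreeing on A and C.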

Usage notes

SuperFineSource
