Data from: A comparison of supermatrix and supertree methods for multilocus phylogenetics using organismal datasets
Janies, Daniel A.; Studer, Jonathon; Handelman, Samuel K.; Linchangco, Gregorio (2013), Data from: A comparison of supermatrix and supertree methods for multilocus phylogenetics using organismal datasets, Dryad, Dataset, https://doi.org/10.5061/dryad.2fs2b
It has been proposed that supertree approaches should be applied to large multilocus sequence datasets to achieve computational tractability. Large datasets, such as those derived from phylogenomics studies, can be broken into many locus-specific tree searches, and the resulting trees can be stitched together via a supertree method. Using simulated data, workers have reported that they can rapidly construct a supertree that is comparable to the results of heuristic tree search on the entire dataset. To test this assertion with organismal data, we compared tree length under the parsimony criterion and computational time for twenty multilocus datasets using supertree (SuperFine and SuperTriplets) and supermatrix (heuristic search in TNT) approaches. Tree lengths and computational times were compared among methods using the Wilcoxon matched-pairs signed rank test. Supermatrix searches produced significantly shorter trees than either supertree approach (SuperFine or SuperTriplets; p < 0.0002 in both cases). Moreover, the processing time of supermatrix search was significantly lower than that of SuperFine+locus-specific search (p < 0.01) but roughly equivalent to that of SuperTriplets+locus-specific search (p > 0.4, not significant). In conclusion, we show, using real rather than simulated data, that there is no basis, in either computational tractability or tree length, for using supertrees rather than heuristic tree search on a supermatrix for phylogenomics.
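The paired comparison described in the abstract can be illustrated with a minimal sketch of an exact two-sided Wilcoxon matched-pairs signed rank test. The function name `wilcoxon_signed_rank` and the tree lengths below are hypothetical and are not taken from the dataset; the source does not say which software the authors used for the test, and this sketch assumes there are no tied absolute differences.

```python
def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon matched-pairs signed rank test.

    Illustrative sketch only: zero differences are dropped, and tied
    absolute differences are rejected because the exact null
    distribution used below assumes the ranks are exactly 1..n.
    Returns (W, p) where W is the smaller of the two signed-rank sums.
    """
    if len(x) != len(y):
        raise ValueError("x and y must contain the same number of paired values")
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    abs_sorted = sorted(abs(d) for d in diffs)
    if any(u == v for u, v in zip(abs_sorted, abs_sorted[1:])):
        raise ValueError("tied absolute differences are not handled in this sketch")

    # Rank the absolute differences from 1 (smallest) to n (largest).
    rank_of = {value: i + 1 for i, value in enumerate(abs_sorted)}
    total = n * (n + 1) // 2
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = total - w_plus
    w = min(w_plus, w_minus)

    # Under the null hypothesis every rank is positive or negative with
    # probability 1/2, so counts[s] is the number of subsets of {1..n}
    # whose ranks sum to s (a subset-sum count by dynamic programming).
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in range(1, n + 1):
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]

    tail = sum(counts[: w + 1])
    p = min(1.0, 2 * tail / 2 ** n)
    return w, p


# Hypothetical parsimony tree lengths for five datasets (not real data).
supermatrix_lengths = [1012, 2240, 1875, 3310, 1498]
supertree_lengths = [1020, 2243, 1890, 3311, 1510]
w, p = wilcoxon_signed_rank(supermatrix_lengths, supertree_lengths)
print(w, p)  # W = 0 and p = 2 / 2**5 = 0.0625
```

With only five pairs, even a perfectly consistent difference cannot reach p < 0.05 in the two-sided exact test. With twenty pairs, as in the study, the smallest attainable two-sided exact p-value is 2/2^20, roughly 1.9e-6, so the reported thresholds are attainable at that sample size.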