Data from: Inferring complex phylogenies using parsimony: an empirical approach using three large DNA data sets for angiosperms
Soltis, Douglas E. et al. (2008), Data from: Inferring complex phylogenies using parsimony: an empirical approach using three large DNA data sets for angiosperms, Dryad, Dataset, https://doi.org/10.5061/dryad.64
To explore the feasibility of parsimony analysis for large data sets, we conducted heuristic parsimony searches and bootstrap analyses on separate and combined DNA data sets for 190 angiosperms and three outgroups. Separate data sets of 18S rDNA (1,855 bp), rbcL (1,428 bp), and atpB (1,450 bp) sequences were combined into a single matrix 4,733 bp in length. Analyses of the combined data set show great improvements in computer run times compared to those of the separate data sets and of the data sets combined in pairs. Six searches of the combined 18S rDNA + rbcL + atpB data set were conducted; in all cases TBR branch swapping was completed, generally within a few days. In contrast, TBR branch swapping was not completed for any of the three separate data sets, or for the pairwise combined data sets. These results illustrate that it is possible to conduct a thorough search of tree space with large data sets, given sufficient signal. In this case, and probably most others, sufficient signal for a large number of taxa can only be obtained by combining data sets. The combined data sets also have higher internal support for clades than the separate data sets, and more clades receive bootstrap support of at least 50% in the combined analysis than in analyses of the separate data sets. These data suggest that one solution to the computational and analytical dilemmas posed by large data sets is the addition of nucleotides, as well as taxa.
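As an illustration of how three aligned data sets can be combined into one matrix, the following is a minimal Python sketch, not the procedure used to build the deposited files. The function name `concatenate_alignments`, the taxon labels, and the all-`N` placeholder sequences are assumptions made for this example; only the partition lengths (1,855, 1,428, and 1,450 bp) come from the description above.

```python
def concatenate_alignments(partitions):
    """Join per-gene alignments end to end, taxon by taxon.

    `partitions` is a list of dicts mapping taxon name -> aligned sequence.
    Every partition must hold the same taxa, and all sequences within a
    partition must have the same (aligned) length.
    """
    taxa = set(partitions[0])
    for part in partitions:
        if set(part) != taxa:
            raise ValueError("all partitions must contain the same taxa")
        if len({len(seq) for seq in part.values()}) != 1:
            raise ValueError("sequences in a partition must have equal length")
    # Concatenate in partition order so column blocks stay contiguous.
    return {t: "".join(part[t] for part in partitions) for t in sorted(taxa)}


# Placeholder data: hypothetical taxon labels and all-N sequences,
# sized to match the partition lengths stated in the abstract.
taxa = ["taxon_A", "taxon_B"]
rdna_18s = {t: "N" * 1855 for t in taxa}
rbcl = {t: "N" * 1428 for t in taxa}
atpb = {t: "N" * 1450 for t in taxa}

combined = concatenate_alignments([rdna_18s, rbcl, atpb])
combined_length = len(combined["taxon_A"])  # 1855 + 1428 + 1450 = 4733
```

The checks on taxon sets and sequence lengths reflect the requirement that a combined matrix has one row per taxon and that every row spans all partitions.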