Rapid evolutionary radiations are expected to require large amounts of sequence data to resolve. To resolve these types of relationships, many systematists believe that it will be necessary to collect data by next-generation sequencing (NGS) and use multispecies coalescent (“species tree”) methods. Ultraconserved element (UCE) sequence capture is becoming a popular method to leverage the high throughput of NGS to address problems in vertebrate phylogenetics. Here we examine the performance of UCE data for gallopheasants (true pheasants and allies), a clade that underwent a rapid radiation 10–15 Ma. Relationships among gallopheasant genera have been difficult to establish. We used this rapid radiation to assess the performance of species tree methods, using ∼600 kilobases of DNA sequence data from ∼1500 UCEs. We also integrated information from traditional markers (nuclear intron data from 15 loci and three mitochondrial gene regions). Species tree methods exhibited troubling behavior. Two methods [Maximum Pseudolikelihood for Estimating Species Trees (MP-EST) and Accurate Species TRee ALgorithm (ASTRAL)] appeared to perform optimally when the set of input gene trees was limited to the most variable UCEs, though ASTRAL appeared to be more robust than MP-EST to input trees generated using less variable UCEs. In contrast, the rooted triplet consensus method implemented in Triplec performed better when the largest set of input gene trees was used. We also found that all three species tree methods exhibited a surprising degree of dependence on the program used to estimate input gene trees, suggesting that the details of likelihood calculations (e.g., numerical optimization) are important for loci with limited phylogenetic information.
As an alternative to summary species tree methods we explored the performance of SuperMatrix Rooted Triple - Maximum Likelihood (SMRT-ML), a concatenation method that is consistent even when gene trees exhibit topological differences due to the multispecies coalescent. We found that SMRT-ML performed well for UCE data. Our results suggest that UCE data have excellent prospects for the resolution of difficult evolutionary radiations, though specific attention may need to be given to the details of the methods used to estimate species trees.
Supplementary Figure S1
FIGURE S1. Example of a UCE alignment showing a short mismatch at one end. This portion of the alignment shows no variation with the exception of a stretch of 13 nucleotides with nine mismatches. Similar stretches are evident in 69 UCE alignments. The terminal gaps in three other taxa simply reflect the failure of those assemblies to extend all of the way to the end of the alignment.
Figure S1 Meiklejohn et al.pdf
Supplementary Figure S2
FIGURE S2. Number of variable (left) and parsimony informative (right) sites present in each UCE locus.
Figure S2 Meiklejohn et al.pdf
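For readers who want to recompute per-locus counts like those plotted in Figure S2, the following is a minimal sketch under the usual definitions: a site is variable if at least two states are observed, and parsimony informative if at least two states each occur in at least two taxa. It is not the authors' script. The function name `site_counts`, the toy alignment, and the choice to ignore gaps, `?`, `N`, and `X` are assumptions made for illustration; other IUPAC ambiguity codes would be counted as ordinary states here.

```python
from collections import Counter

# Characters treated as missing data and skipped when tallying states.
# This set is an assumption of the sketch, not a rule from the caption.
IGNORED = set("-?NX")


def site_counts(alignment):
    """Return (variable, parsimony_informative) site counts.

    `alignment` maps taxon name -> aligned sequence string; all
    sequences must have the same length.
    """
    seqs = [s.upper() for s in alignment.values()]
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("sequences differ in length")

    variable = 0
    informative = 0
    for column in zip(*seqs):
        states = Counter(c for c in column if c not in IGNORED)
        # Variable: at least two distinct states observed.
        if len(states) >= 2:
            variable += 1
        # Informative: at least two states, each seen in two or more taxa.
        if sum(1 for n in states.values() if n >= 2) >= 2:
            informative += 1
    return variable, informative


# Hypothetical four-taxon alignment used only to exercise the function.
toy = {
    "taxon_A": "ACGTAC",
    "taxon_B": "ACGTTC",
    "taxon_C": "ACATTC",
    "taxon_D": "ACATAG",
}
print(site_counts(toy))
```

In the toy alignment, columns 3 and 5 split the taxa two against two, while column 6 carries a single-taxon difference, so it is variable without being parsimony informative.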
Supplementary Figure S3
FIGURE S3. Performance of the maximum parsimony (MP) criterion for phylogenetic estimation with UCE data. The topology of the single most parsimonious tree of 50929 steps recovered when UCE data were analyzed (A) was identical to the ML tree for those data. Minimum branch lengths for UCEs (i.e., the number of unambiguous synapomorphies) are shown above branches and bootstrap support is presented below branches. An MP analysis of combined UCEs+introns+mitochondrial data also yielded a single tree (of 59964 steps) with an identical topology. Bootstrap support for both analyses is reported as a percentage of 500 replicates. Full (100%) support is indicated using an asterisk (*). MP and ML branch lengths for the UCE data were strongly correlated (B), suggesting that the MP criterion is likely to have reconstructed character state transformations accurately. The graph was limited to internal branches.
Figure S3 Meiklejohn et al.pdf
Supplementary Figure S4
FIGURE S4. Estimates of gene trees for individual UCEs are poorly resolved. Box plot showing the number of resolved branches after collapsing branches with <50% support in ML bootstrap analyses (using the indicated programs) or with posterior probabilities <0.5 (using MrBayes). Similar data are shown for the optimal tree identified using GARLI with very short branches collapsed to form a polytomy (‘GARLI-opt’). Numbers of parsimony informative sites per UCE are used to define the sets of UCEs considered (‘all’ indicates that all UCEs with at least one parsimony informative site were included).
Figure S4 Meiklejohn et al.pdf
Supplementary File S1
Microsoft Excel file including the unedited UCE locus lengths
Supplementary File S2
Microsoft Excel file including the model fit information for individual UCE loci (after editing)
Supplementary File S3
Microsoft Excel file including model fit information for traditional markers (introns and mitochondrial gene regions)
Supplementary File S4
PDF with the MRE (‘greedy’) consensus trees of UCE gene tree estimates. The UCE gene trees used to generate the MRE consensus contain polytomies, which reflect branches with <50% bootstrap support (RAxML, GARLI, and PhyML), posterior probability <0.5 (MrBayes), or short branches (GARLI optimal trees).
Supplementary File S5
PDF with MP-EST bootstrap trees for UCEs
Supplementary File S6
PDF with ASTRAL bootstrap trees for UCEs
Supplementary File S7
PDF with Triplec (rooted triple consensus) trees for UCEs. The UCE gene trees used to generate the Triplec trees contain polytomies; they are the same trees used to generate the MRE consensus trees in File S4.
Supplementary File S8
PDF with SMRT-ML trees for the UCE data
Supplementary File S9
PDF with the sources of images used in Figure 6 of Meiklejohn et al., "Analysis of a rapid evolutionary radiation using ultraconserved elements (UCEs): Evidence for a bias in some multi-species coalescent methods"
Meiklejohn et al. UCE alignments, unedited
Gzipped tar file containing 1479 individual NEXUS files, each with a UCE alignment
Meiklejohn-et-al-UCE-data-unedited.tar.gz
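One way to check that the archive unpacks to the expected number of alignments is to list its members with Python's standard `tarfile` module. This is a minimal sketch: the helper name `list_alignment_members` is hypothetical, and the `.nex` and `.nexus` suffixes are assumptions, since the file extensions used inside the archive are not stated here.

```python
import tarfile


def list_alignment_members(archive_path, suffixes=(".nex", ".nexus")):
    """Return the sorted names of regular files in a gzipped tar archive
    whose names end with one of `suffixes` (case-insensitive)."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.lower().endswith(suffixes)
        )
```

Calling `len(list_alignment_members("Meiklejohn-et-al-UCE-data-unedited.tar.gz"))` on a downloaded copy should give a count that can be compared against the 1479 alignments described above, provided the suffix assumption holds.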
Meiklejohn combined evidence data file
Gzipped NEXUS file with all data (after the editing described in the text). All of the individual UCEs, nuclear introns, and mitochondrial gene regions are indicated using CHARSETS. Partitioning schemes identified using PartitionFinder are also included.
Meiklejohn-et-al-combined-evidence.nex.gz
SMRT-raxMRP Perl script
Perl script for the SMRT-ML analysis
SMRT-raxMRP.pl