Phylogenetic relationships in recent, rapid radiations can be difficult to resolve due to incomplete lineage sorting and reliance on genetic markers that evolve slowly relative to the rate of speciation. By incorporating hundreds to thousands of unlinked loci, phylogenomic analyses have the potential to mitigate these difficulties. Here, we attempt to resolve phylogenetic relationships among eight shrew species (genus Crocidura) from the Philippines, a phylogenetic problem that has proven intractable with small (< 10 loci) data sets. We sequenced hundreds of ultraconserved elements and whole mitochondrial genomes in these species and estimated phylogenies using concatenation, summary coalescent, and hierarchical coalescent methods. The concatenated approach recovered a maximally supported and fully resolved tree. In contrast, the coalescent-based approaches produced similar topologies, but each had several poorly supported nodes. Using simulations, we demonstrate that the concatenated tree could be positively misleading. Our simulations also show that the tree shape we tend to infer, which involves a series of short internal branches, is difficult to resolve, even if substitution models are known and multiple individuals per species are sampled. As such, the low support we obtained for backbone relationships in our coalescent-based inferences reflects a real and appropriate lack of certainty. Our results illuminate the challenges of estimating a bifurcating tree in a rapid and recent radiation, providing a rare empirical example of a nearly simultaneous series of speciation events in a terrestrial animal lineage as it spreads across an oceanic archipelago.
Online Appendix 1 - SNP Analysis Pipeline
Text file containing commands and software used to obtain SNPs from our UCE reads.
Online Appendix 2 - Locus Info
Excel file containing information on each UCE locus used in this study. This information includes: locus length, number of informative sites, number of SNPs, and best fitting model. There are two sheets: one listing all of the loci included in our final dataset (1,112 loci) and another listing only the loci that had phylogenetically informative sites (919 loci).
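As a rough illustration of how a per-locus count of informative sites might be obtained, the sketch below counts parsimony-informative columns in an alignment. It assumes the common definition (a site is informative when at least two different bases each occur in at least two sequences) and ignores gaps and ambiguity codes; the description above does not state which definition or software was used, and the function name is hypothetical.

```python
from collections import Counter

def count_informative_sites(seqs):
    """Count parsimony-informative columns in a list of aligned sequences.

    Assumption: a column is informative when at least two distinct bases
    (A, C, G, T) each appear in at least two sequences. Gaps, N, and other
    ambiguity codes are ignored.
    """
    valid = set("ACGT")
    informative = 0
    for column in zip(*seqs):
        counts = Counter(b for b in (c.upper() for c in column) if b in valid)
        # Number of bases that occur in two or more sequences at this site.
        shared = sum(1 for k in counts.values() if k >= 2)
        if shared >= 2:
            informative += 1
    return informative

# Toy alignment (hypothetical data): columns 2 and 4 are informative.
toy = ["ACGT", "ACGA", "AAGA", "AAGT"]
n_informative = count_informative_sites(toy)
```

Under this definition, a locus would belong on the second sheet whenever the count is greater than zero.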
Online Appendix 3 - BPP Results
Theta and tau estimates, with their ESS values, from eight BP&P runs, one for each of eight subsets of the 919-locus dataset. These values were used to construct the species tree that served as the "true" tree from which we simulated character data.
Figure S1 Smilogram
Figure S1. Scatter plot of the frequency of variant bases relative to the position of those sites in their alignments.
Figure S2 Concatenated Full MrBayes
Figure S2. Non-condensed trees from concatenated analyses. Condensed trees are shown in Fig. 3 in the main text. Numbers at nodes indicate Bayesian posterior probabilities.
Figure S3 Empirical StarBEAST
Figure S3. All 19 empirical *BEAST subsets. Subsets 1–18 are based on random subsets of 50 loci each; subset 19 includes the remaining 19 loci not included in the other subsets. Numbers at nodes denote Bayesian PPs. Black circles at nodes indicate PPs ≥ 0.95.
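The subset scheme in this caption (919 loci split into 18 random groups of 50 plus one group of the 19 left over) can be sketched as below. This is a minimal illustration, not the authors' script: the locus names, the seed, and the function name are all assumptions.

```python
import random

def partition_loci(loci, subset_size=50, seed=0):
    """Shuffle loci and split them into non-overlapping chunks.

    Every chunk has subset_size loci except possibly the last,
    which holds whatever remains.
    """
    shuffled = list(loci)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return [shuffled[i:i + subset_size]
            for i in range(0, len(shuffled), subset_size)]

# Hypothetical locus names standing in for the 919 informative UCE loci.
loci = [f"uce-{i}" for i in range(1, 920)]
subsets = partition_loci(loci)
```

With 919 loci and a subset size of 50, this yields 19 subsets, the last containing 19 loci, which matches the counts given in the caption.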
Figure S4 Sim-Matching StarBEAST
Figure S4. All 10 Sim-Matching *BEAST subsets. Each species tree is based on analysis of a different subset of 50 simulated loci. Numbers at nodes denote Bayesian PPs.
Figure S5 Sim-Multi-Individual StarBEAST
Figure S5. All 10 Sim-Multi-Individual *BEAST subsets. Each species tree is based on analysis of a different subset of 50 simulated loci. Numbers at nodes denote Bayesian PPs.
Figure S6 Sim-3x-Rate StarBEAST
Figure S6. All 10 Sim-3x-Rate *BEAST subsets. Each species tree is based on analysis of a different subset of 50 simulated loci. Numbers at nodes denote Bayesian PPs.
1112 Empirical UCE Nexus Files
All 1,112 UCE nexus alignments.
new-incomplete-50percent-all-species-padded.zip
500 Sim-Matching simulated alignments
Fasta files for Sim-Matching simulation
500-sim-matching-fasta-files.zip
500 Sim-Multi-Individual simulated alignments
Fasta files for Sim-Multi-Individual simulation
500-sim-multi-individual-fasta-files.zip
500 Sim-3x-Rate simulated alignments
Fasta files for Sim-3x-Rate simulation
500-sim-3x-rate-fasta-files.zip
SNP Data
Nexus file containing data for 1,170 SNPs genotyped for all 19 individuals.
SNP-data.nex
All 919 Empirical Gene Trees
All 919 PhyML gene trees inferred using CloudForest. After collapsing short branches into polytomies, these trees were used as input to MulRF and ASTRAL to infer species tree topologies.
genetrees.tre
All Empirical Bootstrap Gene Trees
100 bootstrap replicates for each of the 919 loci, generated in CloudForest. Each of the 100 bootstrap sets was used to infer a species tree. These pseudoreplicates were used to add nodal support values to the empirical MulRF and ASTRAL trees.
100-bootreps.tree.zip
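One simple way to turn a set of bootstrap species trees into nodal support values is to report, for each clade in the reference tree, the fraction of replicate trees that contain it. The sketch below assumes each tree has already been reduced to a set of clades (each clade a frozenset of taxon labels) and treats trees as rooted for simplicity; the taxon labels and function name are hypothetical, and this is not necessarily how MulRF or ASTRAL support was computed here.

```python
def clade_support(reference_clades, replicate_trees):
    """Return {clade: fraction of replicate trees containing that clade}.

    reference_clades: iterable of frozensets of taxon labels.
    replicate_trees: list of sets of frozensets, one set per replicate tree.
    """
    if not replicate_trees:
        raise ValueError("at least one replicate tree is required")
    n = len(replicate_trees)
    return {clade: sum(clade in tree for tree in replicate_trees) / n
            for clade in reference_clades}

# Toy example with hypothetical taxa A-D and four replicate trees.
ref = {frozenset("AB"), frozenset("ABC")}
reps = [
    {frozenset("AB"), frozenset("ABC")},
    {frozenset("AB"), frozenset("ABD")},
    {frozenset("AC"), frozenset("ACD")},
    {frozenset("AB"), frozenset("ABC")},
]
support = clade_support(ref, reps)
```

With 100 replicate species trees, multiplying each fraction by 100 gives the familiar bootstrap percentage for that node.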
Whole Mitochondrial Genomes and MrBayes Block
Nexus file containing sequences for 19 whole mitochondrial genomes and a MrBayes block with MrBayes settings.
MrBayes-WMG-Input.nex