Data from: Phylogenies and diversification rates: variance cannot be ignored
Cite this dataset
Rabosky, Daniel L (2018). Data from: Phylogenies and diversification rates: variance cannot be ignored [Dataset]. Dryad. https://doi.org/10.5061/dryad.61jn743
A recent pair of articles published in the journal Evolution presented a test for assessing the validity of hierarchical macroevolutionary models. The premise of the test is to compare numerical point estimates of parameters from two levels of analysis; if the estimates differ, the hierarchical model is purportedly flawed. The articles in question (Meyer and Wiens 2017; Meyer et al. 2018) apply the proposed test to BAMM, a software program that uses a Bayesian mixture model to estimate rates of evolution from phylogenetic trees. The authors use BAMM to estimate rates from large phylogenies (n > 60 tips) and then apply the method separately to subclades within those phylogenies (median size: n = 3 tips). They find that point estimates of rates differ between these levels and conclude that the method is flawed, but they do not test whether the observed differences are statistically meaningful, and they do not consider sampling variation or its impact at any level of their analysis. Here, I show that the numerical differences across groups that they report are fully explained by a failure to account for sampling variation in their point estimates. Variance in evolutionary rate estimates, from BAMM and all other methods, is an inverse function of clade size; this variance is extreme for clades with 5 or fewer tips (e.g., 70% of clades in the focal study). The articles in question rely on negative results that are easily explained by low statistical power to reject their preferred null hypothesis, and this low power is a trivial consequence of high variance in their point estimates. I describe additional mathematical and statistical mistakes that render the proposed testing framework invalid on first principles. Evolutionary rates are no different from any other population parameters we might wish to estimate, and biologists should use the training and tools already at their disposal to avoid erroneous results that follow from the neglect of variance.
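The dependence of estimator variance on clade size can be illustrated with a minimal simulation sketch. It is not the analysis from the dataset and does not use BAMM. It assumes a constant-rate pure-birth (Yule) process started from two lineages, observed up to the instant before the next speciation would occur, and it estimates the rate as speciation events per unit of total branch length (a Kendall-Moran-style estimator). The function names, the true rate of 1.0, the number of replicates, and the clade sizes are all illustrative assumptions.

```python
import numpy as np


def simulate_rate_estimates(n_tips, true_rate=1.0, n_reps=5000, seed=0):
    """Simulate pure-birth trees with n_tips tips and estimate the rate of each.

    Assumption: each tree starts with 2 lineages and is observed up to the
    instant before the (n_tips + 1)-th lineage would arise. Requires n_tips >= 3.
    """
    rng = np.random.default_rng(seed)
    # Number of lineages alive during each successive interval: 2, 3, ..., n_tips.
    k = np.arange(2, n_tips + 1)
    # With k lineages, the waiting time to the next speciation is
    # exponential with rate k * true_rate (scale = 1 / rate).
    waits = rng.exponential(scale=1.0 / (k * true_rate), size=(n_reps, k.size))
    # Total branch length: each interval contributes (lineages * duration).
    total_branch_length = (waits * k).sum(axis=1)
    # Going from 2 to n_tips lineages takes n_tips - 2 speciation events.
    return (n_tips - 2) / total_branch_length


def interval_width(estimates, level=95.0):
    """Width of the central `level` percent interval of the estimates."""
    tail = (100.0 - level) / 2.0
    low, high = np.percentile(estimates, [tail, 100.0 - tail])
    return high - low


# Compare the spread of estimates for small and large clades (true rate = 1.0).
for n_tips in (3, 5, 20, 60):
    est = simulate_rate_estimates(n_tips)
    print(f"n = {n_tips:3d} tips: median estimate = {np.median(est):.2f}, "
          f"95% interval width = {interval_width(est):.2f}")
```

Under these assumptions, the spread of estimates should be much wider for clades of 3 to 5 tips than for clades of 60 tips, even though every tree is generated with the same true rate. The sketch reports interval widths rather than sample variances because the sample variance of this estimator is itself unstable for very small clades.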