Branching patterns in phylogenies cannot distinguish diversity-dependent diversification from time-dependent diversification

Pannetier, Theo; Martinez, César; Bunnefeld, Lynsey; Etienne, Rampal S.

Published Oct 20, 2020 on Dryad. https://doi.org/10.5061/dryad.1jwstqjsx

Abstract

One of the primary goals of macroevolutionary biology has been to explain general trends in long-term diversity patterns, including whether such patterns correspond to an up-scaling of processes occurring at lower scales. Reconstructed phylogenies often show decelerated lineage accumulation over time. This pattern has often been interpreted as the result of diversity-dependent diversification, where the accumulation of species causes diversification to decrease through niche filling. However, other processes can also produce such a slowdown, including time-dependence without diversity-dependence. To test whether phylogenetic branching patterns can be used to distinguish these two mechanisms, we formulated a time-dependent, but diversity-independent model that matches the expected diversity through time of a diversity-dependent model. We simulated phylogenies under each model and studied how well likelihood methods could recover the true diversification mode. Standard model selection criteria always recovered diversity-dependence, even when it was not present. We correct for this bias by using a bootstrap method and find that neither model is decisively supported. This implies that the branching pattern of reconstructed trees contains insufficient information to detect the presence or absence of diversity-dependence. We advocate that tests encompassing additional data, e.g., traits or range distributions, are needed to evaluate how diversity drives macroevolutionary trends.

For more details please refer to the Methods section in the main text.

Simulation procedure

We simulated sets of 1,000 diversity-dependent (DD) and time-dependent (TD) phylogenetic trees using the functions dd_sim and td_sim, respectively, from the R package DDD 4.3 (Etienne et al. 2012), available from CRAN.

The two models share the same set of 3 parameters:

- λ₀ (initial speciation rate)

- μ₀ (constant extinction rate)

- K (carrying capacity)

We set λ₀=0.8 and K=40, and considered four levels of extinction (μ₀=0.1, 0.2, 0.3 or 0.4), and four different crown ages, or simulation times: 5, 10, 15, and 60 myr.

We then added a further set with K = 80 and a crown age of 15 myr.
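To make the roles of the three parameters concrete, the sketch below evaluates a linear diversity-dependent speciation rate at the values used here. It assumes the usual linear form λ(N) = max(0, λ₀ - (λ₀ - μ₀) N / K), which is not spelled out on this page; the function name dd_speciation_rate is hypothetical and is not part of DDD.

```python
def dd_speciation_rate(n, lambda0, mu0, K):
    """Speciation rate at diversity n under an ASSUMED linear DD form.

    The rate starts at lambda0 when n = 0, equals mu0 when n = K
    (so net diversification is zero at the carrying capacity), and is
    truncated at zero for larger n.
    """
    return max(0.0, lambda0 - (lambda0 - mu0) * n / K)


lambda0, K = 0.8, 40  # values used for the main simulation sets
for mu0 in (0.1, 0.2, 0.3, 0.4):
    rate_at_K = dd_speciation_rate(K, lambda0, mu0, K)
    # Diversity at which the assumed speciation rate reaches zero.
    zero_rate_diversity = lambda0 * K / (lambda0 - mu0)
    print(f"mu0={mu0}: lambda(K)={rate_at_K:.2f}, "
          f"speciation stops at N={zero_rate_diversity:.1f}")
```

Under this assumed form, a higher extinction rate makes the speciation rate decline more slowly with diversity, so the diversity at which speciation stops moves further above K.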

Model selection

We fitted the DD and TD models on each DD or TD simulated tree to study whether phylogenetic trees generated by either model are indeed best fit by the model that generated them, or whether both models fit the data.

We used a maximum likelihood method to obtain the log-likelihood ratio (LLR) of DD versus TD for each tree.

The computation of both likelihoods (DD and TD) is implemented in the functions dd_loglik and bd_loglik, respectively, of the R package DDD 4.3.

We used the optimization routines implemented in the functions dd_ML (DD) and bd_ML (TD) of the same package.
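As a small illustration of how the two maximized log-likelihoods combine, the sketch below computes an LLR and the corresponding AIC difference. It assumes the sign convention LLR = log L(DD) - log L(TD); the log-likelihood values are hypothetical placeholders, not results from this dataset, and the helper names are not part of DDD.

```python
N_PARAMS = 3  # both models share lambda0, mu0 and K


def log_likelihood_ratio(loglik_dd, loglik_td):
    """LLR of DD versus TD (ASSUMED convention: positive values favour DD)."""
    return loglik_dd - loglik_td


def aic(loglik, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2 * n_params - 2 * loglik


# Hypothetical maximized log-likelihoods for one tree (placeholders only).
loglik_dd, loglik_td = -101.2, -102.0

llr = log_likelihood_ratio(loglik_dd, loglik_td)
delta_aic = aic(loglik_td, N_PARAMS) - aic(loglik_dd, N_PARAMS)
print(f"LLR = {llr:.2f}, AIC(TD) - AIC(DD) = {delta_aic:.2f}")
```

Because both models have the same number of parameters, the penalty terms cancel and the AIC difference is simply twice the LLR, so an AIC comparison selects whichever model has the higher likelihood.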

Initial parameter values were set to the true values to ensure relatively fast convergence of the likelihood optimization. Convergence, however, sometimes proved difficult, for example for large trees (i.e. more than a hundred tips), because the computation of the TD likelihood became challenging for trees of this size, and because of the presence of local optima in the likelihood landscape. In these cases, we initialized the optimization with a different value of K (the most influential parameter for the likelihood). First, TD trees were often larger than the carrying capacity would allow in DD. In instances where the number of tips N exceeded K', the likelihood of either model becomes 0, and we instead set the initial value of K to N' = N (λ₀ - μ₀) / λ₀. Second, to avoid local optima, we started the optimization at K = N, which we had observed to often be close to the maximum likelihood estimate for other trees.
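The choice of starting values for K can be sketched as follows. K' is not defined on this page; the sketch assumes K' = λ₀ K / (λ₀ - μ₀), the diversity at which a linear diversity-dependent speciation rate reaches zero, which makes N' the value of K for which K' equals N. The function candidate_initial_K is hypothetical and only orders the starting values; it does not call DDD.

```python
def candidate_initial_K(n_tips, K_true, lambda0, mu0):
    """Return starting values of K to try, in order.

    ASSUMPTION: K' = lambda0 * K / (lambda0 - mu0).
    First candidate: the true K, or N' = N (lambda0 - mu0) / lambda0
    when the tree has more tips than K' allows.
    Second candidate: K = N, used when the first start ends in a
    local optimum.
    """
    k_prime = lambda0 * K_true / (lambda0 - mu0)
    if n_tips > k_prime:
        first = n_tips * (lambda0 - mu0) / lambda0
    else:
        first = float(K_true)
    return [first, float(n_tips)]


# A tree with 60 tips simulated with K = 40, lambda0 = 0.8, mu0 = 0.1:
# K' is about 45.7, so the first start falls back to N'.
print(candidate_initial_K(60, 40, 0.8, 0.1))
# A tree with 30 tips stays below K', so the true K is tried first.
print(candidate_initial_K(30, 40, 0.8, 0.1))
```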

Empirical phylogenies

We applied the simulation-optimization procedure described above to a set of empirical, reconstructed phylogenetic trees.

We took the set of tetrapod family-level phylogenies compiled from the published literature by Condamine et al. (2019) and selected five groups for which the linear diversity-dependent model with constant extinction (that is, the DD model we used for simulations) fitted best out of 26 birth-death models. The five groups comprised three bird families (Parulidae, Bucerotidae and Indicatoridae) and two mammal families (Canidae and Pseudocheiridae). The bird phylogenies were assembled by Condamine et al. from the bird phylogeny published by Jetz et al. (2012), and the mammal phylogenies were pruned from the mammalian tree of Rolland et al. (2014), itself built from the tree of Bininda-Emonds et al. (2007).

For each group, we extracted the estimated parameter values for the DD model reported in Condamine et al. (2019) and used these as a starting point for fitting the TD model to each phylogeny. We then obtained the LLR distribution for each model by simulating 1,000 DD and 1,000 TD trees from the corresponding parameter estimates and fitting both models to each simulated tree. We computed the decision thresholds as described in section 2.4 of the main text and compared them with the LLR obtained for the original phylogeny to decide whether DD, TD, or neither could be selected.
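The comparison of an observed LLR with simulated LLR distributions can be sketched as below. The actual threshold rule is the one given in the main text and is not reproduced on this page; this sketch assumes one-sided 5% thresholds (TD is rejected when the observed LLR exceeds the 95th percentile of LLRs from TD-simulated trees, DD is rejected when it falls below the 5th percentile of LLRs from DD-simulated trees). The function names and the short LLR lists are hypothetical placeholders, not results.

```python
import math


def empirical_quantile(values, q):
    """Nearest-rank empirical quantile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered)) - 1
    return ordered[min(len(ordered) - 1, max(0, rank))]


def select_model(llr_observed, llr_dd_sims, llr_td_sims, alpha=0.05):
    """Return 'DD', 'TD' or 'neither' under the ASSUMED threshold rule."""
    td_upper = empirical_quantile(llr_td_sims, 1 - alpha)
    dd_lower = empirical_quantile(llr_dd_sims, alpha)
    reject_td = llr_observed > td_upper  # too DD-like to come from TD
    reject_dd = llr_observed < dd_lower  # too TD-like to come from DD
    if reject_td and not reject_dd:
        return "DD"
    if reject_dd and not reject_td:
        return "TD"
    return "neither"


# Placeholder LLRs standing in for the 1,000 simulated trees per model.
llr_dd_sims = [0.2, 0.5, 0.9, 1.1, 1.4, 1.8, 2.0, 2.3, 2.9, 3.5]
llr_td_sims = [-0.4, -0.1, 0.1, 0.3, 0.6, 0.8, 1.0, 1.2, 1.5, 1.9]

for observed in (2.5, 1.0, -0.3):
    print(observed, select_model(observed, llr_dd_sims, llr_td_sims))
```

When the two simulated distributions overlap strongly, most observed values fall between the two thresholds and the rule returns "neither", which is the situation the abstract describes.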
