Data from: Skyline fossilized birth-death model is robust to violations of sampling assumptions in total-evidence dating

Cite this dataset

Zhang, Chi; Ronquist, Fredrik; Stadler, Tanja (2023). Data from: Skyline fossilized birth-death model is robust to violations of sampling assumptions in total-evidence dating [Dataset]. Dryad.


Several total-evidence dating studies under the fossilized birth-death (FBD) model have produced very old age estimates, which are not supported by the fossil record. This phenomenon has been termed "deep root attraction" (DRA). For two specific datasets, involving divergence time estimation for the early radiations of ants, bees and wasps (Hymenoptera) and of placental mammals (Eutheria), it has been shown that the DRA effect can be greatly reduced by accommodating the fact that extant species in these trees have been sampled to maximize diversity, so-called diversified sampling. Unfortunately, current methods to accommodate diversified sampling only consider the extreme case where it is possible to identify a cut-off time such that all splits occurring before this time are represented in the sampled tree but none of the younger splits. In reality, the sampling bias is rarely this extreme, and may be difficult to model properly. Similar modeling challenges apply to the sampling of the fossil record. This raises the question of whether it is possible to find dating methods that are more robust to sampling biases. Here, we show that the skyline FBD (SFBD) process, where the diversification and fossil-sampling rates can vary over time in a piecewise fashion, provides age estimates that are more robust to inadequacies in the modeling of the sampling process and less sensitive to DRA effects. In the SFBD model we consider, rates in different time intervals are either considered to be independent and identically distributed, or assumed to be autocorrelated following an Ornstein-Uhlenbeck (OU) process. Through simulations and reanalyses of the Hymenoptera and Eutheria data, we show that both variants of the SFBD model unify age estimates under random and diversified sampling assumptions. The SFBD model can resolve DRA by absorbing the deviations from the sampling assumptions into the inferred dynamics of the diversification process over time. Although this means that the inferred diversification dynamics must be interpreted with caution, taking sampling biases into account, we conclude that the SFBD model represents the most robust approach currently available for addressing DRA in total-evidence dating.
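
To illustrate the two rate priors mentioned in the abstract, the following is a minimal Python sketch of piecewise-constant rates that are drawn either independently or from an autocorrelated OU process. It is not the authors' implementation. The function names, the choice to model rates on the log scale, the use of the exact stationary OU transition, and all parameter values are assumptions made for this example only.

```python
import bisect
import math
import random


def iid_log_rates(n, mean, sd, rng):
    """Draw n log-rates independently from Normal(mean, sd) (assumed prior)."""
    return [rng.gauss(mean, sd) for _ in range(n)]


def ou_log_rates(n, mean, sd, theta, dt, rng):
    """Draw n log-rates from a stationary OU process sampled every dt time units.

    sd is the stationary standard deviation and theta the mean-reversion
    strength; both are illustrative parameters, not values from the study.
    """
    decay = math.exp(-theta * dt)
    # Standard deviation of the exact OU transition over one step of length dt.
    step_sd = sd * math.sqrt(1.0 - decay * decay)
    x = rng.gauss(mean, sd)  # start from the stationary distribution
    out = [x]
    for _ in range(n - 1):
        x = mean + (x - mean) * decay + rng.gauss(0.0, step_sd)
        out.append(x)
    return out


def piecewise_rate(age, boundaries, rates):
    """Return the rate in force at a given age.

    boundaries holds the interior interval boundaries in increasing age,
    so len(rates) must equal len(boundaries) + 1.
    """
    return rates[bisect.bisect_right(boundaries, age)]


if __name__ == "__main__":
    rng = random.Random(1)
    boundaries = [50.0, 100.0, 150.0]  # hypothetical interval boundaries
    log_rates = ou_log_rates(4, math.log(0.1), 0.5, 0.05, 50.0, rng)
    rates = [math.exp(x) for x in log_rates]
    print(piecewise_rate(75.0, boundaries, rates))
```

In this sketch a large `theta * dt` makes neighbouring intervals nearly independent, so the OU draw approaches the independent case, whereas a small `theta * dt` keeps adjacent rates close to each other.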

Usage notes

sfbd_ou_sup.docx: supplementary tables and figures
simulations: results under different simulation settings
hymenoptera: XML and tree files from the analyses of the Hymenoptera data
eutheria: XML and tree files from the analyses of the placental mammal (Eutheria) data