Computing tree size under dynamical models of diversification
Data files
Dec 18, 2024 version, files total 8.85 MB
- README.md (1.25 KB)
- SupMatA_DensityDependence.nb (3.54 MB)
- SupMatA_DensityDependence.pdf (2.26 MB)
- SupMatB_Congruence.nb (2.68 MB)
- SupMatB_Congruence.pdf (371.43 KB)
Abstract
A phylogenetic tree has three types of attributes: size, shape (topology), and branch lengths. Phylodynamic studies are often motivated by questions regarding the size of clades; nevertheless, nearly all inference methods make use of only the other two attributes. In this paper, we ask whether there is additional information to be gained by considering tree size more explicitly in phylodynamic inference methods. To address this question, we first needed to be able to compute the expected tree size distribution under a specified phylodynamic model. Perhaps surprisingly, there is no general method for doing so: the distribution is known under a Yule or constant-rate birth-death model but not for the more complicated scenarios researchers are often interested in. We present three different solutions to this problem: using i) the deterministic limit; ii) master equations; and iii) an ensemble moment approximation. Using simulations, we evaluate the accuracy of these three approaches under a variety of scenarios and alternative measures of tree size (i.e., sampling through time or only at the present; sampling ancestors or not). We then use the most accurate approach for each situation to investigate the added informational content of tree size. We find that for two critical phylodynamic questions, i) is diversification diversity dependent? and ii) can we distinguish between alternative diversification scenarios?, knowing the expected tree size distribution under the specified scenario provides insights that could not be gleaned from considering the expected shape and branch lengths alone. The contribution of this paper is both a novel set of methods for computing tree size distributions and a path forward for richer phylodynamic inference into the evolutionary and epidemiological processes that shape lineage trees.
README: Data from: The Untapped Potential of Tree Size in Reconstructing Evolutionary and Epidemiological Dynamics
https://doi.org/10.5061/dryad.fn2z34v3w
Mathematica notebooks for calculating the distribution of tree size across diversification models (SupMatA_DensityDependence) and for comparing the distribution of tree sizes across congruent model scenarios (SupMatB_Congruence). Notebook files (.nb) were written in Wolfram Mathematica version 14.1.0.0. A PDF printout of each notebook is also included for users without the proprietary software.
In SupMatA we first provide code for simulating phylogenetic trees under a time-dependent and density-dependent diversification model. For each specific diversification model (Exponential, Logistic, and SIR) we then calculate the distribution of tree sizes using various approaches (e.g., deterministic approximation, master equations, or ensemble moment approximation) and compare the approximate distribution to simulated distributions.
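As a rough illustration of the master-equation approach, the following Python sketch integrates the master equation for the simplest case: a constant-rate birth-death process started from one lineage, with tree size taken to be the number of lineages alive at time t. It is not taken from the notebooks, which are written in Mathematica. The function names, the truncation of the state space at `n_max`, and the fixed-step Runge-Kutta integrator are all assumptions made for this example. With a death rate of zero (the Yule case) the result can be checked against the known geometric distribution, and for this linear model the mean can be checked against the deterministic limit exp((birth - death) t).

```python
import math


def master_equation_rhs(p, birth, death):
    """Right-hand side dP_n/dt for a linear birth-death process.

    p[n] is the probability of n lineages. The state space is truncated
    at n_max = len(p) - 1 (an assumption of this sketch), so probability
    flowing above n_max is lost.
    """
    n_max = len(p) - 1
    dp = [0.0] * (n_max + 1)
    for n in range(n_max + 1):
        gain_from_birth = birth * (n - 1) * p[n - 1] if n >= 1 else 0.0
        gain_from_death = death * (n + 1) * p[n + 1] if n < n_max else 0.0
        loss = (birth + death) * n * p[n]
        dp[n] = gain_from_birth + gain_from_death - loss
    return dp


def size_distribution(birth, death, t_end, n_max=150, steps=1000):
    """Integrate the truncated master equation with classical RK4.

    Starts from a single lineage at time 0 and returns the list
    [P_0(t_end), ..., P_n_max(t_end)].
    """
    p = [0.0] * (n_max + 1)
    p[1] = 1.0
    h = t_end / steps
    for _ in range(steps):
        k1 = master_equation_rhs(p, birth, death)
        k2 = master_equation_rhs(
            [x + 0.5 * h * k for x, k in zip(p, k1)], birth, death)
        k3 = master_equation_rhs(
            [x + 0.5 * h * k for x, k in zip(p, k2)], birth, death)
        k4 = master_equation_rhs(
            [x + h * k for x, k in zip(p, k3)], birth, death)
        p = [x + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


def yule_exact(n, birth, t):
    """Known Yule result: geometric distribution of size at time t."""
    return math.exp(-birth * t) * (1.0 - math.exp(-birth * t)) ** (n - 1)


if __name__ == "__main__":
    probs = size_distribution(birth=1.0, death=0.5, t_end=2.0)
    mean_size = sum(n * x for n, x in enumerate(probs))
    print("mean from master equation:", mean_size)
    print("deterministic limit:      ", math.exp((1.0 - 0.5) * 2.0))
```

The truncation level and step count need to be raised for longer time spans or faster growth, since probability that reaches `n_max` leaks out of the truncated system. Density-dependent or time-dependent rates would replace the constant `birth` and `death` arguments with functions of n or t.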
In SupMatB we compare the distribution of tree sizes for two congruent model scenarios. This requires first constructing congruent models and then evaluating the distribution of tree sizes.
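The construction step can be sketched in Python under one explicit assumption: that "congruent" is meant in the sense of Louca and Pennell (2020), where two birth-death models are congruent if they share the same pulled diversification rate, r_p(τ) = λ(τ) − μ(τ) + (1/λ(τ)) dλ/dτ with τ measured as age before the present, and the same present-day speciation rate and sampling fraction. The README does not spell out the construction used in SupMatB, so the function names, the finite-difference derivative, and the example rates below are illustrative only.

```python
def pulled_diversification_rate(lam, mu, tau, h=1e-5):
    """r_p(tau) = lam - mu + (1/lam) * d(lam)/d(tau), tau = age before present.

    The derivative is approximated by a central finite difference.
    """
    dlam = (lam(tau + h) - lam(tau - h)) / (2.0 * h)
    return lam(tau) - mu(tau) + dlam / lam(tau)


def congruent_extinction_rate(lam_ref, mu_ref, lam_alt, h=1e-5):
    """Return mu_alt such that (lam_alt, mu_alt) has the same pulled
    diversification rate as the reference model (lam_ref, mu_ref).

    Assumes lam_alt(0) == lam_ref(0) and an unchanged sampling fraction.
    The caller should check that the returned rate stays non-negative.
    """
    def mu_alt(tau):
        dlam = (lam_alt(tau + h) - lam_alt(tau - h)) / (2.0 * h)
        r_p = pulled_diversification_rate(lam_ref, mu_ref, tau, h)
        return lam_alt(tau) - r_p + dlam / lam_alt(tau)
    return mu_alt


# Hypothetical example: constant-rate reference model and an alternative
# whose speciation rate increases linearly with age.
def lam_ref(tau):
    return 1.0


def mu_ref(tau):
    return 0.5


def lam_alt(tau):
    return 1.0 + 0.5 * tau


mu_alt = congruent_extinction_rate(lam_ref, mu_ref, lam_alt)

if __name__ == "__main__":
    for tau in (0.0, 1.0, 3.0):
        print(tau,
              pulled_diversification_rate(lam_ref, mu_ref, tau),
              pulled_diversification_rate(lam_alt, mu_alt, tau))
```

Once a congruent pair is available, the tree size distribution under each model can be evaluated separately and the two distributions compared.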