Data from: Modeling the evolution of rates of continuous trait evolution
Martin, Bruce; Bradburd, Gideon; Harmon, Luke; Weber, Marjorie (2023), Data from: Modeling the evolution of rates of continuous trait evolution, Dryad, Dataset, https://doi.org/10.5061/dryad.9ghx3ffkb
Rates of phenotypic evolution vary markedly across the tree of life, from the accelerated evolution apparent in adaptive radiations to the remarkable evolutionary stasis exhibited by so-called “living fossils”. Such rate variation has important consequences for large-scale evolutionary dynamics, generating vast disparities in phenotypic diversity across space, time, and taxa. Despite this, most methods for estimating trait evolution rates assume rates vary deterministically with respect to some variable of interest or change infrequently during a clade’s history. These assumptions may cause underfitting of trait evolution models and mislead hypothesis testing. Here, we develop a new trait evolution model that allows rates to vary gradually and stochastically across a clade. Further, we extend this model to accommodate generally decreasing or increasing rates over time, allowing for flexible modeling of “early/late bursts” of trait evolution. We implement a Bayesian method, termed “evolving rates” (evorates for short), to efficiently fit this model to comparative data. Through simulation, we demonstrate that evorates can reliably infer both how and in which lineages trait evolution rates varied during a clade’s history. We apply this method to body size evolution in cetaceans, recovering substantial support for an overall slowdown in body size evolution over time with recent bursts among some oceanic dolphins and relative stasis among beaked whales of the genus Mesoplodon. These results unify and expand on previous research, demonstrating the empirical utility of evorates.
For the empirical example analyzing cetacean body length data ("05_cet_prep_data.R" and "06_cet_figures.R"): the cetacean phylogeny is from a recent study (see Lloyd et al. 2021 and the associated Dryad repository). We used the tree "Cetacea_Safe_Extant_MCC.tre" located in the compressed file "Cetacean_Metatree_Data.zip" in "Cetacean Metatree Data/Time TreeInference/Safe". Most cetacean body length data come from the "whales" dataset included in the R package geiger (after installing geiger, you can load this yourself using the code "data('whales')"). These data represent maximum adult female lengths in meters after a natural log transformation. All other cetacean body length data were taken from various primary literature sources and hard-coded into an R script. When possible, we also based these length data on maximum adult female lengths, though in some cases sex information was unavailable and/or all measured female specimens were juvenile. See Table S1 in the online appendix ("appendix.pdf") for more details on sources/caveats.
All other R scripts generate and analyze simulated data.
The README file contains more detailed information on scripts, folders, and file names, including the meaning of various notations and their associated inputs/outputs.
All relevant scripts/data are included in the compressed file "evorates.zip". Scripts are in the main folder of the compressed file and come with a numeric prefix corresponding to the order in which they should be run. Data, intermediate files (i.e., the output of a previous script that will be used as input for a later script), and figures/tables are organized into various folders generally relating to the corresponding part of the analysis (e.g., files beginning with "cet_" correspond to the empirical analysis; see Code/Software for more detailed information on which files each folder contains).
For convenience, we provide the most processed representations of the simulated data and associated model fits ("inf.res.full", "tru.res.full", and "prior.sens.res.full"), Hamiltonian Monte Carlo samples and accompanying information for models fitted to the cetacean data ("cet_fits/cet_<xxx>" and "cet_fits/EB_cet_<xxx>"), and results for the geometric Brownian motion simulation experiments ("gbm_results/sim.GBM.avg.<xxx>") described under the section "Approximating Geometric Brownian Motion Time-Averages" in "appendix.pdf". Due to their large size, all other intermediate files must be regenerated by running the appropriate R scripts (intermediate files are also available upon request from Bruce S. Martin at firstname.lastname@example.org).
Running these scripts requires the R package evorates, which is not yet available via CRAN. evorates can be installed from GitHub using the devtools package with the following line of code:
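A minimal sketch of the installation call, assuming the package is hosted at the GitHub repository path "bstaggmartin/evorates" (this path is an assumption and is not stated in this description; confirm it against the package's GitHub page before running):

```r
# Install devtools first if it is not already available.
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# ASSUMPTION: "bstaggmartin/evorates" is the presumed GitHub repository path.
devtools::install_github("bstaggmartin/evorates")
```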
Reproducing this study in full requires editing the directory paths in the scripts to match your own setup and could take a very long time. Use of a high-performance computing cluster is recommended, especially for running the scripts named "01_sim_study_script_<xxx>.R" or "11_prior_sens_study.R".
National Institute of General Medical Sciences, Award: R35GM137919
National Science Foundation, Award: DEB-1831164