DateLife: leveraging databases and analytical tools to reveal the dated Tree of Life

Citation

Sanchez Reyes, Luna Luisa; McTavish, Emily Jane; O'Meara, Brian (2022), DateLife: leveraging databases and analytical tools to reveal the dated Tree of Life, Dryad, Dataset, https://doi.org/10.5061/dryad.cnp5hqc6w

Abstract

Achieving a high-quality reconstruction of a phylogenetic tree with branch lengths proportional to absolute time (a chronogram) is a difficult and time-consuming task. However, the increased availability of fossil and molecular data, together with time-efficient analytical techniques, has resulted in many recent publications of large chronograms for a large number and wide diversity of organisms. Knowledge of the evolutionary time frame of organisms is key for research in the natural sciences. It also represents valuable information for education, science communication, and policy decisions. When chronograms are shared in public, open databases, this wealth of expertly curated and peer-reviewed data on evolutionary time frames is exposed in a programmatic and reusable way, as intensive and localized efforts have improved data sharing practices and incentivized open science in biology. Here we present DateLife, a service implemented as an R package and an R Shiny web application available at www.datelife.org, that provides functionalities for efficiently and easily finding, summarizing, reusing, and reanalyzing expert, peer-reviewed, public data on the time frame of evolution. The main DateLife workflow constructs a chronogram for any given combination of taxon names by searching a local chronogram database constructed and curated from the Open Tree of Life Phylesystem phylogenetic database, which also incorporates phylogenetic data from the TreeBASE database. We implement and test methods for summarizing time data from multiple source chronograms using supertree and congruification algorithms, and for using age data extracted from source chronograms as secondary calibration points to add branch lengths proportional to absolute time to a tree topology.
DateLife will be useful for increasing awareness of the existing variation among alternative hypotheses of evolutionary time for the same organisms. It can also foster exploration of how alternative evolutionary timing hypotheses affect the results of downstream analyses, providing a framework for a more informed interpretation of evolutionary results.

Methods

This dataset contains files, figures, and tables from the two examples shown in the manuscript (the small example and the Fringillidae example), as well as from the cross validation analysis.

Small example of the DateLife workflow. 1. Processed an input of 6 bird species within the Passeriformes (Pheucticus tibialis, Rhodothraupis celaeno, Emberiza citrinella, Emberiza leucocephalos, Emberiza elegans and Platyspiza crassirostris); 2. Used the processed names to search DateLife's chronogram database; 3. Summarized results from the matching chronograms.
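The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not DateLife's implementation or API: the function names (process_names, search_database, summarize), the toy CHRONOGRAM_DB, the study labels, and every age value are hypothetical, and the median is just one plausible way to summarize ages across chronograms.

```python
from statistics import median

# Hypothetical toy "database": each chronogram is stored as the age (in Myr)
# of the most recent common ancestor of pairs of taxa. All ages are made up.
CHRONOGRAM_DB = {
    "study_A": {
        frozenset({"Emberiza citrinella", "Emberiza leucocephalos"}): 2.0,
        frozenset({"Emberiza citrinella", "Emberiza elegans"}): 8.0,
        frozenset({"Emberiza leucocephalos", "Emberiza elegans"}): 8.0,
    },
    "study_B": {
        frozenset({"Emberiza citrinella", "Emberiza elegans"}): 10.0,
        frozenset({"Emberiza citrinella", "Pheucticus tibialis"}): 20.0,
        frozenset({"Emberiza elegans", "Pheucticus tibialis"}): 20.0,
    },
    "study_C": {
        frozenset({"Passer domesticus", "Passer montanus"}): 5.0,
    },
}


def process_names(raw_names):
    """Step 1: normalize separators and whitespace, drop duplicates."""
    cleaned = []
    for name in raw_names:
        name = " ".join(name.replace("_", " ").split())
        if name and name not in cleaned:
            cleaned.append(name)
    return cleaned


def search_database(names, db):
    """Step 2: per chronogram, keep the pairwise ages among queried taxa."""
    query = set(names)
    hits = {}
    for study, ages in db.items():
        matched = {pair: age for pair, age in ages.items() if pair <= query}
        if matched:  # a chronogram matches if it dates at least one queried pair
            hits[study] = matched
    return hits


def summarize(hits):
    """Step 3: median age per taxon pair across all matching chronograms."""
    pooled = {}
    for ages in hits.values():
        for pair, age in ages.items():
            pooled.setdefault(pair, []).append(age)
    return {pair: median(values) for pair, values in pooled.items()}
```

A query would chain the three functions: summarize(search_database(process_names(raw_input), CHRONOGRAM_DB)). Storing each chronogram as pairwise ages makes it straightforward to compare trees with different taxon sets, since only the pairs shared with the query need to be inspected.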

Fringillidae example: http://phylotastic.org/datelife/articles/fringiliidae.html

Cross validation: We performed a cross validation analysis of the DateLife workflow using 19 Fringillidae chronograms found in DateLife's database. We used the individual tree topologies from each of the 19 source chronograms as inputs, treating their node ages as unknown. We then estimated dates for these topologies using node ages of chronograms belonging to the remaining 12 studies as secondary calibrations, smoothing with BLADJ.
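The design of this analysis can be illustrated with a short Python sketch. It is a simplified stand-in, not the code used to produce the dataset: calibration_sources and bladj_like are hypothetical names, the tree encoding (a dict of children) is an assumption, and bladj_like only mimics the core idea of BLADJ, which fixes calibrated nodes and spaces undated nodes evenly between dated nodes and tips of age 0.

```python
def calibration_sources(chronograms, target_id):
    """For one input chronogram, return the ids of chronograms from all
    other studies. `chronograms` maps chronogram id -> study id."""
    target_study = chronograms[target_id]
    return [cid for cid, study in chronograms.items() if study != target_study]


def bladj_like(children, root, calibrations):
    """Assign an age to every node of a rooted tree (simplified, BLADJ-style).

    children: dict mapping each internal node to its list of child nodes.
    calibrations: dict of fixed node ages; the root must be included.
    Tips (nodes that are not keys of `children`) get age 0.
    """
    if root not in calibrations:
        raise ValueError("the root node needs a calibration")
    ages = dict(calibrations)

    def nearest_fixed(node, steps):
        # Yield (age, number of edges) for the nearest dated node or tip
        # on every path below `node`.
        for child in children.get(node, []):
            if child not in children:
                yield 0.0, steps + 1
            elif child in calibrations:
                yield calibrations[child], steps + 1
            else:
                yield from nearest_fixed(child, steps + 1)

    def visit(node, parent_age):
        if node not in children:
            ages[node] = 0.0
            return
        if node not in ages:
            # Even spacing along each path; keeping the oldest candidate
            # guarantees the node is older than all dated descendants.
            ages[node] = max(
                a + (parent_age - a) * k / (k + 1)
                for a, k in nearest_fixed(node, 0)
            )
        for child in children[node]:
            visit(child, ages[node])

    visit(root, None)
    return ages
```

In a cross validation loop, each topology would be dated with calibrations drawn only from calibration_sources(...), and the resulting ages compared with the node ages originally published for that chronogram.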

Usage Notes

DateLife is public, open-source software: https://github.com/phylotastic/datelife#readme

Funding

National Science Foundation, Award: ABI-1458603

National Science Foundation, Award: DBI-0905606

National Science Foundation, Award: ABI-145872

National Science Foundation, Award: ABI-1759846