Data from: Practical guide and review of fossil tip-dating in phylogenetics
Data files
Jan 09, 2026 version files (112.20 KB)
- README.md (4.35 KB)
- Supplementary_Table_S1.xlsx (107.85 KB)
Abstract
Phylogenetic tip-dating has been and still is revolutionizing evolutionary biology in several ways. Tip-dating, where fossils are placed into a phylogeny as tips based on molecular and/or morphological character information, provides a more principled approach to infer time-calibrated phylogenies compared with node-dating. Additionally, phylogenetic trees with fossils as tips become more and more important to elucidate evolutionary processes in macroevolutionary studies, e.g., deciphering diversification patterns and directional phenotypic evolution. Tip-dating is slowly gathering popularity in empirical applications and has progressed substantially since its first demonstration in 2011, with respect to improved statistical models, software, and datasets. Nevertheless, executing a phylogenetic tip-dating analysis is complicated and comes with many challenges. Here, we provide an extensive review and overview of methods and models for phylogenetic tip-dating analyses. We focus both on data collection and preparation as well as on modeling choices. We start with a survey of all published phylogenetic tip-dating studies to date, showing common data and modeling choices as well as trends towards new approaches. Then, we walk readers through sections of molecular evolution, morphological evolution (both for discrete and continuous data), and lineage evolution (the fossilized-birth-death process). In each section, we describe the data and standard models with their underlying assumptions, and provide an outlook and practical recommendations.
https://doi.org/10.5061/dryad.gb5mkkx03
Description of the data and file structure
Supplementary_Table_S1.xlsx
This file contains a literature survey of all tip-dating studies published to date. The scoring of missing (?) and not applicable (NA) entries is explained below.
The spreadsheet contains 35 columns which are as follows:
- Publication Name [author*s (year)] - The title of the publication.
- Publication Link - Online link to the publication on the publisher's website.
- Journal - The journal where the paper is published.
- Year - Year of publication.
- Software - Software used for the tip-dating study.
- Kingdom - Kingdom of taxa under study.
- Category - Category of organisms under study. E.g., Amphibians, Mammals, Reptiles etc.
- Focal taxon - Specific taxon under study.
- NTaxa (total) - Total number of taxa in the study.
- NTaxa (morpho) - Number of taxa with morphological character data.
- NTaxa (mol) - Number of taxa with molecular sequence data.
- NTaxa (mol+morph overlap) - Number of taxa with both molecular sequence data and morphological character data.
- NTaxa (fossil) - Number of fossils included in the study.
- N. morphological characters - Number of morphological characters in the morphological data matrix.
- Percent missing morpho - Percentage of missing or not applicable morphological characters in taxa with morphological data.
- Max N. of morphological character states - Maximum number of states of any morphological character.
- Continuous characters? - Whether continuous characters were used in the study.
- Morpho substitution model - Morphological substitution model used in the study.
- Ordered morpho characters? - Use of ordered characters in modeling morphological evolution.
- Morpho state frequencies - Whether the state frequencies were estimated or set as equal.
- Morpho partition - Partitioning approach for the morphological data
- Morpho branch lengths across partitions - Linked vs unlinked partitions; applicable only to partitioned datasets.
- N. molecular characters - Length of molecular sequence used in the study.
- N. molecular partitions - Number of subsets of molecular data used.
- Molecular data type - Type of molecular data used (DNA or Amino Acid)
- Molecular branch lengths across partitions - Linked vs unlinked; applicable only to partitioned datasets
- Tree prior - Type of tree prior distribution used in the study.
- Molecular clock model - Clock model applied to molecular data
- Morphological clock model - Clock model applied to morphological data
- Clock partitions - Is every data subset modeled with its own clock?
- Sampled ancestors? - Whether sampled ancestors were allowed.
- Fossil tip age prior - Prior specified for the fossil tip ages.
- Topological constraints - Use of topological constraints in the analyses.
- Root age prior - What prior is specified for the root age.
- Additional age constraints - Use of additional node calibrations
Not Applicable (NA)
Columns related to molecular data (e.g., N. molecular characters, Molecular clock model) will be empty for studies that used only morphological data.
Columns related to data partitioning (e.g., Morpho branch lengths across partitions) will be empty if the study did not partition its data.
The Morphological clock model column will be empty if the study did not include morphological data in the clock model analysis.
Not Available (?)
In general, '?' in a cell means that the information could not be retrieved from the study.
For example, a cell in the Percent missing morpho column might be empty because, although the study used a morphological matrix, the authors did not report the percentage of missing data.
Similarly, a cell in the Root age prior column might be empty if the authors did not explicitly state the prior used for the root age.
In all cases, a blank cell means the information could not be coded from the source publication, and users of this dataset should treat it as a missing value (NA) in their analyses. A value of zero (0) is used only when it is an explicitly reported quantity (e.g., zero fossil taxa).
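The missing-value convention described above can be sketched as a small cleaning step applied after the spreadsheet has been read into rows. This is a minimal illustration, not part of the original dataset or its analysis scripts: it assumes that blank cells, `?`, and the literal string `NA` are the only markers in use, and it collapses "not applicable" and "not available" into a single missing value. The function names and the example row are hypothetical.

```python
# Markers assumed to denote missing or not-applicable entries.
# This set is an assumption; adjust it to the actual cell coding in the file.
MISSING_MARKERS = {"", "?", "NA"}


def clean_cell(value):
    """Return None for blank, '?', or 'NA' cells; return any other value unchanged."""
    if value is None:
        return None
    if isinstance(value, str) and value.strip() in MISSING_MARKERS:
        return None
    # Numeric zero is an explicitly reported quantity and is kept as-is.
    return value


def clean_row(row):
    """Apply clean_cell to every column of one survey row (a dict)."""
    return {column: clean_cell(value) for column, value in row.items()}


# Hypothetical row for illustration only; the values are not taken from the spreadsheet.
example = {
    "Year": 2020,
    "NTaxa (fossil)": 0,
    "Root age prior": "?",
    "Molecular clock model": "",
    "Percent missing morpho": " NA ",
}
cleaned = clean_row(example)
```

In this sketch the zero in `NTaxa (fossil)` survives as a real value, while the `?`, blank, and `NA` cells all become `None`. Users who need to keep "not applicable" separate from "not available" would have to map the two markers to different sentinels instead.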
Software/Code
N/A
We conducted a literature survey to gather information about tip-dating analyses on empirical datasets. We used Google Scholar to examine all publications citing a selection of foundational tip-dating studies, and among those, we identified publications performing tip-dating analyses on empirical datasets. We compiled relevant information, including software of choice, dataset size, types of data used, and clock and tree models employed. Where possible, this information was extracted directly from the data and script files provided as supplementary material of each publication; otherwise, we used the information given in the methodological sections of the main and supplementary texts.
While we originally included tip-dating studies without morphological data in which either the whole topology was fixed or topological constraints were applied to extinct taxa, we pruned those studies from our survey before downstream analyses for summary figures and statistics. This is because we wanted to focus on studies that use tip-dating as an analytical approach to co-estimate the topological position of extinct taxa with divergence times, rather than just estimating divergence times on an a priori fixed topology, or pruning extinct tips afterwards to obtain a dated phylogenetic tree of only extant species.
