
Data from: How to date a crocodile – estimation of neosuchian clade ages and a comparison of four time-scaling methods


Groh, Sebastian; Upchurch, Paul; Barrett, Paul M.; Day, Julia J. (2022), Data from: How to date a crocodile – estimation of neosuchian clade ages and a comparison of four time-scaling methods, Dryad, Dataset,


Clade ages within the crocodylomorph clade Neosuchia have long been debated. Molecular and morphological studies have yielded remarkably divergent results. Despite recent advances, there has been no comprehensive relative comparison of the major time calibration methods available to estimate clade ages based on morphological data. We used four methods (cal3, Extended Hedman [EH], smoothed Ghost-Lineage-Analysis [sGLA] and the Fossilised Birth-Death model [FBD]) to date clade ages derived from a published crocodylomorph supertree and a new neosuchian phylogeny. All time-scaling methods applied here agree on the origination of Neosuchia during the Late Triassic/Early Jurassic, and the presence of the major extant eusuchian groups (Crocodyloidea, Gavialoidea, Alligatoroidea, and Caimaninae) by the end of the Late Cretaceous. The number of distinct lineages present before the K/Pg boundary is less certain, with support for two competing scenarios in which Crocodylinae, Tomistominae and Diplocynodontinae either: 1) diverged from other eusuchian lineages before the K/Pg boundary; or 2) evolved during a ‘burst’ of diversification after the K/Pg event. Cal3 and FBD are identified as the most suitable methods for time-scaling phylogenetic trees dominated by fossil taxa. Extended Hedman estimates are substantially older than the others, with larger standard deviations and a strong vulnerability to taxon sampling and topological changes. sGLA has similar problems and cannot be recommended either. We conclude that a detailed understanding of phylogenetic relationships, tree reconstruction methods, and good taxonomic coverage (in particular the inclusion of the oldest taxon in each clade) is essential when evaluating the results of such dating analyses.
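The point about including the oldest taxon in each clade can be illustrated with a minimal sketch. It does not implement any of the four methods compared in the study and is not taken from the authors' R code; it only applies the basic fossil constraint that a clade cannot be younger than its oldest known member. The taxon names, ages, tree shape and function names below are hypothetical placeholders, not values from the dataset.

```python
from typing import Dict, List, Union

# A tree is either a tip name or a list of child subtrees.
Tree = Union[str, List["Tree"]]


def tips(tree: Tree) -> List[str]:
    """Return the names of all tips in a nested-list tree."""
    if isinstance(tree, str):
        return [tree]
    names: List[str] = []
    for child in tree:
        names.extend(tips(child))
    return names


def minimum_clade_age(tree: Tree, first_appearance_ma: Dict[str, float]) -> float:
    """Minimum clade age in Ma: the oldest first appearance among its tips."""
    return max(first_appearance_ma[name] for name in tips(tree))


# Hypothetical first-appearance dates (millions of years ago), for illustration only.
first_appearance_ma = {"taxon_A": 150.0, "taxon_B": 95.0, "taxon_C": 70.0}

full_clade: Tree = [["taxon_A", "taxon_B"], "taxon_C"]
pruned_clade: Tree = ["taxon_B", "taxon_C"]  # same clade with the oldest taxon left out

full_age = minimum_clade_age(full_clade, first_appearance_ma)
pruned_age = minimum_clade_age(pruned_clade, first_appearance_ma)

print(f"minimum age with oldest taxon:    {full_age} Ma")
print(f"minimum age without oldest taxon: {pruned_age} Ma")
```

In this toy case, omitting the oldest taxon lowers the minimum-age bound for the clade, so any method that starts from fossil occurrences is working from a different constraint.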


Dataset contains:

1. Discretised and rediscretised versions of the character matrix, scored by the authors from first-hand examination of museum specimens and supplemented with information from the literature (based on Groh et al. 2020). The character lists on which the scores are based are also provided.

2. Analysis protocol for the datasets, to be used in TNT.

3. Raw data for the time-scaling analyses: all taxa in each tree together with their associated age information, plus raw R and .xml files. Data were collected via literature review and the Paleobiology Database (PBDB).

4. The R code used for analysis, based on earlier work by other authors (cited in main paper) and code written by the authors.

5. Raw results for the time-scaling analysis, detailing the age of each named node obtained by each dating method for each tree.

Usage Notes

File structure, file names and file contents are explained in detail in the file Groh_2020_croc_times_README.txt.


Natural Environment Research Council, Award: NE/L002485/1