Data supporting the study: Multiple dating methods and insights from the fossil record reveal that bark beetles predate the angiosperm terrestrial revolution
Data files
Mar 19, 2026 version files 53.31 MB
- File_S1_S4.zip (53.31 MB)
- README.md (7.32 KB)
Abstract
Estimating the time of origin of a given clade is not trivial. Most dating studies rely on a node dating approach, which is highly sensitive to uncertainty in fossil placement. Bark beetles (Curculionidae: Scolytinae) are a clear example. Molecular dating of the subfamily has mostly been anchored to a single fossil, †Cylindrobrotus pectinatus (125.77–121.4 Ma). Although this fossil is often used as a calibration point on the stem node of the subfamily, its phylogenetic position has never been rigorously tested through formal analyses, casting uncertainty over previous estimates. Here, we rely on total-evidence dating, integrating morphological and molecular data along with 19 well-preserved scolytine fossils, to refine the divergence time estimates of the subfamily. We additionally used a traditional node dating approach, as well as the Bayesian Brownian Bridge model to independently analyze species-level fossil occurrences of Scolytinae, and compared the results. Under the same priors and with †C. pectinatus included, the Bayesian Brownian Bridge produced ages in line with those of total-evidence dating. However, when †C. pectinatus is excluded, ages are about 100 million years younger than in the total-evidence dating and node dating analyses. In our case, total-evidence dating with different clock parameters for each gene and for the morphological data appeared more appropriate than node dating for estimating the age of Scolytinae within a phylogenetic framework. Additionally, our results suggest that †C. pectinatus is more closely related to the Dryocoetini s. l. or Ipini–Dryocoetini s. l. clades than to a stem lineage of Scolytinae. Across all analyses, when †C. pectinatus is considered inside the subfamily, Scolytinae are consistently inferred to have originated at least 131.2 Ma, predating the Angiosperm Terrestrial Revolution.
These results highlight the importance of integrating multiple dating approaches to mitigate biases inherent to any single method, ultimately leading to more reliable divergence estimates.
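Title of the Dataset: Data supporting the study: Multiple dating methods and insights from the fossil record reveal that bark beetles predate the angiosperm terrestrial revolution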
Corresponding author:
Name: Ferreira Jules
Institution: Institut Botànic de Barcelona (CSIC-CMCNB), 08038 Barcelona, Spain
Facultat de Ciències de la Terra, Universitat de Barcelona, 08028 Barcelona, Spain
Email: jules.ferreira27@gmail.com
File S1. Morphological and molecular matrices used in this study.
Description
Files containing molecular and morphological data used in phylogenetic analyses.
These data are derived from Jordal et al. (2011), Jordal and Cognato (2012) and Pistone et al. (2018). We also modified the morphological matrix by adding new taxa.
File S2. MrBayes input files and trees produced before the total-evidence dating and node dating analyses in order to determine the best priors.
Description
These results were obtained before running the total-evidence dating (TED) and node dating (ND) analyses. They provide the data needed to select the best priors for those analyses.
File S3. All total-evidence (without dating and with dating) and node dating input files used in this study with MrBayes, along with the trees obtained. RoguePlots results are available for all the fossils in all analyses.
Description
All the input files, ready to be used in MrBayes, for all TED and ND analyses conducted in this study. It also contains the trees obtained from these analyses and the RoguePlots results, which make it possible to visualize the uncertainty in fossil placement in each analysis.
The file is organized into 6 subfiles: 2 containing scripts and trees from the total-evidence without dating (TE) analyses; 2 containing total-evidence dating (TED) analyses with either the gene-based partition scheme or the codon position partition scheme; and 2 containing node dating (ND) analyses with either the gene-based partition scheme or the codon position partition scheme.
File S4. Data and commands used in rootBBB, including the extant diversity table and the diversity table over time, organized into 1-million-year bins.
Description
The command lines needed to rerun the rootBBB analyses as done in this work. A table of the fossils used, with their ages and references, is also provided.
- Text file "Prompts.txt": The commands used to run the rootBBB model follow this structure: "python3.12 rootBBB.py -fossil_data C:\RootBBB\Scolytinae\Fossil.txt -div_table C:\RootBBB\Scolytinae\diversity.txt -q_var 1 -clades 0 12 -n 1000000 -s 1000 -max_age 165 -out age_165".
  - "python3.12" is the Python interpreter used to run the script "rootBBB.py", available at "https://github.com/dsilvestro/rootBBB".
  - "-fossil_data" is an argument used to specify the table with fossil occurrences.
  - "-div_table" is an argument used to specify the table with the extant species diversity for each of the studied clades.
  - "-q_var 1" is an argument used to specify a preservation model in which the rate increases exponentially through time.
  - "-clades" is an argument used to specify the clades from the diversity and fossil tables that you want to analyse.
  - "-n" is an argument used to specify the number of generations to use for the Bayesian analysis.
  - "-s" is an argument used to specify the sampling frequency.
  - "-max_age" is an argument used to specify a maximum age for the analyses. The age of the considered clade therefore cannot exceed this value.
  - "-out" is an argument used to name the output files.
- Text file "Fossil.txt": This text file contains the fossil occurrences of Scolytinae, grouped into 1-million-year bins.
  - The first column refers to time in million years.
  - The number following "Scolytinae_" in each column header ("0", "1" or "2") identifies analyses that use different numbers of extant species but the same distribution of fossils. Here "0" corresponds to an extant diversity set to 6,509; "1" to an extant diversity set to 8,000; and "2" to an extant diversity set to 10,000.
  - The columns "Scolytinae_0_Cyl_Mic", "Scolytinae_1_Cyl_Mic", "Scolytinae_2_Cyl_Mic" correspond to analyses where the fossils Cylindrobrotus pectinatus and Microborus inertus are both included.
  - The columns "Scolytinae_0_Cyl", "Scolytinae_1_Cyl", "Scolytinae_2_Cyl" correspond to analyses where the fossil Cylindrobrotus pectinatus is included and the fossil Microborus inertus is excluded.
  - The columns "Scolytinae_0_Mic", "Scolytinae_1_Mic", "Scolytinae_2_Mic" correspond to analyses where the fossil Cylindrobrotus pectinatus is excluded and the fossil Microborus inertus is included.
  - The columns "Scolytinae_0", "Scolytinae_1", "Scolytinae_2" correspond to analyses where the fossils Cylindrobrotus pectinatus and Microborus inertus are both excluded.
- Text file "Div_table.txt": This text file contains the species diversity value assigned to each of the groups considered.
  - The column "Subfamily" refers to the considered clades, i.e., each of the groups described above.
  - The column "N_species" corresponds to the species diversity assigned to each considered group.
- Text file "Table_all_RootBBB_analyses.txt": This text file contains the results obtained for each of the groups considered above.
  - The column "subfam" refers to the group considered, as defined above, and also reflects the parameters of the analysis, such as a different maximum age set with the "-max_age" argument.
  - The column "root_est" corresponds to the mean age estimated for the considered group by the Bayesian Brownian Bridge model, in million years.
  - The column "root_HPD" corresponds to the 95% highest posterior density (HPD) interval of the age estimated for the considered group by the Bayesian Brownian Bridge model, in million years.
  - The columns "ext_est" and "ext_HPD" correspond to the estimated age of extinction and the 95% highest posterior density interval of that estimate. Both are reported as "NA" for every group because none of the groups is extinct, so no extinction age was estimated.
- CSV file "Fossil_taxa.csv": This file contains all information about the fossils used in this study.
  - The column "accepted_name" refers to the accepted taxonomic species name of the fossils used.
  - The columns "min_age" and "max_age" refer to the minimum and maximum stratigraphic age of the fossil considered, respectively. The unit for both columns is million years.
  - The column "primary_reference" contains all the references that either describe the fossils used or clarify their taxonomic assignment.
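
As a reading aid for the naming scheme of the "Fossil.txt" columns and the structure of the rootBBB command described above, here is a minimal Python sketch. It is not part of the deposited files. The function names `decode_column` and `rootbbb_args` are hypothetical, the relative file paths are placeholders for the paths on your own machine, and naming the output after the maximum age is an assumption based on the single example command ("-max_age 165 -out age_165"). Only the flags and values quoted above are used.

```python
import re
import shlex

# Extant diversity settings for the numeric suffix, as listed for "Fossil.txt".
EXTANT_DIVERSITY = {0: 6509, 1: 8000, 2: 10000}


def decode_column(name):
    """Decode a column header such as "Scolytinae_1_Cyl" into its settings."""
    match = re.fullmatch(r"Scolytinae_([012])((?:_Cyl)?)((?:_Mic)?)", name)
    if match is None:
        raise ValueError(f"unrecognised column name: {name!r}")
    return {
        "extant_diversity": EXTANT_DIVERSITY[int(match.group(1))],
        "includes_cylindrobrotus": bool(match.group(2)),
        "includes_microborus": bool(match.group(3)),
    }


def rootbbb_args(max_age, fossil_path="Fossil.txt", div_path="diversity.txt"):
    """Build the argument list for one run, varying only the maximum age.

    The paths are placeholders, and the "age_<max_age>" output label is an
    assumption taken from the example command, not a documented convention.
    """
    return [
        "python3.12", "rootBBB.py",
        "-fossil_data", fossil_path,
        "-div_table", div_path,
        "-q_var", "1",
        "-clades", "0", "12",
        "-n", "1000000",
        "-s", "1000",
        "-max_age", str(max_age),
        "-out", f"age_{max_age}",
    ]


if __name__ == "__main__":
    print(decode_column("Scolytinae_1_Cyl"))
    # Print the command for inspection; it is not executed here.
    print(shlex.join(rootbbb_args(165)))
```

The argument list can be passed to `subprocess.run` from the folder that contains "rootBBB.py" once the paths have been adapted.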
File S5. R scripts used to plot the results from the rootBBB analyses and to calculate the best fit priors for TED and ND analyses.
Description
Two R scripts used in the study: one to calculate best fitting priors for TED and ND analyses, the other to plot rootBBB results.
The R script used to calculate best fitting priors is derived from Ronquist et al. (2012).
