Data from: Phylogenomics and a new fossil synthesis illuminate the early evolution of palms (Arecaceae)
Data files
Apr 28, 2026 version files: 66.23 MB
- Alignments_and_Trees_for_Dryad.zip (66.23 MB)
- README.md (3.50 KB)
Abstract
Tropical rainforests are home to almost half of plant diversity, yet a shortfall in phylogenetic hypotheses for tropical plants hinders our understanding of how rainforests have formed and adapted to past global changes. Phylogenetic and historical biogeographic evidence from key rainforest lineages, such as palms (Arecaceae), is required to illuminate the history of these ecosystems. However, our current understanding of the palm tree of life is based on uneven sampling of plastid and nuclear data. Moreover, numerous palm genera and palm fossils have been described or revised over the past decade, casting doubt on palm relationships, ages, and ancestral ranges inferred in early studies. Here, we infer the phylogenetic relationships of all 184 palm genera based on data from 1,033 nuclear genes generated using target sequence capture. Our palm phylogenomic tree is highly resolved and supported. Remaining areas of ambiguity reflect the complex dynamics of palm evolution, including rapid diversification events in subfamily Arecoideae and putative cases of ancient reticulation throughout the family. We undertake a comprehensive review of the palm fossil record and use a vetted selection of fossils to estimate divergence times with two Bayesian methods, the first based on calibration of five nodes using the age of fossils assigned to them, and the second based on co-estimation of divergence times and phylogenetic placements of 113 fossils with a Fossilized Birth-Death model. We then use the distribution ranges of extant and fossil taxa to infer ancestral ranges. We show that the palm family first diversified in the Early Cretaceous in regions corresponding to what is now North, Central, and South America and Oceania, that many tribes and subtribes had originated by the Late Cretaceous, and that two-thirds of the genera had diverged by the Oligocene. 
Fossil-informed analyses provide a more complex picture of the early biogeography of palms than analyses relying only on the ranges of extant taxa. Despite uncertainties regarding fossil placement, it is clear that palms dispersed dozens of times across oceanic gaps, and that dispersal and extirpation patterns are consistent with an ancient affinity of palms for megathermal climates. Our dated phylogenomic trees and curated fossil dataset provide a new foundation for evolutionary studies on palms, opening the door to deeper research on the rainforest biome in which they thrive.
https://doi.org/10.5061/dryad.pzgmsbcwg
Description of the data and file structure
This repository contains the Supplementary Methods, Supplementary Figures (Bellot_et_al_Origin_and_Evolution_of_Palms_Supplementary_Material.pdf), and Supplementary Tables (Bellot_et_al_Origin_and_Evolution_of_Palms_Supplementary_Tables_S1_S2_S3_S4_S5_S6.xlsx) associated with the above publication, as well as alignment, gene tree, and species tree files (Alignments_and_Trees_for_Dryad.zip).
More details on the Supplementary Methods, Figures, and Tables are included at the beginning of the Supplementary Materials file.
The Alignments_and_Trees_for_Dryad.zip archive contains the following:
Raw_sequence_files_before_alignment.zip: A directory containing the sequences of the 1255 loci before alignment. The directory contains one file per locus, and each file contains all the sequences (samples) retrieved for that locus, in FASTA format.
Raw_sequence_alignments.zip: A directory containing the alignments of the 1211 loci that had enough sequences retrieved to produce an alignment, before alignment cleaning. Each alignment is in FASTA format in a separate file.
Clean_sequence_alignments.zip: A directory containing the 1118 alignments that remained after the alignments were cleaned as described in the publication and the Supplementary Methods file. Each alignment is in FASTA format in a separate file.
Gene_trees.zip: A directory containing all 1117 gene trees obtained from the IQ-TREE analysis (tree inference failed for one of the alignments). Each tree is in Newick format in a separate file.
All_genes_trees.tre: A file with the 1033 gene trees from the IQ-TREE analysis that were selected for downstream use as described in the publication. Each tree is in Newick format, with one tree per line.
All_genes_trees_low_support_branches_collapsed.tre: A file with the same 1033 selected gene trees after branches with bootstrap support lower than 10% were collapsed. Each tree is in Newick format, with one tree per line.
Orthologs_genes_trees.tre: A file with the 231 orthologous gene trees obtained from the IQ-TREE analysis, in Newick format, with one tree per line.
Orthologs_genes_trees_low_support_branches_collapsed.tre: A file with the 231 orthologous gene trees obtained from the IQ-TREE analysis after branches with bootstrap support lower than 10% were collapsed, in Newick format, with one tree per line.
Species_tree_all_genes_rooted.tre: The rooted species tree (in Newick format) obtained from the ASTRAL analysis of the trees in the All_genes_trees_low_support_branches_collapsed.tre file.
Species_tree_orthologs_rooted.tre: The rooted species tree (in Newick format) obtained from the ASTRAL analysis of the trees in the Orthologs_genes_trees_low_support_branches_collapsed.tre file.
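After extracting the archive, the file counts listed above can be used as a quick integrity check. The following is a minimal Python sketch, not part of the published workflow. It assumes the multi-tree `.tre` files sit directly in the extracted directory and that each tree line ends with a semicolon, as is standard for Newick. The function names and the `EXPECTED_TREE_COUNTS` table are hypothetical helpers introduced here for illustration only.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines (those starting with '>')."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith(">"))


def count_newick_trees(path):
    """Count trees in a file that stores one Newick tree per line (blank lines are ignored)."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip().endswith(";"))


# Tree counts as stated in this README (assumed file locations: top level of the extracted directory).
EXPECTED_TREE_COUNTS = {
    "All_genes_trees.tre": 1033,
    "All_genes_trees_low_support_branches_collapsed.tre": 1033,
    "Orthologs_genes_trees.tre": 231,
    "Orthologs_genes_trees_low_support_branches_collapsed.tre": 231,
}


def check_tree_counts(directory, expected=EXPECTED_TREE_COUNTS):
    """Return {filename: (found, expected)} for every file that is missing or has a different count.

    A missing file is reported with found=None. An empty result means all counts match.
    """
    mismatches = {}
    for name, want in expected.items():
        path = Path(directory) / name
        found = count_newick_trees(path) if path.exists() else None
        if found != want:
            mismatches[name] = (found, want)
    return mismatches
```

Calling `check_tree_counts("path/to/extracted/archive")` returns an empty dictionary when every multi-tree file holds the number of trees stated above. `count_fasta_records` can be applied in the same way to any per-locus sequence or alignment file to see how many samples it contains.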
Sharing/Access information
The data is also publicly accessible on GitHub, together with the scripts used to perform the molecular dating analyses:
https://github.com/sidonieB/Bellot_et_al_Palm_Early_Evolution_Supplementary_Material_Version_2
