Data from: Phylogenomic insights into the relationship and the evolutionary history of planthoppers (Insecta: Hemiptera: Fulgoromorpha)
Data files
Dec 28, 2024 version files 18.47 GB
-
README.md
5.86 KB
-
Supporting_Information_Data_File_S1_iqtree_outputs_from_mitogenomes_of_149_species.tar.gz
2.72 MB
-
Supporting_Information_Data_File_S2_iqtree_outputs_from_mitogenomes_of_282_species.tar.gz
5.20 MB
-
Supporting_Information_Data_File_S3_iqtree_outputs_from_1164_nuclear_markers_of_149_species.tar.gz
5.70 GB
-
Supporting_Information_Data_File_S4_iqtree_outputs_from_nuclear_and_mitogenomes_of_246_species.tar.gz
5.91 GB
-
Supporting_Information_Data_File_S5_iqtree_outputs_from_nuclear_and_mitogenomes_of_285_species.tar.gz
3.55 GB
-
Supporting_Information_Data_File_S6_ASTRAL_outputs_based_on_1164_nuclear_markers_from_149_species.tar.gz
62.18 MB
-
Supporting_Information_Data_File_S7_Beast_runs.tar.gz
3.23 GB
Abstract
Planthoppers (Hemiptera: Fulgoromorpha) are a species-rich and globally distributed insect clade with high economic, ecological, and evolutionary importance. However, the relationships among planthopper lineages and families remain unclear. Previous efforts based on inconsistent morphological traits, a few genes, or limited sampling often resulted in conflicting tree topologies. Here, we used genome-level data to assemble 1164 nuclear single-copy genes and 13 mitochondrial protein-coding genes for 149 planthopper species representing 19 out of 21 extant families. Additional markers were added from published mitogenomes, expanding our sampling to 285 species. These markers were used for Maximum Likelihood-based tree inference and dating analyses. The newly inferred phylogenies validated well-accepted relationships and recovered novel placements for several taxa, including the family Achilixiidae and species from the tribe Lyncidini and genus Madagascaritia in Dictyopharidae and Fulgoridae. Based on molecular and morphological evidence, we proposed taxonomic changes including the establishment of a new family Borysthenidae stat. rev. within Delphacoidea and a new superfamily Meenoploidea superfam. nov. with redefined Kinnaridae stat. rev. and Meenoplidae stat. rev.. The time analyses based on 57 nuclear markers and 30 fossils dated the origin of extant Fulgoromorpha back to Guadalupian, Permian (~263 Ma), close to the maximum constraint at 267.3 Ma, while applying an older root constraint resulted in an origin in Mississippian, Carboniferous (~332 Ma). While future sampling of unstudied fauna in unexplored regions or habitats may change the topology, the current phylogenomic analysis will serve as a solid foundation for research into planthopper ecology, evolution, and significance.
README: Phylogenomic insights into the relationship and the evolutionary history of planthoppers (Insecta: Hemiptera: Fulgoromorpha)
https://doi.org/10.5061/dryad.3ffbg79sc
This data repository contains gene alignments and outputs from all phylogenomic analyses (tree inference and divergence time analysis) in Deng et al. (2024)
Description of the data and file structure
Supporting Information_Data File S1. iqtree outputs from mitogenomes of 149 species
Files with the prefix mtfulgo_149sp_wo3rd are outputs based on DNA models. The gene alignments and partitions used for this run are mtfulgo_nucl_149sp_final_wo3rd.phy and mtfulgo_nucl_149sp_final_wo3rd.nex
Files with the prefix mtfulgo_aa_149sp are outputs based on protein models. The gene alignments and partitions used for this run are mtfulgo_aa_149sp_final.phy and mtfulgo_aa_149sp_final.nex
The gene alignments WITH the 3rd codon position are mtfulgo_nucl_149sp_final.phy and mtfulgo_nucl_149sp_final.nex, which only served as templates for amino acid translation and were not used for any runs as described in the methods section.
Supporting Information_Data File S2. iqtree outputs from mitogenomes of 282 species
Files with the prefix mtfulgo_282sp_wo3rd are outputs based on DNA models. The gene alignments and partitions used for this run are mtfulgo_nucl_282sp_wo3rd.phy and mtfulgo_nucl_282sp_wo3rd.nex
Files with the prefix mtfulgo_aa_282sp are outputs based on protein models. The gene alignments and partitions used for this run are mtfulgo_aa_282sp.phy and mtfulgo_aa_282sp.nex
The gene alignments WITH the 3rd codon position are mtfulgo_nucl_282sp.phy and mtfulgo_nucl_282sp.nex, which only served as templates for amino acid translation and were not used for any runs as described in the methods section.
Supporting Information_Data File S3. iqtree outputs from 1164 nuclear markers of 149 species
Files with the prefix fulgo_149sp_wo3rd are outputs based on DNA models. The gene alignments and partitions used for this run are fulgo_nucl_final_wo3rd.phy and fulgo_nucl_final_wo3rd.nex
Files with the prefix fulgo_aa_149sp_merge are outputs based on protein models. The gene alignments and partitions used for this run are fulgo_aa_final.phy and fulgo_aa_final.nex
The gene alignments WITH the 3rd codon position are fulgo_nucl_final.phy and fulgo_nucl_final.nex, which only served as templates for amino acid translation and were not used for any runs as described in the methods section.
Supporting Information_Data File S4. iqtree outputs from nuclear+mitogenomes of 246 species
Files with the prefix fulgo_246sp_wo3rd are outputs based on DNA models. The gene alignments and partitions used for this run are fulgo_ZANNA_mito_nucl_final_wo3rd.phy and fulgo_ZANNA_mito_nucl_final_wo3rd.nex
Files with the prefix fulgo_aa_246sp_merge are outputs based on protein models. The gene alignments and partitions used for this run are fulgo_ZANNA_mito_aa_final.phy and fulgo_ZANNA_mito_aa_final.nex
The gene alignments WITH the 3rd codon position are fulgo_ZANNA_mito_nucl_final.phy and fulgo_ZANNA_mito_nucl_final.nex, which only served as templates for amino acid translation and were not used for any runs as described in the methods section.
Supporting Information_Data File S5. iqtree outputs from nuclear+mitogenomes of 285 species
Files with the prefix fulgo_285sp_wo3rd are outputs based on DNA models. The gene alignments and partitions used for this run are fulgo_nucl_285sp_wo3rd.phy and fulgo_nucl_285sp_wo3rd.nex
Files with the prefix fulgo_aa_285sp are outputs based on protein models. The gene alignments and partitions used for this run are fulgo_aa_285sp.phy and fulgo_aa_285sp.nex
The gene alignments WITH the 3rd codon position are fulgo_ZANNA_mito_nucl_final.phy and fulgo_ZANNA_mito_nucl_final.nex, which only served as templates for amino acid translation and were not used for any runs as described in the methods section.
Supporting Information_Data File S6. ASTRAL outputs based on 1164 nuclear markers from 149 species
The file all_trees_wo3rd_for_ASTRAL.tre contains ML trees inferred from individual gene alignments in the folder MSA_wo3rd.
The folder MSA contains individual gene alignments WITH the 3rd codon position, which were not used for this analysis
Supporting Information_Data File S7. Beast runs
The folder all_beast contains the input alignment of 57 genes (all.nex), the xml files for the "sample from prior" run (all.s1.prior) and three independent replicates (all.s1, all.s2, all.s3). All analyses in this folder were run with uniform fossil prior under age constraints based on fossil history and host plants (see methods section).
Other folders have a similar file structure but different settings:
all_beast_Johnson: 57 genes + uniform fossil prior + alternative age constraints from Johnson et al. (2018)
all_beast_logNormal_S1: 57 genes + log-normal fossil prior (S=1) + our age constraints
all_beast_logNormal_S05: 57 genes + log-normal fossil prior (S=0.5) + our age constraints
The remaining folders p1_beast, p2_beast, p3_beast, p4_beast, and p5_beast contain runs based on five subsets of 57 genes under uniform fossil prior and our age constraints.
Code/Software
No scripts were included in this repository.
The software used for data processing, such as iqtree, ASTRAL, and BEAST, were described in the methods section. The analysis pipeline can be found at https://github.com/junchen-deng/phylogenomic_analysis_pipeline.
Methods
Sample processing and metagenomic sequencing
Individuals of 151 planthopper species from 19 families were sampled ethically following all applicable international and local permitting requirements from their natural habitats around the world (Table S1). The remaining two established families, Gengidae and Hypochthonellidae, known only from limited specimens collected in southern Africa and, to our knowledge, with no molecular data published, are not included due to sampling difficulties. For all individuals, we extracted DNA from either the abdomen or bacteriome – an organ hosting symbiotic bacteria, using various DNA extraction kits (Table S1). Metagenomic libraries of each individual were prepared with NEBNext Ultra II DNA Library Prep Kit (BioLabs, New England) and sequenced on Illumina HiSeq 2500 in a Rapid Run mode (2x250bp), HiSeq X, or NovaSeq 6000 S4 (2x150bp reads) (Table S1).
Single-copy nuclear gene assembly and multiple sequence alignment
We prepared the multiple sequence alignment (MSA) for phylogenetic analysis in the following steps: 1) the preparation of a custom single-copy gene reference database; 2) single-copy gene assembly from metagenomic data; 3) alignment and filtering. The analysis pipeline and the custom Python scripts can be found on the GitHub page (https://github.com/junchen-deng/phylogenomic_analysis_pipeline).
To enable read mapping and gene assembly from generally low-coverage genomic datasets, we first assembled our custom single-copy gene database for planthoppers using BUSCO v5.4.4 (Manni et al., 2021). The hemipteran ortholog database (hemiptera_odb10, v2020-08-05), including 2510 single-copy orthologs, was downloaded from BUSCO. Then, we extracted from GenBank genome assemblies of species covering a wide range of evolutionary distances to planthoppers. This included chromosome-level assemblies from pea aphid Acyrthosiphon pisum (Sternorrhyncha: Aphididae), glassy-winged sharpshooter Homalodisca vitripennis (Cicadomorpha: Clypeata: Membracoidea), sugarcane spittlebug Callitettix versicolor (Cicadomorpha: Clypeata: Cercopidae) and brown planthopper Nilaparvata lugens (Fulgoromorpha: Delphacoidea), and scaffold-level assembly from ‘Zanna’ intricata (synonym to Pyrops intricatus; Fulgoromorpha: Fulgoroidea) (Table S1). We then fed these assemblies to BUSCO and identified 2447, 2098, 2340, 2322, and 1168 complete single-copy genes, respectively. Finally, we combined these five datasets with busco_ancestral from the BUSCO hemiptera_odb10 dataset and generated the custom database – in amino acid – for the next step.
In the second step, we assembled the single-copy genes (orthologs) presented in the custom database using the metagenome of each species. We first cleaned Illumina paired-end reads in each metagenomic dataset by adapter trimming and quality control using trim_galore v0.6.4 (settings: --length 80 -q 30; https://github.com/FelixKrueger/TrimGalore). The quality of clean reads was confirmed by FastQC v0.11.9 (https://github.com/s-andrews/FastQC). Then, HybPiper v2.1.2 was used to assemble single-copy genes in each clean metagenomic dataset (Johnson et al., 2016). HybPiper first performed target enrichment by blasting and mapping Illumina reads against the custom database using DIAMOND v2.0.15 (setting: -t_aa custom_db --diamond --diamond_sensitivity sensitive) (Buchfink et al., 2021). The clustered reads for each single-copy gene were then assembled with SPAdes v3.15.5 implemented in Hybpiper (Prjibelski et al., 2020). Due to the low coverage of the host genome in our metagenomic dataset, we reduced the coverage cutoff for SPAdes assembly to four (setting: --cov_cutoff 4) to increase the contig length without losing much base-level accuracy. This allowed us to assemble all 2510 single-copy genes presented in the custom database from at least some samples.
In the final step, we implemented several rounds of filtering to obtain clean alignments for each ortholog. We first ran hybpiper paralog_retriever to detect and remove potential paralogs. This reduced the dataset to 2313 genes. Then, we ran Fast Statistical Alignment (FSA) with fsa v1.15.9 to generate alignments of amino acids for each gene including the target species and two outgroups (H. vitripennis and C. versicolor) (Table S1; Bradley et al., 2009); FSA allows more reliable identification of non-homologous sequence, which improves overall alignment quality in our case because our data contained some reads of other non-target eukaryotic sources (e.g., symbiotic fungi and insects other than planthoppers, such as putative parasitoids) in our metagenomic data when using a low coverage cutoff for spades assembly in the previous step. Next, trimAl v1.4 was used to trim the amino acid alignments (setting: -automated1) (Capella-Gutierrez et al., 2009). The nucleotide alignments were generated by reverse translating the trimmed amino acid alignments based on the column numbers output by trimAl v1.4 (setting: -colnumbering). Then, we ran other custom Python scripts to filter trimmed alignments by keeping genes that 1) are longer than 200 amino acids and 2) were reconstructed for at least 75% of all target species (i.e. 112 out of 149 species in our case) with sequence length above 25% of the gene length. This further reduced the dataset to 1508 genes. Finally, we ran FastTree v2.1.11 to generate a Maximum Likelihood tree for each gene (setting: -nt -gtr) (Price et al., 2010). We then manually examined each gene tree by looking for unusual topologies (e.g., long branches of a single, few, or a clade of species); long branches of a single or few species can result from short sequence length or contamination while long branches of a clade of species may indicate potential paralogs not detected by Hybpiper. The alignments of gene trees with unusual topologies were further examined manually. The problematic alignments were either cleaned by manually cutting out the troublesome regions or sequences or excluded from the subsequent analyses. At this stage, we also decided to exclude two species, Stenocranus major and Parahydriena hyalina acuta, which were placed repeatedly with outgroups due to significant metagenome contamination from the DNA of putative parasitoid insects. In the end, we reached a final list of 1164 single-copy orthologs from 149 species.
Mitogenome assembly, annotation, and mitochondrial marker extraction
The clean Illumina reads obtained in previous steps after adaptor trimming and quality control were assembled using MEGAHIT v1.1.3 with k-mer sizes from 99 to 255 (Li et al., 2016). We then used NanoTax.py (https://github.com/junchen-deng/NanoTax), which performs blast searches using assembled contigs against a custom database containing previously published mitogenomes, to identify mitochondrial contigs. Complete or partial mitogenomes of 149 out of 151 planthopper species were assembled from metagenomes. The mitogenomes of S. major and P. hyalina acuta from contaminated metagenomes were successfully assembled and verified while the mitogenomes of two other species, Bebaiotes sp. and Omolicna uhleri, failed to assemble (Table S1). The identified mitochondrial contigs were then annotated with a custom Python script modified from (Łukasik et al., 2019). The script first extracted all the Open Reading Frames (ORFs) and their amino acid sequences from a mitogenome. Then, the ORFs were searched recursively using HMMER v3.3.1 (Eddy, 2011), through custom databases containing manually curated sets of planthopper mitochondrial protein-coding and rRNA genes. The rRNA genes were searched with nhmmer implemented in HMMER v3.3.1 (Wheeler and Eddy, 2013), and tRNAs were identified with tRNAscan-SE v2.0.7 (Chan et al., 2021). All 13 mitochondrial protein-coding genes (PCGs) were annotated and extracted for each species. Additionally, we extracted mitochondrial markers from previously published mitogenomes of 134 planthopper species on GenBank (published before February 2024; Table S1), mostly from Eastern Palearctic and Oriental taxa not well-represented in the primary dataset. Mitogenomes of four outgroups from Cicadomorpha, including the keeled treehopper Entylia carinata (superfamily Membracoidea), the cicada Tettigades undata (superfamily Cicadoidea), the meadow spittlebug Philaenus spumarius (superfamily Cercopoidea), and the leafhopper Macrosteles quadrimaculatus (superfamily Membracoidea), were included in the annotation (Table S1). The amino acid alignment of each mitochondrial marker was inferred with Mafft v7.475 (settings: --localpair --maxiterate 1000; Katoh and Standley, 2013). The nucleotide alignments were generated by reverse translation of amino acid alignments as described in the previous steps.
Phylogenetic inference
We prepared four concatenated datasets, with both amino acid and nucleotide alignments included in each dataset, for the phylogenetic analyses. These were 1) 13 mitochondrial PCGs (length: 6.7 kbp) from 149 species sequenced in this study; and 2) the same mitochondrial genes from 282 species including mitogenomes downloaded from GenBank; 3) the 1164 nuclear markers (length: 1.13 Mbp) from 149 species sequenced in this study; and 4) the combined dataset of 13 mitochondrial and 1164 nuclear markers (length: 1.14 Mbp) from 285 species. We reasoned that the two mitochondrial datasets would allow us to assess how the increasing sampling improves the phylogeny and make a more direct comparison with published phylogenies that were mostly based on mitochondrial markers.
To evaluate whether our datasets have evolved under globally stationary, reversible, and homogeneous (SRH) conditions, we applied matched-pairs tests of homogeneity (Bowker’s test) implemented in SymTest v2.0.55 (https://github.com/ottmi/symtest) to test for potential compositional heterogeneity (Jermiin et al., 2004; Misof et al., 2014). Tests were applied on 1) the amino acid dataset, 2) the nucleotide dataset with all codon positions, and 3) the nucleotide dataset including first and second codon positions only. The results indicate that the amino acid dataset exhibited much less compositional heterogeneity compared to the nucleotide datasets, of which the dataset containing only the first and second codon positions showed less among-lineage heterogeneity than the one including all codon positions (Fig. S1). Additionally, the third codon position possibly suffers strong substitution saturation after >200 million years of evolution (Johnson et al., 2018). Taking together, we used the amino acid dataset and the nucleotide dataset including only the first and second codon positions for the subsequent phylogenetic tree inference.
All datasets were partitioned by genes. All phylogenies were inferred with the Maximum Likelihood (ML) approach in IQ-TREE2 v2.2.0.3 (Minh et al., 2020). For nucleotide datasets, all available DNA models were tested in IQ-TREE2. For amino acid datasets, selected protein models Q.insect, WAG, LG, and JTT (setting: -mset Q.insect, WAG, LG, JTT) were tested on nuclear datasets and the model mtInv (setting: -mset mtInv) was tested on mitochondrial datasets for computational efficiency. To decide on the partitioning scheme and the substitution model, we asked IQ-TREE2 to perform extended model selection on each gene with free rate heterogeneity followed by tree inference and subsequently merge two or more genes until the model fit does not increase any further (setting: -m MFP+MERGE; Chernomor et al., 2016; Kalyaanamoorthy et al., 2017). The best partitioning scheme and the best-fit models were chosen by the highest BIC (Bayesian Information Criterion) scores. Bootstrapping was conducted using the approximate likelihood ratio test (SH-aLRT) and ultrafast bootstrap methods with 1000 replicates as well as the approximate Bayes test (setting: -B 1000 --alrt 1000 --abayes; Anisimova et al., 2011; Anisimova and Gascuel, 2006; Hoang and Chernomor, 2017). All other setting options were kept default.
To account for the differences in the evolutionary history of genes, we inferred the species tree from gene trees using the Accurate Species Tree ALgorithm (ASTRAL-III) (Zhang, 2018). Specifically, we used Weighted ASTRAL (wASTRAL v1.15.2.3) to take into account phylogenetic uncertainty by integrating signals from branch length and branch support in gene trees (Zhang, 2022). We used the nucleotide alignment of each gene without the third codon position as the input to IQ-TREE2, which inferred each gene tree with 1000 replicates of ultrafast bootstrapping. Then, the combined tree file was analyzed in wASTRAL with 16 rounds of search and subsampling (setting: -R).
Divergence time estimation with fossils
We used the nuclear dataset comprising the first and second codon positions for the molecular dating analyses. From all 1164 nuclear markers, we filtered out genes with less than 143 out of 149 (i.e. <95%) species present. This results in 57 markers. Then, we concatenated these alignments into one dataset. In addition, we made five gene subsets out of the 57 genes, each of which contains 11 to 12 randomly assigned markers. Each subset was concatenated into a separate dataset. We then decided on the best partitioning scheme and substitution model for each dataset in IQ-TREE2 (setting: -m MF+MERGE).
For node calibration, we assigned 30 fossils as calibration points throughout the tree (Table S2). The taxonomic placement of each fossil was determined based on the apomorphy of the corresponding taxon. Fossil taxa were selected based on their verified taxonomic and chronostratigraphic placement, to cover as low classification levels as possible, especially for the lineages present in molecular analyses. The fossil age was used as the minimum constraint of the node. Maximum constraints were chosen based on the phylogenetic placement, estimated dates from previously published studies, and the fossil history of certain clades. For the root, the oldest fossils assigned to the extant lineages of Fulgoromorpha are Barremixius petrinus and Karebopodoides aptianus of family Cixiidae, dated to the Lower Cretaceous, Barremian (125.77 - 121.4 Ma; Luo et al., 2021; Maksoud et al., 2017). Thus, we placed the minimum constraint of the root at the calibrated age of the two fossils (125 Ma). For the maximum constraint of the root, we referred to the extinct superfamilies Coleoscytoidea, Surijokocixioidea, and Fulgoridioidea, for which the oldest fossils are known from the Permian, Roadian (273.15 - 266.5 Ma), the Permian, Wordian (267.3 - 264.12 Ma), and the Lower Jurassic, Hettangian (201.6 - 199.2 Ma), respectively (Bucher et al., 2024; Szwedo, 2018; Szwedo et al., 2004). It is generally accepted that Coleoscytoidea is a side group, distantly related to the modern crown Fulgoromorpha, while Surijokocixioidea and Fulgoridioidea are considered as stem groups to the modern crown group (Bourgoin and Szwedo, 2023, 2022; Szwedo, 2018). However, whether Surijokocixioidea or Fulgoridioidea is more closely related to the crown group is still under debate (Bourgoin and Szwedo, 2023; Brysz and Szwedo, 2019; Szwedo, 2017). We chose the maximum constraint of the root at the base of Wordian (267.3 Ma) where the oldest fossil of Surijokocixioidea resides. Johnson et al. (2018) estimated the root age of crown Fulgoromorpha at ~206 Ma with a 95% confidence interval (hereafter: CI) ranging from 178 to 241 Ma. Thus, our constraints can be considered relatively conservative (Table S2).
Regarding the maximum constraint for fossils in different modern families, the oldest fossils came from the Lower Cretaceous, Barremian (125.77 - 121.4 Ma) as mentioned above and all fossils from earlier periods, such as Jurassic (201.6 - 145 Ma), belong to the extinct, more ancient stem groups such as Fulgoridioidea. Thus, we placed the maximum constraint at the base of Jurassic (201.6 Ma) for fossils in families Cixiidae, Delphacidae, Kinnaridae, Achilidae, Derbidae, Fulgoridae, and Dictyopharidae. As for other relatively recently diverged families, since there are no clear patterns in fossil distribution, we referred to their host plants. Flowering plants, which are the main food source of all modern crown group planthoppers, first appeared in the fossil record in the early Cretaceous and began to diversify in the mid to later Cretaceous (Zuntini et al., 2024). It is very likely that the diversification of flowering plants drove the diversification of the derived planthopper families. Therefore, we placed the maximum constraint at the base of the Cretaceous (145 Ma) for families Tropiduchidae, Lophopidae, Nogodinidae, Issidae, Caliscelidae, and Ricaniidae (Table S2). The ages of all geologic periods were derived from the ICS International Chronostratigraphic Chart v2023/09 (http://www.stratigraphy.org/ICSchart/ChronostratChart2023-09.pdf).
We also applied a different set of maximum constraints based completely on previously inferred time trees. We referred to the work done by Johnson et al. (2018) where they produced a dated phylogeny for major Hemipteran clades including nine planthopper families. We chose the upper 95% confidence interval of the estimated age of the node that goes back one node deeper in the phylogeny as the maximum constraint for the target family. For example, the split between Acanaloniidae and Flatidae was dated to 76 Ma (95% CI: 58 - 96 Ma), and the split between the Acanaloniidae-Flatidae clade and Caliscelidae was dated to 90 Ma (95% CI: 70 - 109 Ma). Then, we chose 109 Ma as the maximum constraint for fossils in Flatidae and Acanaloniidae. Following this rule, we placed the maximum constraint at 241 Ma for fossils of Cixiidae, Delphacidae, Kinnaridae, Achilidae, Derbidae, Fulgoridae, and Dictyopharidae, 144 Ma for Tropiduchidae, and 123 Ma for Lophopidae, Nogodinidae, Issidae, Caliscelidae, and Ricaniidae. The maximum constraint of the root was set to 349 Ma as the upper 95% confidence interval of the estimated age of auchenorrhynchans (309 Ma, 95% CI: 275 - 349 Ma) (Table S2). The alternative constraints are more conservative for the root and relatively old families and less conservative for the derived families than our constraints based on fossil history and host plants.
The molecular dating analyses were performed in BEAST v2.7.6 (Bouckaert and Vaughan, 2019). We placed monophyly constraints on the nodes where both SH-aLRT and ultrafast bootstrap support were >95. For each dataset, the partition and its site model were set according to the best scheme found by IQ-TREE2 and were estimated later in BEAST. The rates of the branches were modeled under the optimized relaxed clock (Douglas et al., 2021). The uniform prior was applied to the root. For calibration points, we applied three different priors: uniform prior and log-normal prior with two different standard deviations (S = 1.0 or 0.5). For log-normal prior, we gave an offset equal to the minimum age and an adjusted mean (M) placing the maximum age at the 97.5% percentile. Other priors were kept default. Each dataset was run three times independently on each prior setting with a Markov Chain Monte Carlo (MCMC) chain length of at least 50 million to ensure good mixing. In addition, a run without sequence data (‘sample from prior’ in BEAST) was also performed for each dataset to examine the effective priors, compared to the specified priors. MCMC results were accessed with Tracer v1.7.2 (Rambaut et al., 2018). The verified log and tree files of the three independent runs using the same dataset were combined using LogCombiner v2.7.4 with 10% burnin. The maximum clade credibility tree of each dataset with mean node heights was summarised from the combined tree set by TreeAnnotator v2.7.4. All trees were visualized in FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/). The node ages were extracted using the treeio R package (Wang et al., 2019). The summary of node ages was illustrated using Processing 3 (http://www.processing.org). All figures were edited in Inkscape (https://inkscape.org).