Data from: Non-adaptive radiation promotes phenotypic diversification and convergent evolution of aposematic mimicry in a highly diverse genus of Megaloptera
Abstract
Evolutionary radiations are considered key processes underlying the origin of biodiversity. Notably, the mechanisms driving these radiations can vary across organisms and often involve a complex interplay of abiotic and biotic factors. Empirical studies on evolutionary history are crucial for validation of multiple hypotheses regarding the mode of evolutionary radiations. Within the obligatory aquatic insect order Megaloptera, the genus Protohermes is the most speciose clade with 90 described species, accounting for around 22% of the total ordinal diversity. Protohermes species are characterized by limited dispersal ability, primarily occurring across the Oriental region, and a range of diversified phenotypes, e.g., highly divergent genital characters, and mimetic coloration alongside shifts in biological rhythm from nocturnal to diurnal activity. Here, we infer the spatiotemporal mode of diversification and associated driving factors of the Protohermes radiation as a test case for exploring the processes and potential mechanisms of evolutionary radiations. We present the first time-calibrated phylogeny of Protohermes using genome-scale data of ultraconserved elements (UCEs) and mitochondrial genes with a comprehensive taxon sampling. Our results reveal a mid-Cretaceous stem age of Protohermes, followed by a recent and steady diversification during the Neogene. Estimation of historical biogeography suggests the genus likely originated from a broad range including the Himalaya-Hengduan Mountains + Indochina + Borneo, with the first two areas serving as the center of early diversification. Our results further suggest that vicariance events, likely attributed to the Cenozoic Himalayan orogeny as well as climate change in East Asia, triggered speciation that coincided with the accumulation of genital divergence.
Further enhancement of genital and phenotypic diversification may have been promoted by secondary contacts of allopatric or parapatric lineages following the build-up of species richness, likely facilitating species coexistence and lineage accumulation. We argue that the current species diversity of Protohermes likely resulted from a non-adaptive radiation. Our results highlight the role of geographic vicariance and sexual selection in driving the species and phenotypic diversification in insects.
Authors
- Data Author: Yuezheng Tu
- Contact: tuyuezheng@126.com
- Article Authors:
Tu Y.Z., Li X.K., Hayashi F., Zhang F., Yang D., Condamine F.L., Liu X.Y.
Date Created
2025-04-22
Data Directory Structure
Data/
├── 1.1-assemblies/ # de novo assembled genomes (.fa) used in the present study
├── 1.2-UCE-probe/
│   └── Protohermes-v1-master-probe-list-DUPE-SCREENED.fasta # lineage-specific UCE probe set designed for the present study
├── 1.3-matrix_alignments/
│   ├── 1.3.1-mtPRO/ # alignments of mitochondrial protein-coding genes in the mtPRO matrix
│   ├── 1.3.2-uceCORY90/ # alignments of UCE loci in the uceCORY90 matrix
│   ├── 1.3.3-ucePRO70/ # alignments of UCE loci in the ucePRO70 matrix
│   └── 1.3.4-ucePRO90/ # alignments of UCE loci in the ucePRO90 matrix
├── 2.1-phylogeny_IQTree/ # concatenated alignments (.fas) and partitioning schemes (.txt) used for phylogenetic reconstruction in IQ-TREE 2 and their results (.treefile)
│   ├── 2.1.1-Corydalinae/
│   │   ├── uceCORY90_FcC_supermatrix.fas # concatenated matrix for the uceCORY90 dataset
│   │   ├── uceCORY90_partition_by_locus.txt # partitioning scheme (by UCE locus) for maximum likelihood (ML) phylogenetic inference
│   │   ├── uceCORY90_partition_by_locus.treefile # result from IQ-TREE 2 (dataset partitioned by locus)
│   │   ├── uceCORY90_partition_by_swsc-en.txt # partitioning scheme (by SWSC-EN) for ML phylogenetic inference
│   │   └── uceCORY90_partition_by_swsc-en.treefile # result from IQ-TREE 2 (dataset partitioned by SWSC-EN)
│   └── 2.1.2-Protohermes/
│       ├── 2.1.2.1-ucePRO70/
│       │   ├── ucePRO70_FcC_supermatrix.fas # concatenated matrix for the ucePRO70 dataset
│       │   ├── ucePRO70_partition_by_locus.txt # partitioning scheme (by UCE locus) for ML phylogenetic inference
│       │   └── ucePRO70_partition_by_locus.treefile # result from IQ-TREE 2 (dataset partitioned by locus)
│       ├── 2.1.2.2-ucePRO90/
│       │   ├── ucePRO90_FcC_supermatrix.fas # concatenated matrix for the ucePRO90 dataset
│       │   ├── ucePRO90_partition_by_locus.txt # partitioning scheme (by UCE locus) for ML phylogenetic inference
│       │   ├── ucePRO90_partition_by_locus.treefile # result from IQ-TREE 2 (dataset partitioned by locus)
│       │   ├── ucePRO90_partition_by_swsc-en.txt # partitioning scheme (by SWSC-EN) for ML phylogenetic inference
│       │   └── ucePRO90_partition_by_swsc-en.treefile # result from IQ-TREE 2 (dataset partitioned by SWSC-EN)
│       ├── 2.1.2.3-ucePRO90+coi/
│       │   ├── ucePRO90swsc+coi_supermatrix.fas # concatenated matrix for the ucePRO90+coi dataset, including UCE loci and the mitochondrial COI gene
│       │   ├── ucePRO90swsc-en+coi_supermatrix_partition.txt # partitioning scheme (UCE loci by SWSC-EN) for ML phylogenetic inference
│       │   └── ucePRO90swsc-en+coi_supermatrix_partition.treefile # result from IQ-TREE 2
│       └── 2.1.2.4-mtPRO/
│           ├── mtPRO_FcC_supermatrix.fas # concatenated matrix for the mtPRO dataset (including 13 protein-coding mitochondrial genes)
│           ├── mtPRO_partition.txt # partitioning scheme (by gene locus) for ML phylogenetic inference
│           └── mtPRO_partition.treefile # result from IQ-TREE 2
├── 2.2-phylogeny_ASTRAL/ # gene trees used for phylogenetic reconstruction in ASTER
│   ├── uceCORY90all.gene.tre # gene trees of the uceCORY90 matrix
│   ├── ucePRO70all.gene.tre # gene trees of the ucePRO70 matrix
│   └── ucePRO90all.gene.tre # gene trees of the ucePRO90 matrix
├── 3-BEAST_divergence_time/
│   ├── 3.1-Corydalinae/
│   │   ├── 3.1.1-SortaDate-PartitionFinder/
│   │   │   ├── gene_selected.txt # list of the 50 UCE loci selected by SortaDate for the dating analyses of Corydalinae
│   │   │   └── best_scheme.txt # partitioning scheme and best-fitting models for the dataset estimated by PartitionFinder 2.1.1
│   │   ├── 3.1.2-path_sampling/ # input files (.xml) for the path sampling (PS) analyses for different branching process priors, and the log files (.log) from the analyses
│   │   │   ├── 3.1.2.1-birth-death/
│   │   │   │   ├── PS_cory1120pt_bd.xml # input file for PS analysis (birth-death model)
│   │   │   │   └── path_sampling.log # output from the PS analysis
│   │   │   └── 3.1.2.2-yule/
│   │   │       ├── PS_cory1120pt_yule.xml # input file for PS analysis (Yule model)
│   │   │       └── path_sampling.log # output from the PS analysis
│   │   └── 3.1.3-divergence_time_bd/
│   │       ├── cory1120pt_bd.xml # input file for the divergence time estimation of Corydalinae using the birth-death model as branching process prior in BEAST 2.6.1
│   │       └── cory1120pt_bd.tre # maximum clade credibility (MCC) tree summarized from the dating results
│   └── 3.2-Protohermes/
│       ├── 3.2.1-SortaDate-PartitionFinder/
│       │   ├── gene_selected.txt # list of the 50 UCE loci selected by SortaDate for the dating analyses of Protohermes
│       │   └── best_scheme.txt # partitioning scheme and best-fitting models for the dataset estimated by PartitionFinder 2.1.1
│       ├── 3.2.2-path_sampling/ # input files (.xml) for the path sampling (PS) analyses for different branching process priors, and the log files (.log) from the analyses
│       │   ├── 3.2.2.1-birth-death/
│       │   │   ├── PS_pro1121pt_bd.xml # input file for PS analysis (birth-death model)
│       │   │   └── path_sampling.log # output from the PS analysis
│       │   └── 3.2.2.2-yule/
│       │       ├── PS_pro1121pt_yule.xml # input file for PS analysis (Yule model)
│       │       └── path_sampling.log # output from the PS analysis
│       └── 3.2.3-divergence_time_bd/
│           ├── pro1121pt_bd.xml # input file for the divergence time estimation of Protohermes using the birth-death model as branching process prior in BEAST 2.6.1
│           ├── pro1125pt_bd.tre # maximum clade credibility (MCC) tree summarized from the dating results
│           └── pro.tre # MCC tree of Protohermes with outgroups removed; this time-calibrated tree was used in all the following analyses
├── 4.1-BioGeoBEARS/
│   ├── PRO_geog2.data # geography data used for biogeographic analyses in BioGeoBEARS
│   └── PBSM/ # rates estimated from BSM simulations
├── 4.2-Phenotypic_evolution/
│   ├── pro_coloration.csv # coloration data used for ancestral state estimation (ASE) in phytools
│   └── phenotype_pPCA/
│       ├── phenotypic_data.xlsx # original data of body markings (mean values for each taxon)
│       ├── pro_pheno.csv # phenotypic data used for the phylogenetic principal components analysis (pPCA) in phytools
│       └── pro_ppc.csv # results from pPCA
├── 4.3-Genital_evolution/
│   ├── 4.3.1-Data/
│   │   ├── Trait_description.docx # description of the coded genital characters
│   │   ├── Pro_genitalia_matrix.nex # matrix of genital characters
│   │   └── character_dependency.txt # character dependency matrix for the PCoA analysis
│   └── 4.3.2-PCoA_results/
│       ├── Pro_genitalia_m_distance.csv # MORD distance matrix of genital characters
│       └── Pro_genitalia_pcov.csv # output from PCoA analysis
├── 4.4-Eco_evolution/
│   ├── 4.4.1-Data/
│   │   └── Pro_mean_env.csv # mean values of ecological traits, represented by environmental variables obtained from the WorldClim database and Hansen et al. (2013) using occurrence records of Protohermes species; details of the variables are listed in Supplementary Appendix S1
│   └── 4.4.2-pPCA_output/
│       └── pro_epc.csv # results from pPCA
├── 4.5-HiSSE/
│   ├── pro_coloration.csv # binary coloration data coded for the HiSSE analyses (coding strategy described in the "material and methods" section of the main text)
│   └── pro_marking.csv # binary wing-marking data used in the HiSSE analyses (detailed coding strategy described in the "material and methods" section of the main text; 1 = wing markings present, 0 = wing markings absent)
├── 4.6-GeoHiSSE/ # biogeographic data coded for the GeoHiSSE analyses (detailed coding strategy described in the "material and methods" section of the main text; 1 = endemic to the targeted area, 2 = absent from the targeted area, 0 = distributed in both the targeted area and other areas)
│   ├── pro_geo_Borneo.csv # geographic data (targeted area: Borneo)
│   ├── pro_geo_Himalayas_HDM.csv # geographic data (targeted area: Himalayas_HDM)
│   ├── pro_geo_Indochina.csv # geographic data (targeted area: Indochina)
│   └── pro_geo_SChina.csv # geographic data (targeted area: S. China)
├── 4.7-ClaDS/
│   └── ClaDS_commands.txt # commands used for the ClaDS analysis in the Julia package PANDA.jl
├── 4.8-RevBayes/
│   ├── pro_EBD.Rev # script used for time-dependent diversification analysis in RevBayes
│   ├── mcmc_EBD_HSMRF_temp.Rev # script used for paleotemperature-dependent diversification analysis in RevBayes
│   └── mcmc_EBD_HSMRF_monsoon.Rev # script used for paleomonsoon-dependent diversification analysis in RevBayes
└── 4.9-piecewiseSEM/
    ├── Grids_selected_data_sd2.csv # input data for piecewiseSEM analysis (methods used to process raw data are described in Supplementary Appendix S1)
    ├── piecewiseSEM.R # script for piecewiseSEM analysis
    └── SEM_result.txt # results from piecewiseSEM analysis
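
The state codes documented for 4.6-GeoHiSSE/ can be translated into readable labels with a few lines of Python. This is a minimal sketch and not part of the deposited scripts: the two-column layout (taxon, state) without a header row, the function name `decode_geohisse`, and the sample rows are assumptions made for illustration, so check the actual CSV files before reusing it.

```python
import csv
import io

# State codes as documented for 4.6-GeoHiSSE/ in this README.
GEOHISSE_STATES = {
    "1": "endemic to the targeted area",
    "2": "absent from the targeted area",
    "0": "in the targeted area and other areas",
}


def decode_geohisse(handle):
    """Map each taxon to a readable range category.

    Assumes two columns per row (taxon, state) and no header row;
    this layout is hypothetical and may differ from the deposited files.
    """
    decoded = {}
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        taxon, state = row[0].strip(), row[1].strip()
        if state not in GEOHISSE_STATES:
            raise ValueError(f"unexpected state {state!r} for {taxon}")
        decoded[taxon] = GEOHISSE_STATES[state]
    return decoded


# Hypothetical rows; the taxon labels are placeholders, not real data.
sample = io.StringIO("taxon_A,1\ntaxon_B,2\ntaxon_C,0\n")
result = decode_geohisse(sample)
for taxon, label in result.items():
    print(f"{taxon}: {label}")
```

To apply the sketch to a deposited file, open it with `open(path, newline="")` and pass the handle to `decode_geohisse`. If the file has a header row, the function will raise a `ValueError`, so skip that row first.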
- Tu, Yuezheng; Li, Xuankun; Hayashi, Fumio et al. (2025). Non-adaptive Radiation Promotes Phenotypic Diversification and Convergent Evolution of Aposematic Mimicry in a Highly Diverse Genus of Megaloptera. Systematic Biology. https://doi.org/10.1093/sysbio/syaf030
