Phylogenomics establishes an Early Miocene reconstruction of reef vertebrate diversity
Data files
Mar 21, 2025 version files (8.09 GB total)
- Astral_Data_S1.zip (5.85 MB)
- BAMM_Data_S4.zip (61.52 MB)
- Bayou_Data_S10.zip (334.11 KB)
- BioGeoBears_Data_S6.zip (165.70 MB)
- Categorical_Traits_Data_S7.zip (42.02 MB)
- Continuous_Traits_Data_S9.zip (10.13 MB)
- HiSSE_Data_S8.zip (554.47 KB)
- IQ-TREE_Data_S2.zip (19.45 MB)
- PBDB_data_Data_S11.zip (620.17 KB)
- README.md (10.60 KB)
- TESS_Data_S5.zip (1.50 MB)
- Time_Calibration_Data_S3.tgz (7.79 GB)
Abstract
Oceans blanket more than two-thirds of Earth’s surface, yet marine biodiversity is disproportionately concentrated in near-shore habitats, especially coral reefs. Investigating the origins of the exceptional species diversity found on coral reefs is crucial for predicting how these ecosystems will respond to anthropogenic disturbances. Here, we use a genome-scale dataset to reconstruct the evolutionary history of the wrasses and parrotfishes (Labridae), which rank among the most species-rich and ecologically diverse lineages of reef fishes. We show that all major labrid clades experienced concurrent pulses of evolutionary innovation and rapid lineage diversification over a period of fewer than five million years during the Early Miocene. Analyses of historical biogeography, character evolution, and phenotypic diversification rates demonstrate that no single phenotypic innovation can explain this period of accelerated diversification in wrasses. Instead, modern labrid diversity is a mosaic of multiple concurrent adaptive and non-adaptive radiations that diversified ~20-15 million years ago. These results draw parallels to the evolutionary histories of many animal groups after the Cretaceous-Paleogene mass extinction, suggesting a marked change in marine biodiversity associated with the Miocene emergence of novel ecological opportunities that wrasses exploited. Our results corroborate recently reported fossil evidence for an Early Miocene extinction event in oceanic vertebrates that we tie to changes in coral reef faunal composition and ubiquity and support this period as a crucial time in the assembly of present-day oceanic ecosystems.
https://doi.org/10.5061/dryad.f7m0cfz5x
Description of the data and file structure
Molecular data were collected by sequencing ultraconserved elements (UCEs). Morphological data were collected from the published literature.
Files and variables
File: Astral_Data_S1.zip
Description: ASTRAL directory with output ASTRAL multispecies coalescent tree.
ASTRAL_TREE.tree #output ASTRAL-III species tree
ASTRAL_TREE.tree.pdf #PDF plot of the ASTRAL-III species tree
labrid.trees #input gene trees
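A quick sanity check on a gene-tree file is to count the trees and see how many of them contain each taxon. The following is a minimal sketch, assuming the file holds plain Newick strings terminated by `;` with unquoted tip labels and no bracketed annotations; the function names and the toy input are illustrative and not part of the deposited code.

```python
import re
from collections import Counter

# Matches a tip label: a token that directly follows "(" or "," in Newick.
# Internal node labels and support values follow ")" and are not matched.
TIP_PATTERN = re.compile(r"(?<=[(,])\s*([^(),:;\s]+)")


def split_newick(text):
    """Return one Newick string per tree, assuming each tree ends with ';'."""
    return [chunk.strip() + ";" for chunk in text.split(";") if chunk.strip()]


def tip_labels(newick):
    """Return the tip labels of one tree (unquoted labels only)."""
    return TIP_PATTERN.findall(newick)


def taxon_occupancy(text):
    """Count how many trees each tip label appears in."""
    counts = Counter()
    for tree in split_newick(text):
        counts.update(set(tip_labels(tree)))
    return counts


# Toy input standing in for the contents of a gene-tree file (hypothetical taxa).
sample = "((A:1,B:1)0.9:1,C:2);\n((A:1,C:1):1,D:2);\n"
occupancy = taxon_occupancy(sample)
print(len(split_newick(sample)), dict(occupancy))
```

To apply the same check to the deposited gene trees, the text of labrid.trees can be read from disk and passed to `taxon_occupancy` in place of `sample`.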
File: IQ-TREE_Data_S2.zip
Description: Gene trees, single and multiple partition concatenated maximum likelihood trees, and concordance factor output from analyses in IQ-TREE2.
iqtree_concat #concatenated single partition run
-labrid_cialign_concat_19_12_23.bionj #bionj output
-labrid_cialign_concat_19_12_23.ckp.gz #ckp output
-labrid_cialign_concat_19_12_23.contree #consensus tree topology
-labrid_cialign_concat_19_12_23.iqtree #iqtree file output
-labrid_cialign_concat_19_12_23.log #log file
-labrid_cialign_concat_19_12_23.mldist #maximum likelihood pairwise distances between sequences
-labrid_cialign_concat_19_12_23.model.gz #model fit output
-labrid_cialign_concat_19_12_23.splits.nex #splits file
-labrid_cialign_concat_19_12_23.treefile #maximum likelihood concatenated single partition tree
iqtree_concordance_factors #concordance factors run
-concordance_data.csv #csv file containing concordance factors and branch lengths for easy reading into R
-concordance_factors_11_1_2023.cf.branch #branch file output
-concordance_factors_11_1_2023.cf.stat #statistics, including concordance factors
-concordance_factors_11_1_2023.cf.tree #concordance factors annotated on the tree
-concordance_factors_11_1_2023.cf.tree.nex #concordance factors annotated on the tree, nexus format
-concordance_factors_11_1_2023.log #log file
Partitioned_tree #concatenated multiple partition run
-IQTree_shrunk_partitioned_7Nov2023.tree #multiple partitions tree file
shrunk_iqtree_gene_trees #trees generated for each UCE locus, labeled by locus and in .tree format
File: Time_Calibration_Data_S3.tgz
Description: Input and output files from Bayesian tip and node dating analyses in BEAST2.
BEAST_target.tree #target tree file
Individuals_for_BEAST copy.xlsx #list of individual sequences used in the BEAST analyses
Output_GTR_node #GTR model node-dating output, including output log and tree files, combined posterior tree sets, and summary tree
Output_GTR_tip #GTR model tip-dating output, including output log and tree files, combined posterior tree sets, and summary tree
Output_HKY_node #HKY model node-dating output, including output log and tree files, combined posterior tree sets, and summary tree
Output_HKY_tip #HKY model tip-dating output, including output log and tree files, combined posterior tree sets, and summary tree
xml_GTR_node #GTR node-dating input XML files to be read into BEAST
xml_GTR_tip #GTR tip-dating input XML files to be read into BEAST
xml_HKY_node #HKY node-dating input XML files to be read into BEAST
xml_HKY_tip #HKY tip-dating input XML files to be read into BEAST
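Because this archive is large (7.79 GB), it may be convenient to inspect its contents before extracting everything. The sketch below totals file sizes per top-level entry using Python's standard `tarfile` module. It builds a tiny stand-in archive so that it can run without the real download; the helper names and demo file names are hypothetical, and the actual internal layout of the archive is not assumed.

```python
import io
import os
import tarfile
import tempfile
from collections import defaultdict
from pathlib import PurePosixPath


def summarize_archive(path):
    """Return {top-level entry: total bytes of regular files beneath it}."""
    totals = defaultdict(int)
    with tarfile.open(path, "r:*") as archive:  # "r:*" auto-detects compression
        for member in archive:
            if not member.isfile():
                continue
            parts = [p for p in PurePosixPath(member.name).parts if p != "."]
            if parts:
                totals[parts[0]] += member.size
    return dict(totals)


def write_demo_archive(path):
    """Build a small stand-in archive (hypothetical contents) for the demo."""
    demo_files = [
        ("xml_GTR_node/run1.xml", b"<beast/>"),
        ("Output_GTR_node/run1.log", b"state\n"),
    ]
    with tarfile.open(path, "w:gz") as archive:
        for name, payload in demo_files:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.tgz")
    write_demo_archive(demo_path)
    summary = summarize_archive(demo_path)
print(summary)
```

Pointing `summarize_archive` at the downloaded Time_Calibration_Data_S3.tgz should give a per-directory size overview. If the archive wraps everything in a single parent folder, all files will be reported under that one entry.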
File: BAMM_Data_S4.zip
Description: BAMM analysis directory outputs.
BAMM_Folder_GTR #containing BAMM input + output for node-dated phylogeny using the GTR model
-chain_swap.txt #chain swap output file
-control_file.txt #input control file
-event_data.txt #event data output file
-Labrid_Ultra_Min.tree #input ultrametric tree
-mcmc_out.txt #output MCMC files
-myPriors.txt #Prior block generated using BAMMTools
-run_info.txt #run info
BAMM_Folder_HKY #containing BAMM input + output for node-dated phylogeny using the HKY model
-chain_swap.txt #chain swap output file
-control_file.txt #input control file
-event_data.txt #event data output file
-Labrid_Ultra_Min.tree #input ultrametric tree
-mcmc_out.txt #output MCMC files
-myPriors.txt #Prior block generated using BAMMTools
-run_info.txt #run info
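The MCMC output can be summarized without BAMMtools to get a first look at the posterior distribution of the number of rate shifts. This is a minimal sketch, assuming mcmc_out.txt follows the usual BAMM layout (a comma-separated file with an `N_shifts` column) and discarding the first 10% of samples as burn-in; both are assumptions that should be checked against the deposited files and control_file.txt. The toy values below are made up for illustration.

```python
import csv
import io
from collections import Counter


def shift_posterior(csv_text, burnin_fraction=0.1):
    """Return {number of shifts: posterior frequency} after discarding burn-in."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    kept = rows[int(len(rows) * burnin_fraction):]
    if not kept:
        raise ValueError("no samples left after burn-in")
    counts = Counter(int(row["N_shifts"]) for row in kept)
    total = sum(counts.values())
    return {n: counts[n] / total for n in sorted(counts)}


# Toy MCMC trace (hypothetical values) in the assumed column layout.
shifts = [0, 1, 1, 2, 2, 2, 1, 1, 2, 2]
lines = ["generation,N_shifts,logPrior,logLik,eventRate,acceptRate"]
lines += [f"{i * 1000},{n},-10.0,-500.0,1.0,0.5" for i, n in enumerate(shifts)]
sample = "\n".join(lines)

posterior = shift_posterior(sample)
print(posterior)
```

For the deposited runs, the contents of mcmc_out.txt from either BAMM_Folder_GTR or BAMM_Folder_HKY would replace `sample`.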
File: TESS_Data_S5.zip
Description: TESS analysis directory outputs.
tess_analysis_GTR_one_shift #input and output for GTR node-dated phylogeny run with 1 prior shift
tess_analysis_GTR_three_shift #input and output for GTR node-dated phylogeny run with 3 prior shifts
tess_analysis_GTR_two_shift #input and output for GTR node-dated phylogeny run with 2 prior shifts
tess_analysis_HKY_one_shift #input and output for HKY node-dated phylogeny run with 1 prior shift
tess_analysis_HKY_three_Shifts #input and output for HKY node-dated phylogeny run with 3 prior shifts
tess_analysis_HKY_two_shifts #input and output for HKY node-dated phylogeny run with 2 prior shifts
File: BioGeoBears_Data_S6.zip
Description: Input and output files from ancestral region reconstructions using BioGeoBEARS in R.
BioGeoBears_GTR #run using GTR node-dated phylogeny
-Plots #PDF plots of historical biogeographic reconstructions made using all 6 models, including pie chart plots showing probabilities at nodes
-ana_subsetXYZ.csv #csv file containing anagenetic events from stochastic mapping
-BSM #directory containing other output files from stochastic mapping
-Labrid BAYAREALIKE #BAYAREALIKE model output
-model_comparisons #directory containing output tables with summary statistics for model fit comparisons
-Labrid BAYAREALIKE+j #BAYAREALIKE+j model output
-Labrid DIVALIKE+j #DIVALIKE+j model output
-Labrid DIVALIKE #DIVALIKE model output
-Labrid DEC+j #DEC+j model output
-Labrid_DEC #DEC model output
-labrid_node_date_minBL.newick #tree with minimum branch lengths >0 enforced
-labrid_node_date.newick #newick converted tree
-Labrid_geography_BSM.txt #geography binary file for biogeographic stochastic mapping
-Labrid_geography.txt #geography binary file for BioGeoBEARS runs
BioGeoBears_HKY #run using HKY node-dated phylogeny
-Plots #PDF plots of historical biogeographic reconstructions made using all 6 models, including pie chart plots showing probabilities at nodes
-ana_subsetXYZ.csv #csv file containing anagenetic events from stochastic mapping
-BSM #directory containing other output files from stochastic mapping
-Labrid BAYAREALIKE #BAYAREALIKE model output
-model_comparisons #directory containing output tables with summary statistics for model fit comparisons
-Labrid BAYAREALIKE+j #BAYAREALIKE+j model output
-Labrid DIVALIKE+j #DIVALIKE+j model output
-Labrid DIVALIKE #DIVALIKE model output
-Labrid DEC+j #DEC+j model output
-Labrid_DEC #DEC model output
-labrid_node_date_minBL.newick #tree with minimum branch lengths >0 enforced
-labrid_node_date.newick #newick converted tree
-Labrid_geography.txt #geography binary file for BioGeoBEARS runs
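The geography files can be read outside R to check which areas each species occupies. The sketch below assumes the standard BioGeoBEARS (LAGRANGE/PHYLIP-style) layout: a header line giving the number of taxa, the number of areas, and the area names in parentheses, followed by one line per species with a presence/absence string. The area letters and species names in the toy input are hypothetical, and the function name is illustrative.

```python
import re


def parse_geography(text):
    """Parse a presence/absence geography file into (areas, {species: set of areas})."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = re.match(r"\s*(\d+)\s+(\d+)\s*\((.*)\)\s*$", lines[0])
    if header is None:
        raise ValueError("unrecognized header line")
    n_taxa, n_areas = int(header.group(1)), int(header.group(2))
    areas = header.group(3).split()
    if len(areas) != n_areas:
        raise ValueError("header area count does not match area names")

    ranges = {}
    for line in lines[1:]:
        name, bits = line.rsplit(None, 1)  # species name, then e.g. "0110"
        if len(bits) != n_areas or set(bits) - {"0", "1"}:
            raise ValueError(f"bad presence/absence string for {name}")
        ranges[name] = {area for area, bit in zip(areas, bits) if bit == "1"}

    if len(ranges) != n_taxa:
        raise ValueError("header taxon count does not match number of rows")
    return areas, ranges


# Toy input (hypothetical species and areas) in the assumed format.
sample = "3 2 (A B)\nsp1\t10\nsp2\t01\nsp3\t11\n"
areas, ranges = parse_geography(sample)
print(areas, ranges)
```

The consistency checks on taxon and area counts are a design choice: they surface a mismatched header early rather than letting it propagate into a model run.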
File: Categorical_Traits_Data_S7.zip
Description: Input and output files from analyses of discrete characters, including SSE input and output and ancestral state reconstruction input and output plots.
biting.csv #input csv file containing biting trait presence data
Cleaning.csv #input csv file containing cleaning trait presence data
GTR_Node_SUM.tree #input tree file
Plots_Ancestral_State_recons #output ancestral state reconstruction plots
Reef_association.csv #input csv file containing reef association trait presence data
SSE_Model_Fit.csv #csv file containing ancestral state reconstruction model fit comparisons for each trait
trait_data.csv #input csv file containing trait presence data for the traits examined by Burress and Wainwright (2019), Evolution.
File: HiSSE_Data_S8.zip
Description: Output files from analyses using the R package HiSSE. Each directory contains an xlsx file including model fit comparisons for analyses of each trait, Rdata files for the four models studied in each case, and a csv file containing the solution data.
Biting_Suction_Both #biting trait
Coalesced_Premaxillary_Teeth #coalesced premaxillary teeth trait
IMJ #intramandibular joint trait
Phyllodont_dentition #phyllodont dentition trait
PJA #parrotfish pharyngeal jaw trait
Recurved_Oral_Teeth #recurved oral teeth trait
Reef_association #reef association trait
File: Continuous_Traits_Data_S9.zip
Description: Input and output files from analyses of continuous characters, including phylomorphospace and disparity through time plots.
NH_Plots #directory containing node height test plots for each examined trait
GTR_Node_SUM.tree #input tree
Size_and_SkullTraits.csv #input csv file containing continuous trait data for body size and skull traits
Table_S2_Geiger.xlsx #Geiger model fit and node height test output from R; also see Supplementary Table 2
Fin_Aspect.csv #input csv file containing fin aspect ratio continuous data
Protrusion_Resid_Vec.Rdata #R data file, protrusion trait
AdductorMand_Resid_Vec.Rdata #R data file, M. adductor mandibulae trait
Levator_Resid_Vec.Rdata #R data file, M. levator trait
Sterno_Resid_Vec.Rdata #R data file, M. sternohyoideus trait
Gape_Resid_Vec.Rdata #R data file, Gape trait
Table_S3_pca_Loadings.csv #PCA loadings from principal components analysis of the size and skull traits
DTT_Plots #directory containing disparity through time plots
File: Bayou_Data_S10.zip
Description: Output files from analyses using the R package bayou, including R data files and plots of trees annotated with shift probabilities for each trait.
AdductorMandibulae
FinAspect_NotCorrected
Gape
Hyoid_KT_NotCorrected
JawClosing_NotCorrected
JawKT_NotCorrected
JawOpening_NotCorrected
LevatorPosterior
Mass
Protrusion
SL
Sternohyoides
Optima_Shifts_through_time.csv #csv file containing the counted number of shifts and ages of nodes where shifts occur, across labrids and analyzed traits.
File: PBDB_data_Data_S11.zip
Description: Cleaned PBDB list of fossil labrid and acanthomorph occurrences.
pbdb_data_clean.csv #acanthomorphs
pbdb_labrid_clean.csv #labrids
plots #curves showing differences when maximum vs minimum labrid fossil ages are used
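To see how the choice between maximum and minimum fossil ages changes occurrence counts through time, the occurrences can be binned by either age column. This is a minimal sketch, assuming the cleaned files keep the standard PBDB column names `max_ma` and `min_ma`; the column names, the 5-million-year bin width, and the toy rows are assumptions for illustration, not values taken from the deposited files.

```python
import csv
import io
import math
from collections import Counter


def count_by_bin(rows, age_column, bin_width=5.0):
    """Count occurrences per time bin, keyed by the younger bin edge (in Ma)."""
    counts = Counter()
    for row in rows:
        age = float(row[age_column])
        counts[math.floor(age / bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))


# Toy occurrence table (hypothetical taxa and ages) in the assumed layout.
sample = (
    "accepted_name,max_ma,min_ma\n"
    "taxon_a,23.0,16.0\n"
    "taxon_b,16.0,11.6\n"
    "taxon_c,56.0,47.8\n"
)
rows = list(csv.DictReader(io.StringIO(sample)))

by_max = count_by_bin(rows, "max_ma")
by_min = count_by_bin(rows, "min_ma")
print(by_max)
print(by_min)
```

Comparing the two dictionaries shows how individual occurrences shift between bins depending on which age bound is used.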
Code/software
LabridScript.R contains all code used in the analyses presented in this paper. The XML files in the “Time_Calibration_Data_S3.tgz” archive include all settings for the BEAST2 analyses and can be run directly in BEAST2.
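Before launching a run, it can help to check the chain length and output file names declared in a BEAST2 XML file. The sketch below uses Python's standard XML parser and assumes the usual BEAST2 layout, with a `run` element carrying a `chainLength` attribute and `logger` elements carrying `fileName` attributes. The toy XML is a hypothetical stand-in and does not reproduce the deposited configuration.

```python
import xml.etree.ElementTree as ET


def summarize_beast_xml(xml_text):
    """Return the declared chain length and logger output files, if present."""
    root = ET.fromstring(xml_text)
    run = root.find("run")
    chain_length = None
    if run is not None and run.get("chainLength"):
        chain_length = int(run.get("chainLength"))
    log_files = [
        logger.get("fileName")
        for logger in root.iter("logger")
        if logger.get("fileName")  # screen loggers have no file name
    ]
    return {"chain_length": chain_length, "log_files": log_files}


# Toy BEAST2-style XML (hypothetical values) used only to exercise the sketch.
sample = """
<beast version="2.6">
  <run id="mcmc" spec="MCMC" chainLength="10000000">
    <logger id="tracelog" fileName="demo.log" logEvery="1000"/>
    <logger id="screenlog" logEvery="1000"/>
    <logger id="treelog" fileName="demo.trees" logEvery="1000"/>
  </run>
</beast>
"""
summary = summarize_beast_xml(sample)
print(summary)
```

For the deposited analyses, the text of any XML file from the xml_GTR_node, xml_GTR_tip, xml_HKY_node, or xml_HKY_tip directories would replace `sample`.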
Access information
Other publicly accessible locations of the data:
- The NCBI GenBank repository will include all sequences used in this manuscript.
Please see the main text of the paper for details. Key permitting and tissue loaning information:
JT collected specimens under Colombian permit number 20182200001023 (Colombian National Natural Parks) and ANLA resolution 1070 (28 August 2015). We also thank Y.K. Tea for providing several cirrhilabrine tissue samples and for discussions related to this paper. We thank G.J. Watkins-Colwell of the Yale Peabody Museum for assisting with DNA sample curation and transfer; J. Klunk from Arbor Biosciences for assistance with target capture sequencing; J. Hung for assistance with DNA extractions and quantification; S. Klanten, R. Robertson, and L. Liggins for providing several parrotfish and wrasse tissue samples; and collections staff at the Australian Museum, Museum Victoria, and Te Papa Museum for assisting with tissue loans.