This README_updated.txt file was generated on 2022-06-10 by Emily Troyer.
Updated 2022-09-22 to include Zenodo link for R scripts.

GENERAL INFORMATION

1. Title of Dataset: Data from: The Impact of Paleoclimatic Changes on Body Size Evolution in Marine Fishes

2. Author Information

Co-investigator 1
Name: Emily Troyer
Institution: University of Oklahoma

Co-investigator 2
Name: Ricardo Betancur-R
Institution: University of Oklahoma

Co-investigator 3
Name: Lily C. Hughes
Institution: University of Chicago

Co-investigator 4
Name: Mark Westneat
Institution: University of Chicago

Co-investigator 5
Name: Giorgio Carnevale
Institution: University of Turin

Co-investigator 6
Name: William T. White
Institution: CSIRO Australian National Fish Collection

Co-investigator 7
Name: John J. Pogonoski
Institution: CSIRO Australian National Fish Collection

Co-investigator 8
Name: James C. Tyler
Institution: Smithsonian Institution

Co-investigator 9
Name: Carole C. Baldwin
Institution: Smithsonian Institution

Co-investigator 10
Name: Guillermo Orti
Institution: The George Washington University

Co-investigator 11
Name: Andrew Brinkworth
Institution: University of Bath

Co-investigator 12
Name: Julien Clavel
Institution: Claude Bernard Lyon University

Corresponding Investigator
Name: Dahiana Arcila
Institution: University of Oklahoma, Sam Noble Oklahoma Museum of Natural History
Email: dahiana.arcila@ou.edu

3. Date of data collection: 2017-2021

4. Funding sources that supported the collection of the data: National Science Foundation (NSF) grants to D.A. (DEB-2015404, DEB-2144325, and DBI-2131464), R.B.R. (DEB-1932759 and DEB-1929248), G.O. (DEB-1457426 and DEB-1541554), and C.B. (DEB-1541552).

5. Recommended citation for this dataset: Troyer et al. (2022), Data from: The Impact of Paleoclimatic Changes on Body Size Evolution in Marine Fishes, Dryad, Dataset

6. Note: The following species names have been updated in the final trees and supplementary information tables to reflect taxon changes or misidentification issues:

Arothron reticularis to Arothron hispidus
Rhynchostracion nasus to Ostracion nasus
Lophiomus cf. setigerus to Lophiomus setigerus
Pseudalutarius cf. nasicornis to Pseudalutarius nasicornis
Halimochirurgus platycheilus to Halimochirurgus sp.
Xenobalistes tumidipectoris to Xanthichthys caeruleolineatus
Lagocephalus sceleratus to Lagocephalus cf. suezensis

DATA & FILE OVERVIEW

1. Description of dataset: These data were generated to investigate the effect of paleoclimate on the evolution of body size in tetraodontiform fishes.

2. File List:

File 1 Name: R_code_data.zip
File 1 Description: R scripts and input data used in this study. These are deposited on Zenodo (https://doi.org/10.5281/zenodo.7105332).

File 2 Name: SI_Dataset_S2_NCBI_vouchers.xlsx
File 2 Description: Dataset S2. Additional species included in this study obtained from National Center for Biotechnology Information (NCBI) and Ensembl. Composite species are indicated with an asterisk.

File 3 Name: SI_Dataset_S3_Tetraodontiform_Body_Length_Fossil_Ages.xlsx
File 3 Description: Dataset S3. List of fossil ages and body size information for all species in this study. Body length data were obtained from museum records, published studies, and FishBase. The 'All Data' tab contains all of the body length data collected, for multiple individuals per species, along with total length (TL) and standard length (SL) (if available), number of specimens per voucher, voucher number of the specimen (if applicable), fossil age and locality (if applicable), and a citation for the source of the body size data. The 'Summary Table' tab summarizes the 'All Data' tab for each species. Here we list the maximum recorded SL for each species in centimeters, the log-transformed maximum SL, and the log-transformed mean maximum SL.
We chose to use mean maximum SL in all analyses included in this study because museum collections tend to be biased toward smaller specimens. To obtain the mean maximum SL, we omitted any measurements from individuals that were more than 20% smaller than the maximum recorded SL for that species, leaving one to three individuals, which were then averaged to obtain a mean maximum length for each species. This spreadsheet lists the standard lengths of these one to three specimens, as well as their log-transformed standard lengths. Finally, we list the number of specimens that fell into the top 20%, the standard deviation (SD) of the log-transformed specimen lengths, and the standard error (SE) of the mean, which is calculated as the SD divided by the square root of the sample size (Number of specimens).

File 4 Name: SI_Dataset_S4_Individual_gene_trees_1103_IQTREE_parts.zip
File 4 Description: Dataset S4. Zip file containing individual gene trees estimated with IQTREE for 1,103 exons.

File 5 Name: SI_Dataset_S5_ASTRAL_parts_7347511_bootstraps.tre
File 5 Description: Dataset S5. Tree file for Figure S1. Phylogeny of Tetraodontiformes based on coalescent analysis of 1,103 exons and 152 individuals representing 138 species (134 tetraodontiforms, 4 outgroups). Phylogenetic tree inferred with ASTRAL.

File 6 Name: SI_Dataset_S6_IQTREE_Best_partition_scheme.part.contree.tre
File 6 Description: Dataset S6. Tree file for Figure S2. Phylogeny of Tetraodontiformes based on concatenation analysis of 1,103 exons and 152 individuals representing 138 species (134 tetraodontiforms, 4 outgroups). Phylogenetic tree inferred with IQTREE using the best fit partition scheme identified with PartitionFinder for all newly sequenced taxa and four previously published transcriptomes.

File 7 Name: SI_Dataset_S7_IQTREE_under_65__missing.contree
File 7 Description: Dataset S7. Tree file for Figure S3.
Phylogeny of Tetraodontiformes based on concatenation analysis of 1,103 exons and 114 individuals representing 105 species (102 tetraodontiforms, 3 outgroups). Phylogenetic tree inferred with IQTREE using the best fit partition scheme identified in PartitionFinder for all newly sequenced taxa and four previously published transcriptomes, excluding taxa with more than 65% missing data.

File 8 Name: SI_Dataset_S8_AMAS_Summary_by_gene_updated.xlsx
File 8 Description: Dataset S8. Alignment Manipulation And Summary (AMAS) summary of the exon dataset properties on a per locus basis.

File 9 Name: SI_Dataset_S9_Volume_surface_area_data_CT_scans.xlsx
File 9 Description: Dataset S9. Volume and surface area data for a subset of 41 tetraodontiforms in 10 extant families used in this study. Volume and surface area for each species were calculated from 3-D models generated in the image computing platform Slicer based on publicly available computed-tomography (CT) scan data accessed from MorphoSource.org.

File 10 Name: SI_Dataset_S10_BOLD_matches_COI_gene.xlsx
File 10 Description: Dataset S10. Barcode of Life Data System (BOLD) top matches of the Cytochrome c oxidase I (COI) gene.

File 11 Name: SI_Dataset_S11_MrBayes_10k_trees_datedPT_outputMCC_TreeAnnotator.tre
File 11 Description: Dataset S11. Tree file for Figure S4. Maximum clade credibility (MCC) tree of Tetraodontiformes. Time-calibrated phylogenetic tree based on Bayesian inference of 1,103 exons and 246 individuals, representing 237 tetraodontiform species (52 fossil, 185 extant) and seven outgroup species. MCC tree generated from 10,000 trees evenly selected from the posterior distribution of five subsets.

File 12 Name: SI_Dataset_S12_MrBayes_10k_datedPT_noplecto_outputMCC_TreeAnnotator.tre
File 12 Description: Dataset S12. Tree file for Figure S5. Maximum clade credibility (MCC) tree of Tetraodontiformes, excluding the superfamily Plectocretacicoidea.
Time-calibrated phylogenetic tree based on Bayesian inference of 1,103 exons and 242 individuals, representing 233 tetraodontiform species (48 fossil, 185 extant) and seven outgroup species. MCC tree generated from 10,000 trees evenly selected from the posterior distribution of five subsets.

File 13 Name: SI_Dataset_S13_Tetraodontiformes_Bodysize_TL_SL_calculations.xlsx
File 13 Description: Dataset S13. Total length to standard length calculations for the ten extant tetraodontiform families.
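
ILLUSTRATIVE EXAMPLE (not part of the deposited R scripts)

The mean maximum SL summary described for File 3 (Dataset S3) can be sketched in a few lines of Python. This is only a minimal illustration of the steps as worded above, not the authors' code. Several details are assumptions: the function name mean_max_sl is hypothetical, the log is taken as the natural log, the "log-transformed mean maximum SL" is computed as the log of the mean, the SD is the sample SD, and the SD and SE are reported as NaN when only one specimen remains. The sketch also does not cap the retained specimens at three; the README only states that one to three remained after filtering.

```python
import math
import statistics


def mean_max_sl(standard_lengths_cm, cutoff=0.20):
    """Summarize one species' standard lengths (cm), following the README wording.

    Hypothetical helper: name, log base, and NaN handling are assumptions.
    """
    max_sl = max(standard_lengths_cm)

    # Omit individuals more than 20% smaller than the maximum recorded SL.
    top = [sl for sl in standard_lengths_cm if sl >= (1 - cutoff) * max_sl]
    n = len(top)

    # Assumption: natural log; the README does not state the base.
    log_top = [math.log(sl) for sl in top]

    # Sample SD needs at least two values; use NaN for a single specimen.
    sd = statistics.stdev(log_top) if n > 1 else float("nan")

    mean_top = statistics.fmean(top)
    return {
        "max_sl": max_sl,
        "n_specimens": n,
        "mean_max_sl": mean_top,
        # Assumption: log of the mean, not mean of the logs.
        "log_mean_max_sl": math.log(mean_top),
        "sd_log_sl": sd,
        # SE of the mean = SD / sqrt(number of specimens).
        "se_log_sl": sd / math.sqrt(n),
    }


if __name__ == "__main__":
    # Made-up measurements for illustration only.
    print(mean_max_sl([10.0, 9.0, 8.5, 5.0, 3.0]))
```

The values actually used in the study are those in the 'Summary Table' tab of Dataset S3 and the R scripts on Zenodo; consult those for the exact log base and averaging order.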