Data from: Evolutionary innovation accelerates morphological diversification in pufferfishes and their relatives

Data files

Sep 18, 2024 version files 952.29 MB

Appendix_1_CT_specimens_traits.xlsx

23.90 KB
Appendix_2_R_Data_Files_Scripts.zip

952.26 MB
README.md

3.32 KB

Abstract

Evolutionary innovations have played an important role in shaping the diversity of life on Earth. However, how these innovations arise, and their downstream effects on patterns of morphological diversification remain poorly understood. Here, we examine the impact of evolutionary innovation on trait diversification in tetraodontiform fishes (pufferfishes, boxfishes, ocean sunfishes, and allies). This order provides an ideal model system for studying morphological diversification owing to their range of habitats and divergent morphologies, including the fusion of the teeth into a beak in several families. Using three-dimensional geometric morphometric data for 176 extant and fossil species, we examine the effect of skull integration and novel habitat association on the evolution of innovation. Strong integration may be a requirement for rapid trait evolution and facilitating the evolution of innovative structures, like the tetraodontiform beak. Our results show that the beak arose in the presence of highly conserved patterns of integration across the skull, suggesting that integration did not limit the range of available phenotypes to tetraodontiforms. Furthermore, we find that beaks have allowed tetraodontiforms to diversify into novel ecological niches, irrespective of habitat. Our results suggest that general rules pertaining to evolutionary innovation may be more nuanced than previously thought.

https://doi.org/10.5061/dryad.4f4qrfjjx

1. Project Description:

Evolutionary analysis of skull shape in tetraodontiform fishes. This supplemental dataset contains supplemental Appendices 1 and 2, which contain all R scripts and code needed to replicate results in this study.

2. Contents:

Appendix 1: Appendix_1_CT_specimens_traits.xlsx	

Description: Spreadsheet file containing list of species CT scanned, including museum collection code, specimen number, where the specimen was scanned, whether the scan is available on MorphoSource, and trait categorizations (habitat, mouth type).

Appendix 2: Appendix_2_R_Data_Files_Scripts.zip	

Description: Zipped folder containing all files and code needed to reproduce results from all R analyses

	 File 1 Name: All_results_Code.RData
	 File 1 Description: The saved R analyses results.

	 File 2 Name: Color_bar.pdf
	 File 2 Description: The color scale used in Figure 4

	 File 3 Name: Final_R_script_all_code.R
	 File 3 Description: The R scripts needed to run all analyses in this paper. 

	 File 4 Name: Local_Superimpositions_code.R 
	 File 4 Description: The source code to run the local superimpositions

	 File 5 Name: Modules_CR_test.csv
	 File 5 Description: The module hypotheses needed for the input to the modulary/integration analyses

	 File 6 Name: Modules_PhyloEMMLi_Input.csv
	 File 6 Description: The module hypotheses needed for the input to the PhyloEMMLi analyses

	 File 7 Name: network.layout.11.csv
	 File 7 Description: Specifies the layout for the network plot figures

	 File 8 Name: PCs_95%.csv
	 File 8 Description: A list of the PC scores (totaling 95% of the overall variance) for each species

	 File 9 Name: phyloEMMLi_output.csv
	 File 9 Description: The output of the PhyloEMMLi analysis.

	 File 10 Name: SIMMAP_trait_data_no_outgroup.csv
	 File 10 Description: Input for the SIMMAP analyses. 

	 File 11 Name: Tetraodontiformes.tre
	 File 11 Description: Phylogeny of tetraodontiform fishes

	 File 12 Name: trait_data.csv
	 File 12 Description: Spreadsheet of all trait data. 

	 Folder 1 Name: BayesTraits
	 Folder 1 Description: Results from output of BayesTraits analysis

	 Folder 2 Name: Landmarks_Extant
	 Folder 2 Description: Landmark coordinates for all extant species

	 Folder 3 Name: Landmarks_Fossils
	 Folder 3 Description: Landmark coordinates for all fossil species

	 Folder 4 Name: Meshes
	 Folder 4 Description: PLY mesh files for skull bones in Figure 4. 

	 Folder 5 Name: Missing_fossils
	 Folder 5 Description: Contains the input files for the missing landmark analysis for the three fossil taxa.

	 Folder 6 Name: Slide_files
	 Folder 6 Description: Slider files for the semisliding landmarks for each of the bones in the skull shape analyses.

3. Usage:

1. Open the "Final_R_script_all_code.R" into R. It is well-annotated. Simply follow all steps in this file if you want to run the analyses yourself

2. Load "All_results_Code.RData" into R. This will bring up all saved results for all analyses if you don't want to run the analyses yourself.

Missing data code: NA

4.Code/Software:

Analyses were run in R version 4.2.3

Taxonomic sampling and CT-scan data acquisition

We analyzed the skull shape of 176 species of Tetraodontiformes, including 173 extant and three fossil species. This sampling encompasses all ten living families, with fossil representatives from Tetraodontidae and Triodontidae. A comprehensive list of the scanned species, scanning locations, and specimen voucher information is provided in Appendix 1 (Supplementary Material). We included three of the only known catalogued three-dimensional fossil tetraodontiform skulls, Sphoeroides hyperostosus† (Tetraodontidae), Triodon antiquus† (Triodontidae) and Ctenoplectus williamsi† (Triodontidae). Each species was represented by a single adult specimen that underwent micro-CT scanning either at the University of Washington Friday Harbor Laboratories (Bruker Skyscan 1173; 40 species), Rice University (Bruker Skyscan 1273; 92 species), the University of New England, Australia (General Electric phoenix v|tome|x s; 13 species), Cornell University (General Electric 120 micro-CT; 1 species), or the University of Michigan (Nikon XT H 225 ST; 1 fossil species), totaling 147 new scans. Two previously scanned fossil specimens were also acquired from Close et al. and an unpublished scan of Triodon antiquus†, both scanned on a Nikon XT H 225 ST at the Natural History Museum, London. Finally, we also sourced scans for 27 additional species from MorphoSource (http://morphosource.org).

Segmentation, digitization, reproducibility, and fossil landmarks

Scans were segmented in 3D Slicer to isolate the skull bones from the rest of the body. Within 3D Slicer, digitization of the specimens was conducted using a landmark scheme of 170 points (48 fixed landmarks and 122 sliding semilandmarks) as detailed in Figure S6 and Table S3. This scheme represents an extended version of the general Eupercaria scheme from Evans et al., ensuring a comprehensive representation of the diversity of tetraodontiform skull shapes. To ensure consistency of landmark placement, all landmarking was conducted by the same person. After landmarking all 176 specimens, each scan was inspected again for verification, with slight adjustments made when necessary. A reproducibility test was performed for quality control by re-landmarking two randomly selected specimens. The original landmarks were then juxtaposed with the re-landmarks by plotting both sets into a phylomorphospace (Figure S7).

All landmarks were placed on the left side of the skull. However, for one fossil specimen, Triodon antiquus†, the left side could not be landmarked due to taphonomic degradation. To address this, the CT scan was converted to a three-dimensional mesh and then inverted for landmarking using the MeshLab software. Additional taphonomic processes affecting our fossil specimens occasionally rendered some landmarks unplaceable. Instead of proceeding with a significantly reduced subset of landmarks shared across all extant and fossil specimens, we chose to estimate the missing landmarks for the fossil specimens using the ‘MissingGeoMorph’ function in the R package LOST⁵⁶ for this purpose. We applied a Bayesian principal component analysis (BPCA) method to estimate missing landmark data, leveraging principal component regressions and Bayesian estimations to determine the position of absent landmarks. Empirical data set analyses by Arbour and Brown have shown this method to be highly reliable for landmark estimations. Moreover, these estimates produced a better fit to the original data than exclusion of specimens with incomplete landmarks.

Skull shape analyses

After digitization, we imported the landmark coordinates into R version 4.2.3 using a custom script from Buser et al. and processed them with the geomorph package version 4.0.5. To remove the effect of non-shape variation, such as scale, size, and orientation across specimens, we performed a generalized Procrustes superimposition between specimens. Semilandmarks were slid along their tangent directions using the Procrustes distance criterion, because sliding using bending energy may cause spurious correlations among landmarks which can bias modularity analyses. Given the biomechanical complexity of fish skulls, which contain multiple moving elements, analyzing shape can be challenging due to preservation artefacts affecting jaw positions. We account for these biases in rotation and translation of these mobile structures by performing a local superimposition, ensuring standardized positioning of the different skull elements. Following local superimposition, we then conducted a principal components analysis (PCA) to assess the main axes of skull shape variation. The first two principal components (PC1 and PC2) were visualized as a phylomorphospace using the pruned, time-calibrated phylogeny of Troyer et. al. Additionally, we employed the ‘plotRefToTarget’ function in geomorph to plot the primary and secondary axes of skull shape variation as ball and stick plots (Figure S1).

Phylogenetic estimation, trait coding, and ancestral trait reconstructions

To investigate skull evolution across Tetraodontiformes, we used the time-calibrated phylogeny proposed by Troyer et al., which encompasses 239 extant and fossil species of Tetraodontiformes. Using the ape package, we pruned the tree to retain only the 176 taxa for which morphological data was available. Habitat associations for extant species were obtained from FishBase and Fishes of Australia. Each species was categorized as being coral reef-associated or non-reef-associated (Appendix 1). For fossil species, categorization was based on the paleobiotope where they were discovered, with reef-association being determined by the presence of hermatypic corals. Dental morphology for each species was also characterized. Species were defined as beaked if they possessed highly modified and fused teeth, a characterization based on the criteria of Tyler²⁶ (Appendix 1). The beaked group consists of all species from the families Molidae, Diodontidae, Tetraodontidae, and Triodontidae. Non-beaked species include those from the families Balistidae, Monacanthidae, Triacanthidae Triacanthodidae, Ostraciidae, and Aracanidae. Non-beaked species possess teeth that are discrete units and protrude out of the jaw sockets, while beaked species possess teeth that do not protrude and are incorporated into the matrix of the jaw bones. For the fossil species in our analysis, their classification as beaked or non-beaked was based on characters 68-70 from the morphological matrix by Santini and Tyler defining beaked species as possessing teeth fused into a parrot-like beak and non-beaked species having teeth as discrete units, either slender caniniform, stoutly conical, incisiform-molariform, or thick caniniform teeth.

For each trait of interest in our analyses (habitat association, mouth type), we reconstructed ancestral states in phytools using stochastic character mapping under an equal rates model with the ‘make.simmap’ function on the complete tree from Troyer et al., containing seven outgroup taxa, 52 fossil tetraodontiforms, and 187 extant tetraodontiforms. Empirical Bayesian posterior probabilities for estimated ancestral states were plotted for each node of the phylogeny.

Rates of skull shape evolution and morphological disparity

We quantified the rate of skull shape evolution between reef and non-reef-associated species, as well as between beaked and non-beaked species using the ‘compare.evol.rates’ function in geomorph. Significance was assessed using the phylogenetic simulation approach run with 1,000 iterations. Similarly, to compare skull morphological disparity, we employed the ‘morphol.disparity’ function in geomorph for both reef vs. non-reef and beaked vs. non-beaked. This function calculates morphological disparity by estimating the Procrustes variance for each group using the residuals of a linear model fit. Additionally, we used the ‘compare.multi.evol.rates’ function in geomorph to evaluate rates of bone module evolution between beaked and non-beaked species. This method calculates evolutionary rate parameters of multivariate traits (σ²) from predefined modules. Significance is assessed by comparing the observed rate to a null rate matrix derived from a random simulation using 1,000 permutations.

To quantify rates of tetraodontiform skull shape evolution across the phylogeny, we used BayesTraitsV4. To reduce dimensionality, we employed only the first 24 principal components, which account for 95% of the total shape variation. PCs were multiplied by 1,000, following Evans et al., because BayesTraits has computational issues with very small numbers. We account for evolutionary correlations in trait variation using the “TestCorrel” function in BayesTraits, which sets trait correlation to zero. We used a reversible-jump Markov chain Monte Carlo method with uninformative priors and ran two independent chains for 200 million generations, sampling every 10,000 iterations, with the first 60 million discarded as burn-in. Convergence was visually assessed using Tracer v1.7.1, with all ESS (Effective Sample Size) values exceeding 200. We tested two models of trait evolution: a single rate Brownian motion model that assumes one rate across the phylogeny, and a variable rates model that allows for changes in rates throughout the phylogeny and identifies where rates differ. Model fitting was performed by calculating Bayes factors based on the marginal likelihoods from both models. A Bayes factor greater than 10 is regarded as strong support for that particular model. To assess changes in subclade morphological disparity for Tetraodontiformes, we implemented disparity-through-time (DTT) analyses under a Brownian motion model using the ‘geiger’ package. We also compare DTT for reef vs. non-reef associated species as well as beaked vs non-beaked species.

We used the average squared Euclidean distance among all pairs of data points as our disparity index. The DTT method calculates changes in relative subclade disparity through time across nodes in the phylogeny. We compared the observed disparity to that under a simulated null Brownian motion model iterated over 1,000 generations. We used the observed and simulated disparities to calculate a morphological disparity index (MDI), which measures the deviation from expectations for relative within-clade disparities under a model of Brownian motion. A negative MDI indicates that disparity is distributed among subclades, and is commonly interpreted as evidence for an early burst, characteristic of an adaptive radiation. A positive MDI indicates that disparity is distributed within subclades. Because MDI estimations at multiple time points suffer from a high false-positive rate, we use the two-tailed rank envelope method of Murrell to assess significance. This method provides an overall p-value as well as a p-interval because the ranks will almost always result in some ties. Because this is a two-tailed test, p-values below 0.025 are considered significant.

Evolutionary modularity and integration

To test for patterns of evolutionary modularity between beaked and non-beaked species, we defined eight a priori hypotheses of modularity that encompass a range of functional, embryological, and sensory hypotheses, as well as an eleven-module individual bone hypothesis (Figure S8). We evaluated modularity using the ‘phylo.modularity’ function in the geomorph package. This function uses the covariance ratio (CR) method, which is a measure of the relative strength of covariation between modules compared to the strength within modules. A CR less than 1 indicates a more modular system, while a CR greater than 1 indicates less modularity. Then, under the best-supported hypothesis, we compared the effect sizes (strength of the modular signal) for beaked and non-beaked species using the ‘compare.CR’ function in geomorph. The best-supported model is indicated by the lowest effect size. Additionally, we evaluated our eight hypotheses of modularity with ‘phyloEMMLi’, which applies maximum likelihood to compare different modularity hypotheses while also accounting for phylogenetic non-independence. We visualized the results of the modularity analyses by creating network plots showing the magnitude of integration between each module.

Using our best-supported modular hypothesis, we tested evolutionary integration among modules using the ‘phylo.integration’ function in geomorph. This method uses partial least squares (PLS) analysis to quantify the degree of covariation between our hypothesized modules. PLS values closer to 1 indicate higher integration. Because this method can be sensitive to sample size between groups, we first randomly removed 42 non-beaked species until both groups contained 67 species. Finally, we compared effect sizes between groups using ‘compare.PLS’ in the geomorph package.

Data from: Evolutionary innovation accelerates morphological diversification in pufferfishes and their relatives

Data files

Abstract

README: Evolutionary innovation accelerates morphological diversification in pufferfishes and their relatives

Methods

Works referencing this dataset