Data from: Evidence for general size-by-habitat rules in actinopterygian fishes across nine scales of observation
Data files
Jun 07, 2021 version files 75.47 MB

Analysis_files.zip

Appendix_1.xlsx

Appendix_10__FB_11k_Troph.pdf

Appendix_11__CoF_11k_Troph.pdf

Appendix_12__FB_31k_Troph.pdf

Appendix_13__CoF_31k_Troph.pdf

Appendix_14__fb_11k_SizeVar.pdf

Appendix_15__CoF_11k_SizeVar.pdf

Appendix_16__fb_31k_SizeVar.pdf

Appendix_17__CoF_31k_SizeVar.pdf

Appendix_2__fb_11k_Size.pdf

Appendix_3__CoF_11k_Size.pdf

Appendix_4__fb_31k_Size.pdf

Appendix_5__CoF_31k_Size.pdf

Appendix_6__FB_11k_tSize.pdf

Appendix_7__CoF_11k_tSize.pdf

Appendix_8__FB_31k_tSize.pdf

Appendix_9__CoF_31k_tSize.pdf

Appendix_Legends.pdf

SI_Figures_S1S15_Clarke_2021.pdf

SI_Text_Clarke_2021.pdf

Table_S5_Clarke_2021.xlsx

Table_S6_Clarke_2021.xlsx

Tables_S1S4_Clarke_2021.pdf
Abstract
Identifying environmental predictors of phenotype is fundamentally important to many ecological questions, from revealing broad-scale ecological processes to predicting extinction risk. However, establishing robust environment-phenotype relationships is challenging, as powerful case studies require diverse clades which repeatedly undergo environmental transitions at multiple taxonomic scales. Actinopterygian fishes, with 32000+ species, fulfil these criteria for the fundamental habitat divisions in water. With four datasets of body size (ranging from 10905 to 27226 species), I reveal highly consistent size-by-habitat-use patterns across nine scales of observation. Taxa in marine, marine-brackish, euryhaline and freshwater-brackish habitats possess larger mean sizes than freshwater relatives, and the largest mean sizes consistently emerge within marine-brackish and euryhaline taxa. These findings align with the predictions of seven mechanisms thought to drive larger size by promoting additional trophic levels. However, mismatches between size and trophic-level patterns highlight a role for additional mechanisms, and support for viable candidates is examined in 3439 comparisons.
Usage notes
Note: Many of these files also appear as supplementary files on the journal website. This repository gathers all files associated with the paper in one place, alongside expanded descriptions that make them easier to navigate.
SI Text
Supplementary methods, results, and discussion.
* SI Text Clarke 2021.pdf
SI Figures S1–S15
All 15 SI figures with captions.
* SI Figures S1S15 Clarke 2021.pdf
Fig. S1: Size distributions (log10 scale) for taxa in each habitat use across four datasets: (a) ‘FB11k dataset’; (b) ‘CoF11k dataset’; (c) ‘FB31k dataset’; (d) ‘CoF31k dataset’.
Fig. S2: Corresponding plot to main text Fig. 1 using FishBase 31k tree dataset.
Fig. S3: The percentage of groups where the phylogenetic mean size of taxa for one habitat use is larger than the other, obtained for every pairwise habitat-use comparison within all four datasets (FB11k, CoF11k, FB31k and CoF31k tree datasets).
Fig. S4: The percentage of groups where the observed log10 mean size of taxa for one habitat use is larger than the other, obtained for every pairwise habitat-use comparison within all four datasets (FB11k, CoF11k, FB31k and CoF31k tree datasets).
Fig. S5: The percentage of groups where the size variance of taxa within one habitat-use category is greater than the other, obtained for every pairwise habitat-use comparison within the CoF31k tree dataset. Three different ways of comparing size variance are assessed in panels (a), (b) and (c).
Fig. S6: The relationship between the magnitude of body size difference between two habitats (measured as phylogenetic effect size) and the magnitude of trophic level difference between two habitats (measured as phylogenetic effect size) for all ten pairwise habitat comparisons conducted in the study.
Fig. S7: The relationship between the magnitude of body size difference between two habitats (measured as phylogenetic effect size) and the magnitude of mean branch length duration difference between two habitats for all ten pairwise habitat comparisons conducted in the study.
Fig. S8: The relationship between the magnitude of body size difference between two habitats (measured as phylogenetic effect size) and the magnitude of log10 richness difference between two habitats for all ten pairwise habitat comparisons conducted in the study.
Fig. S9: Size distributions (log scale) for fossil taxa in each fossil habitat type, using data from Clarke et al. 2016 and Clarke & Friedman 2018.
Fig. S10: Size distributions (log10 scale) of taxa with maximum length and common length measures in each habitat use across eight datasets. See SI text for details on how these datasets were derived and compared.
Fig. S11: The corresponding statistical values and clade information for Fig. S5c. For each pairwise habitat-use comparison across multiple taxonomic scales, this indicates the number of times each habitat use possesses taxa with the largest size variance (relative to simulations) at probabilities of < 0.1 and < 0.05. Dark shades of each colour represent p < 0.05, lighter shades p = 0.1–0.05, and grey p > 0.1. Data from CoF31k tree dataset.
Fig. S12: The corresponding statistical values and clade information for Fig. S13b. For each pairwise habitat-use comparison across multiple taxonomic scales, this indicates the number of times each habitat use possesses taxa with the larger phylogenetic mean size at probabilities of < 0.1 and < 0.05 using PGLS ANOVA. Dark shades of each colour represent p < 0.05, lighter shades p = 0.1–0.05, and grey p > 0.1. Data from CoF31k tree dataset.
Fig. S13: (a) For every pairwise habitat-use comparison at the taxonomic scale of order, this indicates the percentage of orders in which raw and phylogenetic means are larger for one habitat use than the other, and whether any of these size differences occurred with probabilities of < 0.1 or < 0.05 according to three statistical tests. (b) For each pairwise habitat-use comparison across multiple taxonomic scales, this indicates the number of times each habitat use possesses taxa with the larger phylogenetic mean size at probabilities of < 0.1 and < 0.05 using PGLS ANOVA. Dark shades of each colour represent p < 0.05, lighter shades p = 0.1–0.05, and grey p > 0.1. For definitions of taxonomic scales, see methods.
Fig. S14: The corresponding statistical values and clade information for Fig. S13a. For every pairwise habitat-use comparison at the taxonomic scale of order, this indicates the number of orders in which raw and phylogenetic means are larger for one habitat use than the other, and whether any of these size differences occurred with probabilities of < 0.1 or < 0.05 according to three statistical tests. Dark shades of each colour represent p < 0.05, lighter shades p = 0.1–0.05, and grey p > 0.1. Data from CoF31k tree dataset.
Fig. S15: The corresponding statistical values and clade information for Fig. 2a. The numbers of groups where the phylogenetic mean size of taxa for one habitat use is larger than the other, obtained for every pairwise habitat-use comparison across multiple taxonomic scales. Data from CoF31k tree dataset.
SI Tables S1–S6
All 6 SI tables with captions.
* Tables S1S4 Clarke 2021.pdf
* Table S5 Clarke 2021.xlsx
* Table S6 Clarke 2021.xlsx
Table S1: List of mechanisms discussed in the main text that are proposed to explain the size-by-habitat patterns.
Table S2: The percentage of clades (Tax3 scale) in which each pair of metrics, from the nine metrics compared between habitats, were aligned. For example, if comparing size and richness outcomes for euryhaline vs. freshwater comparisons (top row, output in red), the percentage of alignments will equal the percentage of clades in which, relative to the total number of Tax3 clades in which the two habitat types could be compared, euryhaline taxa possessed either i) the smaller mean size and lower species richness, or ii) the larger mean size and higher species richness. Cumulatively, these two outcomes occurred in 18.2% of comparisons. I commonly refer to these as ‘percentage alignments’ of discrete outcomes.
Table S3: Numbers and percentages of migratory taxa within each habitat-use type across the four datasets. Illustrates relatively high percentages of migratory taxa within the euryhaline category.
Table S4: Numbers and percentages of migratory taxa in every order and habitat subdivision for the CoF 31k-tree dataset.
Table S5: A summary of support for various suites of mechanisms (defined A through E) presented in Table S6 for all comparisons in each of the four datasets analysed (FB11k, CoF11k, FB31k, CoF 31k). Text cites CoF31k summary percentages.
Table S6: Indication of whether various suites of mechanisms (defined A through E) can, or cannot be supported for every individual comparison performed in this study. 1 indicates support. A list is provided for each of the four datasets analysed (FB11k, CoF11k, FB31k, CoF 31k). Text cites CoF31k outcomes.
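The 'percentage alignment' described for Table S2 can be illustrated with a minimal sketch. It assumes each clade contributes one signed difference per metric (focal habitat minus comparison habitat) and that ties count as non-aligned; the function name, input layout, and example values are hypothetical and are not taken from the analysis files.

```python
def percentage_alignment(size_diffs, richness_diffs):
    """Percentage of clades in which two metrics point the same way.

    Each argument holds one value per clade: the focal habitat's metric
    minus the comparison habitat's metric (hypothetical input layout).
    """
    if len(size_diffs) != len(richness_diffs):
        raise ValueError("both metrics need one value per clade")
    if not size_diffs:
        raise ValueError("no comparable clades")
    aligned = sum(
        1
        for s, r in zip(size_diffs, richness_diffs)
        # Aligned when both differences share a sign
        # (both larger or both smaller); ties are treated as not aligned.
        if (s > 0 and r > 0) or (s < 0 and r < 0)
    )
    return 100.0 * aligned / len(size_diffs)


# Invented values for four clades (euryhaline minus freshwater).
size = [0.4, -0.2, 0.1, 0.3]
richness = [12, -5, -30, -8]
print(percentage_alignment(size, richness))
```

The denominator is the number of clades in which the two habitat types could be compared, matching the definition given for Table S2.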
Appendices 1–17
All 17 appendices. Full captions are provided in a separate pdf (* Appendix Legends.pdf); shortened captions appear below.
* Appendix 1.xlsx
* Appendix 2  FB 11k Size.pdf
* Appendix 3  CoF 11k Size.pdf
* Appendix 4  FB 31k Size.pdf
* Appendix 5  CoF 31k Size.pdf
* Appendix 6  FB 11k tSize.pdf
* Appendix 7  CoF 11k tSize.pdf
* Appendix 8  FB 31k tSize.pdf
* Appendix 9  CoF 31k tSize.pdf
* Appendix 10  FB 11k Troph.pdf
* Appendix 11  CoF 11k Troph.pdf
* Appendix 12  FB 31k Troph.pdf
* Appendix 13  CoF 31k Troph.pdf
* Appendix 14  FB 11k SizeVar.pdf
* Appendix 15  CoF 11k SizeVar.pdf
* Appendix 16  FB 31k SizeVar.pdf
* Appendix 17  CoF 31k SizeVar.pdf
Appendix 1: The percentage of clades in which each pair of metrics, from the nine metrics compared between habitats, were aligned. For example, an order where euryhaline taxa possessed the smaller mean size and lower species richness compared to freshwater relatives represents an alignment, as does an order where euryhaline taxa possessed the larger mean size and higher species richness. I commonly refer to these as ‘percentage alignments’ of discrete outcomes. Clades whose metrics are aligned for a given habitat comparison fall within the white quadrants of Figure 4, while mismatched outcomes fall within grey quadrants.
Appendices 2 to 17 display all individual comparisons of size, trophic level, and size variance performed in the study. Grid cells in these plots contain statistical details, so please increase magnification on the pdfs to view these details.
These appendices provide a full record of these results, so the reader can find an outcome for their clade of interest, with the data source, phylogeny, and analytical method they prefer.
Size results (largest possible dataset)
Comparisons of log10 body size using all taxa for which size data are available. Across the four datasets, the analyses represent a combined total of 5232 pairs of group + habitat comparisons, each of which was compared with five methods: 1. Observed log10 means; 2. Phylogenetic log10 means; 3. Wilcoxon test outcomes; 4. Simulation ANOVA test outcomes; 5. PGLS ANOVA test outcomes.
Appendix 2: All analyses pertaining to comparisons of taxon size between habitat-use types for the FishBase 11k-tree dataset.
Appendix 3: All analyses pertaining to comparisons of taxon size between habitat-use types for the Catalogue of Fishes 11k-tree dataset.
Appendix 4: All analyses pertaining to comparisons of taxon size between habitat-use types for the FishBase 31k-tree dataset.
Appendix 5: All analyses pertaining to comparisons of taxon size between habitat-use types for the Catalogue of Fishes 31k-tree dataset.
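As an illustration of the first of the five methods (observed log10 means), the following minimal sketch compares two habitat-use categories within one group. The function names, the dictionary layout, and the example lengths are assumptions made for illustration; the phylogenetic means and the three statistical tests are not reproduced here.

```python
import math
from statistics import mean


def observed_log10_means(lengths_by_habitat):
    """Mean of log10-transformed sizes for each habitat-use category."""
    return {
        habitat: mean(math.log10(x) for x in lengths)
        for habitat, lengths in lengths_by_habitat.items()
    }


def larger_habitat(means, a, b):
    """Habitat with the larger observed log10 mean, or None on a tie."""
    if means[a] == means[b]:
        return None
    return a if means[a] > means[b] else b


# Invented body lengths for one hypothetical group.
sizes = {
    "marine": [10.0, 100.0, 1000.0],
    "freshwater": [1.0, 10.0, 100.0],
}
means = observed_log10_means(sizes)
print(larger_habitat(means, "marine", "freshwater"))
```

Each cell in the size appendices reports this kind of pairwise outcome for one group and one pair of habitats, alongside the corresponding phylogenetic and test-based results.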
Size results (reduced and retained size + trophic level datasets; see Methods and SI Text Methods)
Comparisons of log10 body size using all taxa in the reduced and retained size + trophic level datasets. Across the four datasets, the analyses represent a combined total of 3439 pairs of group + habitat comparisons, each of which was compared with five methods: 1. Observed log10 means; 2. Phylogenetic log10 means; 3. Wilcoxon test outcomes; 4. Simulation ANOVA test outcomes; 5. PGLS ANOVA test outcomes.
Appendix 6: All analyses pertaining to comparisons of taxon size (in the reduced and retained size datasets, see Methods) between habitat-use types for the FishBase 11k-tree dataset.
Appendix 7: All analyses pertaining to comparisons of taxon size (in the reduced and retained size datasets, see Methods) between habitat-use types for the Catalogue of Fishes 11k-tree dataset.
Appendix 8: All analyses pertaining to comparisons of taxon size (in the reduced and retained size datasets, see Methods) between habitat-use types for the FishBase 31k-tree dataset.
Appendix 9: All analyses pertaining to comparisons of taxon size (in the reduced and retained size datasets, see Methods) between habitat-use types for the Catalogue of Fishes 31k-tree dataset.
Trophic level results (reduced and retained size + trophic level datasets; see Methods and SI Text Methods)
Comparisons of log10 trophic level using all taxa in the reduced and retained size + trophic level datasets. Across the four datasets, the analyses represent a combined total of 3439 pairs of group + habitat comparisons, each of which was compared with five methods: 1. Observed log10 means; 2. Phylogenetic log10 means; 3. Wilcoxon test outcomes; 4. Simulation ANOVA test outcomes; 5. PGLS ANOVA test outcomes.
Appendix 10: All analyses pertaining to comparisons of taxon trophic level between habitat-use types for the FishBase 11k-tree dataset.
Appendix 11: All analyses pertaining to comparisons of taxon trophic level between habitat-use types for the Catalogue of Fishes 11k-tree dataset.
Appendix 12: All analyses pertaining to comparisons of taxon trophic level between habitat-use types for the FishBase 31k-tree dataset.
Appendix 13: All analyses pertaining to comparisons of taxon trophic level between habitat-use types for the Catalogue of Fishes 31k-tree dataset.
Size variance results (largest possible dataset)
Comparisons of log10 body size variance using all taxa for which size data are available. Across the four datasets, the analyses represent a combined total of 5232 pairs of group + habitat comparisons, each of which was compared with four methods: 1. Observed log10 variance; 2. Expected log10 variance from simulations; 3. Observed variance vs. simulated variance; 4. P values derived from observed variance vs. simulated variance.
Appendix 14: All analyses pertaining to comparisons of size variance between habitat-use types for the FishBase 11k-tree dataset.
Appendix 15: All analyses pertaining to comparisons of size variance between habitat-use types for the Catalogue of Fishes 11k-tree dataset.
Appendix 16: All analyses pertaining to comparisons of size variance between habitat-use types for the FishBase 31k-tree dataset.
Appendix 17: All analyses pertaining to comparisons of size variance between habitat-use types for the Catalogue of Fishes 31k-tree dataset.
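One common way to derive a p value from observed versus simulated variance is the proportion of simulated values at least as large as the observed one. The sketch below assumes that one-tailed definition, which this description does not confirm, and uses placeholder numbers in place of real simulation output; see the SI text for the procedure actually used.

```python
from statistics import variance


def simulation_p_value(observed_var, simulated_vars):
    """Share of simulated variances at least as large as the observed one.

    This one-tailed empirical proportion is an assumption for illustration.
    """
    if not simulated_vars:
        raise ValueError("need at least one simulated variance")
    hits = sum(1 for v in simulated_vars if v >= observed_var)
    return hits / len(simulated_vars)


# Invented log10 sizes for one habitat-use category in one group.
log10_sizes = [1.0, 2.0, 3.0]
observed = variance(log10_sizes)  # sample variance of the observed values

# Placeholder values standing in for variances from simulated datasets.
simulated = [0.2, 0.5, 0.9, 1.0, 1.4]
print(simulation_p_value(observed, simulated))
```

A small proportion would indicate that the observed variance is unusually large relative to the simulations.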
Analysis files (Analysis files.zip)
Input datasets and phylogenies for analyses.
Datasets:
Information regarding the species, size data, and trophic level data used for any specific habitat comparison for any group of taxa compared at any scale of observation can be found in the datasets below.
For size analyses drawing data from the largest possible size dataset (i.e. all those conducted in Appendices 2–5 and 14–17) at scales of observation other than evolutionary hotspots (Fam, Ord, Tax 3, Tax 4, Tax 5, Tax 6, Full dataset), the following datasets provide all the necessary information, depending on your choice of dataset (CoF or FishBase) and tree (11k molecular tree or 31k supertree):
* Rab18tax.Order log10.TL 30K CoF.dataset_SI.data.csv
* Rab18tax.Order log10.TL 30K fb.dataset_SI.data.csv
* Rab18tax.Order log10.TL 12Kspec CoF.dataset_SI.data.csv
* Rab18tax.Order log10.TL 12Kspec fb.dataset_SI.data.csv
For comparisons of the groups and habitats within hotspot analyses, the analogous information is provided in:
* Hotspots.only log10.TL 30K CoF.dataset_SI.data.csv
* Hotspots.only log10.TL 30K fb.dataset_SI.data.csv
* Hotspots.only log10.TL 12Kspec CoF.dataset_SI.data.csv
* Hotspots.only log10.TL 12Kspec fb.dataset_SI.data.csv
* Clarke.pot.grps log10.TL 30K CoF.dataset_SI.data.csv
* Clarke.pot.grps log10.TL 30K fb.dataset_SI.data.csv
* Clarke.pot.grps log10.TL 12Kspec CoF.dataset_SI.data.csv
* Clarke.pot.grps log10.TL 12Kspec fb.dataset_SI.data.csv
For any analyses drawing data from the reduced and retained size + trophic level datasets (e.g. Appendices 6–13, Figures 2b, 3 and 4 in main text) at scales of observation other than evolutionary hotspots (Fam, Ord, Tax 3, Tax 4, Tax 5, Tax 6, Full dataset), the following datasets provide all the necessary information, depending on your choice of dataset (CoF or FishBase) and tree (11k molecular tree or 31k supertree):
* Rab18tax.Order log10.TL.ShrdW.troph 30K CoF.dataset_SI.data.csv
* Rab18tax.Order log10.TL.ShrdW.troph 30K fb.dataset_SI.data.csv
* Rab18tax.Order log10.TL.ShrdW.troph 12Kspec CoF.dataset_SI.data.csv
* Rab18tax.Order log10.TL.ShrdW.troph 12Kspec fb.dataset_SI.data.csv
For comparisons of the groups and habitats within hotspot analyses, the analogous information is provided in:
* Hotspots.only log10.TL.ShrdW.troph 30K CoF.dataset_SI.data.csv
* Hotspots.only log10.TL.ShrdW.troph 30K fb.dataset_SI.data.csv
* Hotspots.only log10.TL 12Kspec CoF.dataset_SI.data.csv
* Hotspots.only log10.TL 12Kspec fb.dataset_SI.data.csv
* Clarke.pot.grps log10.TL.ShrdW.troph 30K CoF.dataset_SI.data.csv
* Clarke.pot.grps log10.TL.ShrdW.troph 30K fb.dataset_SI.data.csv
* Clarke.pot.grps log10.TL 12Kspec CoF.dataset_SI.data.csv
* Clarke.pot.grps log10.TL 12Kspec fb.dataset_SI.data.csv
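To pull the records behind a particular comparison out of one of the dataset files listed above, a sketch along the following lines may help. The column headers used here (Species, Order, Habitat, log10.TL) are hypothetical placeholders, as are the function names; check the header row of the actual CSV before adapting it.

```python
import csv
import io


def read_rows(handle):
    """Read a dataset CSV into a list of dicts keyed by column header."""
    return list(csv.DictReader(handle))


def select(rows, column, value):
    """Keep only the rows whose `column` equals `value`."""
    return [row for row in rows if row.get(column) == value]


# Stand-in text with hypothetical headers and invented records.
# With the real archive, open one of the CSV files with open(...) instead.
demo = io.StringIO(
    "Species,Order,Habitat,log10.TL\n"
    "Species a,Order x,marine,1.5\n"
    "Species b,Order x,freshwater,0.9\n"
    "Species c,Order y,marine,2.1\n"
)
rows = read_rows(demo)
subset = select(rows, "Order", "Order x")
print([row["Species"] for row in subset])
```

Values read this way are strings, so numeric columns need converting with float() before any means or variances are computed.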
Phylogenies:
* actinopt_12k_treePL.tre: The single molecular-data-derived tree provided in Rabosky et al. 2018.
* actinopt_full.trees.tre: The 100 supertrees provided in Rabosky et al. 2018.
Access
The paper is available on request from the author.