Post-larval processes reduce the diversity of coral reef fish communities
Data files
Dec 24, 2024 version files 1.24 GB
- Adult_community_matrix_-_multiplicative_non-corrected.csv (46.94 KB)
- Adult_Pre_community_matrix.csv (53.49 KB)
- Adults_additive_data_corrected.RData (28.35 KB)
- Adults_additive_data_non-corrected.RData (30.84 KB)
- Adults_community_matrix_multiplicative_corrected.csv (48.28 KB)
- Adults_sites_data_multiplicative.csv (7.18 KB)
- Adults_sites_data_multiplicative.xlsx (15.36 KB)
- Cryptic_abundances_for_coverage.csv (1.86 KB)
- Data_table.csv (27.20 KB)
- Data_table.xlsx (40.82 KB)
- FD_Results.RData (14.15 KB)
- FS_results.zip (1.24 GB)
- Larvae_community_matrix.csv (143.02 KB)
- Larvae_sites_data.csv (18.87 KB)
- Larvae_sites_data.xlsx (23.63 KB)
- Poisson_log_normal_-_bi-variate.R (15.06 KB)
- README.md (6.85 KB)
- Traits_analyses_data.csv (27.11 KB)
- Traits_analyses_data.xlsx (37.32 KB)
Abstract
Here we provide code and data for: Post-larval processes reduce the diversity of coral reef fish communities. The code and data provided can be used to generate the results included in both the main text and in the supplementary information. In our paper, we couple species-level community-wide abundance estimates for larval reef fishes with adult abundance data from intensive visual surveys and species-level traits to elucidate the factors that shape the diversity of adult coral reef fishes. We show that while larval supply is important in determining adult taxonomic and functional diversity, post-larval processes increase the numerical dominance of particular species, thus reducing overall diversity.
README: Post-larval processes reduce the diversity of coral reef fish communities
These data include species-level abundance estimates of larval and adult coral reef fish, along with ecological and life-history traits. They are used to examine the relationship between larval and adult diversity and how adult diversity is maintained.
General description:
Description of the data and files structure
The data are split across several files because the analyses require different data structures and several data subsets.
The file Data_table.xlsx includes species abundances of larvae and adults of different data subsets and species traits, and should be considered as the 'Supporting Data' file associated with the publication. It includes a README tab. A CSV version is also available.
The files Adult_community_matrix_-_multiplicative_non-corrected.csv and Adults_community_matrix_multiplicative_corrected.csv are community matrices of adults, containing multiplicative, non-corrected abundances and multiplicative, detectability-corrected abundances, respectively. The first column of each file is a character vector holding a unique identifier for each transect ('trans_id' for the non-corrected data and 'model_site' for the corrected data). The identifier is a composite of the sampling site, dive-transect identifier, surveyor names, sampling month and sampling year.
The file Adult_community_matrix_-_multiplicative_non-corrected.csv also includes the 'depth.m.' column, which represents the average depth (in meters) of each transect. In both files, all additional columns represent species abundances.
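As a minimal sketch of how such a community matrix might be split into its identifier, depth and species parts in R: the toy data frame below stands in for the real file, and its species column names and values are invented for illustration. Only 'trans_id', 'model_site' and 'depth.m.' come from the description above.

```r
# Toy stand-in for Adult_community_matrix_-_multiplicative_non-corrected.csv.
# Species names and counts are invented; only the first two column names
# follow the README description.
adults <- data.frame(
  trans_id    = c("siteA_T1", "siteA_T2", "siteB_T1"),
  depth.m.    = c(5.2, 12.0, 20.5),
  Species_one = c(10, 0, 3),
  Species_two = c(0, 4, 1),
  check.names = FALSE
)
# With the real file one would presumably start from something like:
# adults <- read.csv("Adult_community_matrix_-_multiplicative_non-corrected.csv",
#                    check.names = FALSE)

# Columns that are not species abundances ('model_site' applies to the
# detectability-corrected file).
meta_cols    <- intersect(c("trans_id", "model_site", "depth.m."), names(adults))
species_cols <- setdiff(names(adults), meta_cols)

# Sites-by-species matrix with transect identifiers as row names.
comm <- as.matrix(adults[, species_cols, drop = FALSE])
rownames(comm) <- adults[[meta_cols[1]]]

transect_totals <- rowSums(comm)  # individuals per transect
species_totals  <- colSums(comm)  # individuals per species
```

Keeping the identifier as row names rather than as a column leaves a purely numeric matrix, which is the shape most community-ecology functions expect.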
The file Adults_additive_data_corrected.RData includes adult additive abundances for 100 randomly drawn 'communities'.
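The names of the objects stored inside the .RData files are not listed here, so one cautious way to inspect such a file is to load it into its own environment. The sketch below builds a small stand-in file so that it runs on its own; 'toy_draws' is a made-up object name, not one taken from the archive.

```r
# Create a small stand-in .RData file; 'toy_draws' is a hypothetical object.
toy_draws  <- replicate(3, c(sp1 = 5, sp2 = 2), simplify = FALSE)
rdata_path <- tempfile(fileext = ".RData")
save(toy_draws, file = rdata_path)

# Load into a dedicated environment so nothing in the workspace is overwritten.
# load() returns the names of the restored objects.
loaded       <- new.env()
object_names <- load(rdata_path, envir = loaded)
print(object_names)

# Inspect the structure of the first restored object.
str(loaded[[object_names[1]]], max.level = 1)
```

For the real data, `rdata_path` would point to Adults_additive_data_corrected.RData or Adults_additive_data_non-corrected.RData.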
The file Larvae_community_matrix.csv is a community matrix of larvae abundances. The first column contains a unique identifier character vector for each net ('sample_id', a composite of sampling date, sampling site, sampling site maximum bottom depth, the depth layer sampled, and the net identifier number), while all additional columns represent species abundances.
The file Larvae_sites_data.xlsx includes the coordinates where samples were collected along with a README tab. A CSV file is also available.
The file Adults_sites_data_multiplicative.xlsx includes the coordinates of each adult sample along with a README tab. A CSV file is also available.
The file Traits_analyses_data.xlsx includes abundance and traits data used for the two traits analyses: Functional diversity and abundance-traits models. It includes a README tab. A CSV version is also available.
All Adults_FS...RData and Larvae_FS...RData files, which are located in the zipped folder FS_results, contain the results of the feasible set analyses. They are also used to plot SADs and RADs.
The file FD_Results.RData contains the results of the computationally heavy function used to calculate adult and larval functional diversity.
The file Adult_Pre_community_matrix.csv contains the community matrix of the adult data collected prior to larval sampling. Each row in the data represents a sample.
Code
All analyses were conducted in R software.
#Species abundance distributions (SADs) and rank abundance distributions (RADs):
SADs_RADs_All_included_non-corrected_adults
SADs_RADs_All_included_Detectability_corrected
These scripts produce the SAD and RAD plots of the observed larval and adult abundances, along with the central tendency abundance distribution of each (the most likely SAD derived from the feasible set of SADs, see below). The script SADs_RADs_All_included_non-corrected_adults can also be used to produce the RADs of the adult data collected prior to larval sampling, in comparison with the data collected after larval sampling (the data used throughout the analyses). To run these scripts, use the Data_table.csv file and the files within the FS_results folder, according to the desired data subset (see further explanation within the code).
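For readers unfamiliar with the two representations, the following base-R sketch shows one common way to derive a RAD and a SAD from a vector of species abundances. The abundance values are invented, and the log2 abundance classes are an assumption; the supplied scripts may bin or plot differently.

```r
# Invented abundance vector: one value per species.
abund <- c(120, 45, 45, 10, 3, 1, 1, 0)
abund <- abund[abund > 0]  # drop species that were not observed

# RAD: relative abundances ordered from the most to the least abundant species.
rad <- sort(abund / sum(abund), decreasing = TRUE)

# SAD: number of species in each log2 abundance class
# (class 0 = 1 individual, class 1 = 2-3, class 2 = 4-7, ...).
octave <- floor(log2(abund))
sad    <- table(factor(octave, levels = 0:max(octave)))

print(round(rad, 3))
print(sad)
```

A RAD is usually plotted as relative abundance (often on a log axis) against rank, while the SAD is shown as a histogram of species counts per abundance class.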
#Feasible set analysis: 3 codes are used to produce the feasible set (FS) of SADs and compare the central tendency SADs (the most likely SAD derived from the feasible set of SADs) to the observed SADs:
FS_adults - used to produce the adults FS of the detectability-corrected abundance data.
FS_larvae - used to produce the larval FS.
FS_adults_non-detectability-corrected - used to produce the adults FS of the original, non-detectability-corrected abundance data.
The data needed for these scripts are in the Data_table.csv file. Note that the script FS_adults is computationally heavy due to the high number of individuals and species.
#Poisson log-normal models:
Poisson_log_normal_-_uni-variate
Poisson_log_normal_-_bi-variate
These scripts fit univariate and bivariate Poisson log-normal models to the different data subsets. The data files needed for these scripts are Data_table.csv, Adults_additive_data_corrected.RData and Adults_additive_data_non-corrected.RData.
#Functional diversity: This script (Functional_diversity) computes larval and adult functional diversity and the coverage of the data, based on the iNEXT3D package. The data files needed are Cryptic_abundances_for_coverage.csv, Adult_community_matrix_-_multiplicative_non-corrected.csv, Data_table.csv and Traits_analyses_data.csv. Since the function that computes the functional diversity is computationally heavy, the file FD_Results.RData, which contains the results, is also supplied.
#Species traits and the abundance of larvae and adults:
1. Adults_abundance-traits_models
2. Larvae_abundance-traits_models
These scripts fit Bayesian linear models to species abundances and traits in order to examine the role of traits as determinants of larval and adult abundances. The data file needed is Traits_analyses_data.csv.
3. Traits_life_stage_models
This script fits mixed-effects linear models that test the relationship between species relative abundance at both life stages and species traits. The data file needed is Traits_analyses_data.csv.
#Adults_depths_niche_models: This script fits Huisman–Olff–Fresco (HOF) models for each species to estimate the central depth niche. The data files needed for this script are Adult_community_matrix_-_multiplicative_non-corrected.csv and Adults_community_matrix_multiplicative_corrected.csv.
#MobR code - Species richness determinants: This script uses the 'mobr' package to examine the sources of differences in species richness between the adult and larval communities across spatial scales. Here, both abundance and geographical data are required: Adults_sites_data_multiplicative.csv, Adults_community_matrix_multiplicative_corrected.csv, Larvae_sites_data.csv, Larvae_community_matrix.csv.
Methods
Data collection:
Larvae: The sampling and identification procedures are described in detail in Kimmerling et al. 2018 and are reported here in brief. Larvae were collected between 2010 and 2011, either twice a month (July-November 2010) or monthly (December 2010-May 2011). Larvae were collected using a 1 m² Multiple Opening and Closing Net and Environmental Sensing System (MOCNESS), mounted with 600 μm nets. The MOCNESS system was towed obliquely from a maximal depth of 180 m to the surface, at a speed of ~2 knots. Each net was opened for 5 minutes and then closed to sample larvae in the following depth strata: 180–140, 140–100, 100–75, 75–50, 50–25 and 25–0 m. These sampling transects ran parallel to the shoreline, over bottom depths of ~70, ~170, ~250 and ~500 m. Between 5 and 10 casts were made on each sampling day. Samples were immediately preserved on board in ethanol. In the laboratory, fish larvae were separated from the bulk zooplankton for subsequent morphological and molecular identification. Species-level quantitative estimates were obtained using a Meta-genomic larval Identification and Abundance method (MIA; Kimmerling et al. 2018). MIA is based on Illumina sequencing of non-amplified DNA sequences performed on bulk larval samples after taking silhouette images of all the individuals within them. The number of individuals from each species was estimated using a 'bin packing' algorithm, which estimated the maximum likelihood solution for the number of reads coming from N larvae of given mass (Kimmerling et al. 2018). For our analysis, we used only data for larvae of reef-associated species (see below). From the original dataset, 40 nets did not include such larvae and were omitted from our analysis. Overall, we analyzed 331 nets, featuring 4,100 species-level identified individuals of 194 coral reef-associated species.
Adult: Adult abundances (fish per m²) were recorded by underwater visual fish surveys conducted using SCUBA along the Israeli coast of the Gulf of Aqaba (GoA) across shallow coral reefs (depth range 4-25 m). Surveys were conducted during several expeditions between 2018 and 2020 (July 2018; May, June and December 2019; June and October 2020), using the 'Belt transect' method, by divers trained and proficient in visual fish identification. Fish were identified to the species level and their lengths (TL; total length) were estimated. In total, 100 transects were conducted, yielding 22,448 individuals belonging to 185 species. For each transect, two passes were made. First, non-cryptic species were counted along 25 × 5 m transects, followed by a thorough slow search for cryptic species along 25 × 1 m transects. Enumerating cryptic species requires active search, so these species were surveyed separately, along a narrower transect. We employed two methods to address the difference in sampled area between cryptic and non-cryptic species. First, we used an additive approach, where for each non-cryptic transect (125 m²) we randomly chose five cryptic transects, without replacement, within the same site, to produce one sampling unit of 125 m². We repeated these random draws 100 times. The mean total adult abundance across all random draws was 367,964 (±20,243) individuals, belonging to 185 species, using detectability-corrected abundances. Second, we accounted for the different areas sampled by multiplying the number of detectability-corrected cryptic individuals by five. This resulted in a total of 361,391 individuals belonging to 185 species using the detectability-corrected abundances, and 48,580 individuals using the original data. We ran the analyses using both the 'additive' and the 'multiplicative' data and obtained very similar results.
For simplicity, we used the ‘multiplicative’ data for all the analyses presented in the main text except the Poisson-Lognormal analysis (see Methods in the main text).
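The difference between the two area corrections can be sketched in a few lines of base R. This is a simplified illustration only: it works on invented total counts per transect for a single hypothetical site, whereas the real data are per species, and the object and column names are assumptions.

```r
set.seed(1)

# Toy data for one site: six transects, each with a non-cryptic pass (125 m2)
# and a cryptic pass (25 m2). All counts are invented.
transects <- data.frame(
  site        = "A",
  non_cryptic = c(200, 150, 180, 90, 210, 130),
  cryptic     = c(4, 0, 7, 2, 5, 1)
)

# Multiplicative approach: scale each cryptic count from 25 m2 to 125 m2.
transects$multiplicative <- transects$non_cryptic + 5 * transects$cryptic

# Additive approach: for each non-cryptic transect, draw five cryptic
# transects from the same site without replacement and add their counts.
additive_draw <- function(df) {
  sapply(seq_len(nrow(df)), function(i) {
    pool   <- df$cryptic[df$site == df$site[i]]
    picked <- pool[sample.int(length(pool), size = 5, replace = FALSE)]
    df$non_cryptic[i] + sum(picked)
  })
}

# 100 random draws, giving a 6 x 100 matrix (transects x draws).
additive <- replicate(100, additive_draw(transects))

print(transects$multiplicative)
print(rowMeans(additive))
```

The multiplicative version is deterministic, while the additive version yields a distribution of sampling units, which is why the additive data are stored as 100 randomly drawn 'communities'.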
Analyses summary:
Comparing the species abundance distributions of larvae and adults
To compare the SADs of the larvae and the adults, we fitted a bivariate Poisson-lognormal (PLN) distribution to the larval and adult communities using the bipoilogMLE function (package poilog, Grøtan & Engen, 2022).
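To make the model structure concrete, the base-R sketch below simulates counts from a bivariate Poisson-lognormal distribution: each species has a pair of correlated, normally distributed log mean abundances (one per life stage), and the observed counts are Poisson draws around those means. All parameter values are hypothetical, and the sketch simulates from the model rather than fitting it; the fit itself relies on bipoilogMLE, whose arguments are documented in the poilog package.

```r
set.seed(42)

n_species <- 200
mu  <- c(larvae = 1.0, adults = 2.0)  # hypothetical log-scale means
sig <- c(larvae = 1.5, adults = 2.0)  # hypothetical log-scale SDs
rho <- 0.6                            # hypothetical correlation

# Correlated standard normal deviates for the two life stages.
z1 <- rnorm(n_species)
z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n_species)

# Log mean abundance of each species at each life stage.
log_lambda <- cbind(
  larvae = mu[["larvae"]] + sig[["larvae"]] * z1,
  adults = mu[["adults"]] + sig[["adults"]] * z2
)

# Poisson sampling around the lognormal means.
counts <- matrix(
  rpois(2 * n_species, exp(log_lambda)),
  ncol = 2, dimnames = list(NULL, c("larvae", "adults"))
)

# Species with zero counts at both stages would never be observed.
observed <- counts[rowSums(counts) > 0, , drop = FALSE]
head(observed)
```

In this formulation, rho captures how strongly a species' abundance at one life stage tracks its abundance at the other, and the two sigma parameters describe how uneven each community is.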
Species abundance distribution feasible sets
We examined the differences between the observed SADs and the most likely SAD given the number of individuals and species, following Locey & White (2013). We created a set of 5,000 possible SADs, separately for larvae and adults, using the feasiblesads package (Diaz et al. 2021). From the set of feasible SADs we derived the Central Tendency SAD ('CT-SAD', i.e., the most likely SAD, which is the most similar to all other possible SADs).
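The idea of a central tendency can be illustrated with a small base-R sketch that, given a matrix of candidate SADs, picks the one with the lowest mean dissimilarity to all others. Two loud assumptions apply: the toy candidate set is generated with a simple multinomial scheme that does not sample the feasible set uniformly (the feasiblesads package is used for that in the actual analysis), and Euclidean distance between rank-ordered abundance vectors is only one possible dissimilarity measure.

```r
set.seed(7)

S <- 10          # number of species (toy value)
N <- 100         # number of individuals (toy value)
n_samples <- 50  # number of candidate SADs (toy value)

# Toy candidate SADs: every species gets one individual, the remaining
# N - S individuals are allocated at random. NOT a uniform sample of the
# feasible set; used here only to have something to compare.
toy_fs <- t(replicate(n_samples, {
  x <- 1 + as.vector(rmultinom(1, N - S, rep(1 / S, S)))
  sort(x, decreasing = TRUE)
}))

# Pairwise dissimilarity between rank-ordered abundance vectors
# (Euclidean distance is an assumption of this sketch).
d <- as.matrix(dist(toy_fs))

# Central tendency: the candidate most similar, on average, to all others.
mean_d <- rowSums(d) / (n_samples - 1)
ct_sad <- toy_fs[which.min(mean_d), ]
print(ct_sad)
```

Comparing an observed rank-ordered abundance vector with `ct_sad` then shows whether the observed community is more or less even than expected from N and S alone.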
Species richness determinants in larvae and adults
To identify and explain differences in species richness between larvae and adults we used rarefactions. Specifically, to tease apart the effects of evenness, spatial aggregation, and the number of individuals on richness we employed individual, sample and spatially based rarefactions using the mobr package (McGlinn et al. 2018, 2021).
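As a reminder of what the simplest of these curves computes, the sketch below implements classical individual-based rarefaction in base R: the expected number of species in a random subsample of n individuals drawn without replacement. The abundance vector is invented, and the function name is hypothetical; the sample-based and spatially based rarefactions provided by mobr are not reproduced here.

```r
# Expected richness in a subsample of n individuals drawn without replacement.
rarefy_individual <- function(abund, n) {
  abund <- abund[abund > 0]
  N <- sum(abund)
  stopifnot(n >= 0, n <= N)
  # log probability that species i is absent from the subsample:
  # choose(N - N_i, n) / choose(N, n), computed on the log scale for stability.
  log_p_absent <- lchoose(N - abund, n) - lchoose(N, n)
  sum(1 - exp(log_p_absent))
}

abund   <- c(50, 30, 10, 5, 3, 1, 1)  # invented community, 7 species
curve_n <- c(1, 10, 50, sum(abund))
expected_S <- sapply(curve_n, function(n) rarefy_individual(abund, n))

print(data.frame(n = curve_n, expected_S = round(expected_S, 2)))
```

Because this curve depends only on the SAD and the number of individuals, contrasting it with sample-based and spatial curves is what allows the effects of evenness, density and aggregation to be separated.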
Observed and expected adult functional diversity
To test for species filtering, we tested how adult functional diversity (FD) would have appeared if the adult community had been structured based on random samples of the larval assemblage. If observed adult FD is lower than the FD expected at random, it is likely that post-larval filtering processes are constraining adult community diversity. To test this hypothesis, we used Hill-Chao numbers (Chao et al. 2014) as implemented in the iNEXT3D package (Chao et al. 2021). We compared the FD of adults and larvae based on four species-level traits that characterize the adult life stage: diet, maximum length, home range and activity time. We chose this combination of adult traits as it captures meaningful aspects of reef fish ecological functions. We extracted species-level maximum length from FishBase (using rfishbase; Boettiger et al. 2012; Froese & Pauly 2024), and data on diet, home range and activity from primary literature (see Supplementary information). For several species (less than 10%), diet, home range and activity information was completed based on genus-level information.
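The effect of numerical dominance on Hill numbers can be seen in a short base-R sketch. It computes ordinary (taxonomic) Hill numbers of order q from an abundance vector; the functional version used in the analysis additionally weights species by trait distances and is not reproduced here. The function name and the two toy communities are invented.

```r
# Taxonomic Hill number of order q for a vector of species abundances.
hill_number <- function(abund, q) {
  p <- abund[abund > 0] / sum(abund)
  if (abs(q - 1) < 1e-8) {
    exp(-sum(p * log(p)))      # q = 1: exponential of Shannon entropy
  } else {
    sum(p^q)^(1 / (1 - q))     # q = 0: richness; q = 2: inverse Simpson
  }
}

even      <- rep(10, 5)         # five equally abundant species
dominated <- c(96, 1, 1, 1, 1)  # same richness, one dominant species

hill_even <- sapply(0:2, function(q) hill_number(even, q))
hill_dom  <- sapply(0:2, function(q) hill_number(dominated, q))

print(rbind(even = hill_even, dominated = hill_dom))
```

Both communities have the same richness (q = 0), but the dominated community has far lower diversity at q = 1 and q = 2, which is the pattern expected when post-larval processes increase the dominance of particular species.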
Species traits as determinants of larvae and adult abundances
To understand whether life history traits play a role in determining adult and larval relative abundances, we used traits collected for the FD analyses (see above), and added traits associated with the planktonic phase: (a) Pelagic Larval Duration (PLD) and (b) egg size. These were compiled from peer-reviewed studies, books, and reports (see Supplementary information). Traits related to earlier life stages are frequently missing. In these cases, we used average genus-level information. On a few occasions, we used family-level data. Following data compilation, we used Bayesian linear models to separately analyze the relationships of adult and larval abundances with life history traits and with the abundance of the opposite life stage. To test the impact of adult traits, early life-stage traits, or the abundance of the opposite life stage, we fitted several models that include different combinations of these predictor groups. Due to sample size limitations, we could only include a limited number of predictor variables. We chose two adult traits, diet (6 levels) and maximum length, which have been repeatedly shown to be key ecological traits, and two planktonic phase traits, PLD and egg size, along with the abundance of the opposite life stage. We included family as a random effect to account for phylogenetic non-independence. Models were fitted with the brms package (Bürkner 2017) using a negative binomial distribution to account for over-dispersion. As genus-level data on early life stage traits were sometimes missing, we used 'on the fly' data imputation (during model fitting, van Buuren & Groothuis-Oudshoorn 2011) using diet, maximum length and parental care (data which were collected per family, see Supplementary information).
All analyses were conducted in R software.
References:
Boettiger, C., Lang, D.T. & Wainwright, P.C. (2012). Rfishbase: Exploring, manipulating and visualizing FishBase data from R. J. Fish Biol., 81, 2030–2039.
Bürkner, P.C. (2017). brms: An R package for Bayesian multilevel models using Stan. J. Stat. Softw., 80, 1–28.
Chao, A., Gotelli, N.J., Hsieh, T.C., Sander, E.L., Ma, K.H., Colwell, R.K., et al. (2014). Rarefaction and extrapolation with Hill numbers: A framework for sampling and estimation in species diversity studies. Ecol. Monogr., 84, 45–67.
Chao, A., Henderson, P.A., Chiu, C.H., Moyes, F., Hu, K.H., Dornelas, M., et al. (2021). Measuring temporal change in alpha diversity: A framework integrating taxonomic, phylogenetic and functional diversity and the iNEXT.3D standardization. Methods Ecol. Evol., 12, 1926–1940.
Diaz, R.M., Ye, H. & Ernest, S.K.M. (2021). Empirical abundance distributions are more uneven than expected given their statistical baseline. Ecol. Lett., 24, 2025–2039.
Froese, R. & Pauly, D. (2024). FishBase. Available at: www.fishbase.org. Last accessed 13 August 2024.
Grøtan, V. & Engen, S. (2022). poilog: Poisson Lognormal and Bivariate Poisson Lognormal Distribution. R package version 0.4.2.
Kimmerling, N., Zuqert, O., Amitai, G., Gurevich, T., Armoza-Zvuloni, R., Kolesnikov, I., et al. (2018). Quantitative species-level ecology of reef fish larvae via metabarcoding. Nat. Ecol. Evol., 2, 306–316.
Locey, K.J. & White, E.P. (2013). How species richness and total abundance constrain the distribution of abundance. Ecol. Lett., 16, 1177–1185.
McGlinn, D.J., Xiao, X., May, F., Gotelli, N.J., Engel, T., Blowes, S.A., et al. (2018). Measurement of Biodiversity (MoB): A method to separate the scale-dependent effects of species abundance distribution, density, and aggregation on diversity change. Methods Ecol. Evol., 10, 258–269.
McGlinn, D.J., Engel, T., Blowes, S.A., Gotelli, N.J., Knight, T.M., McGill, B.J., et al. (2021). A multiscale framework for disentangling the roles of evenness, density, and aggregation on diversity gradients. Ecology, 102, e03233.
van Buuren, S. & Groothuis-Oudshoorn, K. (2011). mice: Multivariate imputation by chained equations in R. J. Stat. Softw., 45, 1–67.