Data from: Species that dominate spatial turnover can be of (almost) any abundance
Data files
Jan 30, 2025 version files (2.23 MB)
- metadata_V2.csv (54.11 KB)
- moddat_V2.csv (1.27 MB)
- nulldat.csv (901.48 KB)
- README.md (6.87 KB)
Abstract
An ongoing quest in ecology is understanding how species commonness influences compositional change. While each species’ contribution to beta diversity (SCBD) depends both on its abundance and how widespread it is (e.g., occupancy), a general expectation for these influences is lacking. Using published data for 9924 species across 177 metacommunities, we modeled relative SCBD as a function of abundance and occupancy using both correlative and mechanistic regression models (the latter derived from population demographic theory). Although the correlative model provided a superior fit to the data, both results suggest it is infrequent (high abundance and mid-high occupancy) species that make the dominant contribution to beta diversity. The nature of their interaction is most apparent when depicted in abundance-occupancy sample space, which shows the probability of making a dominant contribution to beta diversity is a concave-up function of abundance. Species found in an intermediate number of sites (0.56) required the smallest share of total abundance (0.05) to make a top-decile contribution. The abundance-occupancy sample space illustrates how empirical abundance-SCBD relationships can be linear or unimodal and provides a general framework to understand global change processes. To preserve compositional turnover, species of infrequent abundance and occupancy should be prioritized.
README: Data from: Species that dominate spatial turnover can be of (almost) any abundance
https://doi.org/10.5061/dryad.5dv41nsfs
Description of the data and file structure
Summary of experimental efforts underlying this dataset
All data come from observational ecological studies that recorded the abundance (number of individuals) of species within a defined ecological community (e.g., 'woodland birds') at multiple sampling sites (i.e., different locations). In total, 177 separate sites x species matrices are included, from 117 different study systems.
The manuscript analyses the data by first quantifying the contribution made by each species to changes in species composition across all sites in that dataset, using the 'species contribution to beta diversity' (SCBD) metric of Legendre and De Caceres (2013). It then relates each species' contribution to its relative abundance (the fraction of total individuals that species represents across all species and all sites) and its occupancy (the number of sites in which that species was observed).
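As a rough illustration of these three quantities, the following Python sketch computes SCBD, relative abundance and occupancy for a small sites x species matrix. It is not the authors' code (the archived analyses are in R). The Hellinger transformation applied before computing SCBD is an assumption made here for illustration, since this README does not state which transformation was used, and the function names and the toy `counts` matrix are hypothetical.

```python
import math


def hellinger(matrix):
    """Hellinger-transform a sites x species count matrix (rows are sites)."""
    out = []
    for row in matrix:
        total = sum(row)
        out.append([math.sqrt(v / total) if total > 0 else 0.0 for v in row])
    return out


def scbd(matrix):
    """Share of the total sum of squares attributable to each species.

    Assumption: counts are Hellinger-transformed first; the README does not
    say which transformation the authors applied.
    """
    y = hellinger(matrix)
    n_sites, n_species = len(y), len(y[0])
    means = [sum(row[j] for row in y) / n_sites for j in range(n_species)]
    ss_species = [
        sum((row[j] - means[j]) ** 2 for row in y) for j in range(n_species)
    ]
    ss_total = sum(ss_species)
    if ss_total == 0:
        raise ValueError("all sites are identical; SCBD is undefined")
    return [ss / ss_total for ss in ss_species]


def relative_abundance(matrix):
    """Fraction of all individuals (all species, all sites) in each species."""
    grand_total = sum(sum(row) for row in matrix)
    return [
        sum(row[j] for row in matrix) / grand_total
        for j in range(len(matrix[0]))
    ]


def occupancy(matrix):
    """Fraction of sites in which each species was recorded."""
    n_sites = len(matrix)
    return [
        sum(1 for row in matrix if row[j] > 0) / n_sites
        for j in range(len(matrix[0]))
    ]


# Hypothetical toy data: 3 sites (rows) x 3 species (columns).
counts = [
    [10, 0, 1],
    [8, 2, 0],
    [0, 5, 1],
]

print(scbd(counts))
print(relative_abundance(counts))
print(occupancy(counts))
```

By construction the SCBD values of all species in one matrix sum to 1, so each value can be read as that species' share of the total compositional variation.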
Literature cited
Legendre, P., and M. De Caceres. 2013. Beta diversity as the variance of community data: dissimilarity coefficients and partitioning. Ecology Letters 16:951-963.
Files and variables
File: metadata_V2.csv
Description: metadata on the source publication and broad ecological context of each dataset, with the following fields.
Variables
- sort: original sort order.
- filename: unique dataset identifier (matches 'source' in moddat_V2.csv and nulldat.csv, and the name of each element of the list in dflst_177.RData)
- description: brief notes on ecological context
- taxa: broad taxonomic group. With levels: birds, fish, herp (herpetofauna), inv (all other invertebrates), macroinvertebrates (freshwater aquatic invertebrates), mam (mammals), mar_inv (marine invertebrates), phytoplankton, plants.
- ecosystem: broad ecological context. With levels: exp (experimental e.g., mesocosm), freshwater (riverine, lake or wetland), island (inland or marine island), Marine, Terrestrial, unk (unknown)
- paper: abbreviated citation for study system
- citation: full source citation
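Because 'filename' in the metadata corresponds to 'source' in the data files, the tables can be linked on that key. The sketch below shows one way to do this with Python's standard `csv` module. The rows are made up for illustration (`dataset_A` and `dataset_B` are hypothetical identifiers, and only a few columns are included); with the real files, the in-memory strings would be replaced by opened CSV files.

```python
import csv
import io

# Hypothetical stand-ins for metadata_V2.csv and moddat_V2.csv.
metadata_csv = io.StringIO(
    "filename,taxa,ecosystem\n"
    "dataset_A,birds,Terrestrial\n"
    "dataset_B,fish,freshwater\n"
)
moddat_csv = io.StringIO(
    "ID,rad,ofd,source\n"
    "1,0.20,0.50,dataset_A\n"
    "2,0.05,0.25,dataset_A\n"
    "3,0.10,0.75,dataset_B\n"
)

# Index the metadata rows by dataset identifier.
meta_by_source = {row["filename"]: row for row in csv.DictReader(metadata_csv)}

# Attach taxa and ecosystem to each species record via its 'source' code.
joined = []
for rec in csv.DictReader(moddat_csv):
    meta = meta_by_source.get(rec["source"], {})
    joined.append(
        {**rec, "taxa": meta.get("taxa"), "ecosystem": meta.get("ecosystem")}
    )

for rec in joined:
    print(rec["ID"], rec["source"], rec["taxa"], rec["ecosystem"])
```

Records whose 'source' has no match in the metadata are kept, with `None` in the added fields, so unmatched identifiers are easy to spot.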
File: moddat_V2.csv
Description: data used in regression modelling, with the following variables derived from the raw datasets (in file "dflst_177.RData") and calculated as described in the R Script (SuppInfoRscriptSCBD.R).
Variables
- ID: unique species record identifier
- scbd.obs: observed SCBD of the species within its dataset
- scbd.rnk: normalised ranking of SCBD on the interval [0,1) (i.e., where ranking 1 means the species made the greatest contribution to beta diversity in that ecological community).
- rad: relative abundance of species (number of individuals in each species/sum of all individuals in all species).
- ofd: occupancy (number of occupied sites by each species / total number of sites).
- inv_radofd: inverse of the product of the frequency of relative abundance and frequency of occupancy.
- ln_inv_radofd: natural log transform of the above.
- dom.rnk90: binary variable, where 1 indicates a species making a top decile SCBD in that dataset and 0 indicates it does not.
- dom.rnk75: as for dom.rnk90, but where 1 indicates a species making a top quartile SCBD in that dataset.
- Imses: continuous variable, quantifying species spatial aggregation using the Morisita index as a standardized effect size.
- source: unique code indicating an independent dataset.
- study: code indicating the unique study system in which the dataset indicated by 'source' was observed.
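The two derived variables inv_radofd and ln_inv_radofd can be sketched as follows. This reads the description literally as 1 / (rad x ofd), which is an assumption; the R script remains the authoritative definition. The function name is hypothetical, and the input values are the abundance share (0.05) and occupancy (0.56) quoted in the abstract, used here only as an example.

```python
import math


def inverse_product(rad, ofd):
    """Inverse of the product of relative abundance and occupancy.

    Assumption: inv_radofd = 1 / (rad * ofd); check the R script for the
    exact definition used by the authors.
    """
    return 1.0 / (rad * ofd)


rad, ofd = 0.05, 0.56  # example values taken from the abstract
inv = inverse_product(rad, ofd)
ln_inv = math.log(inv)  # natural log transform, as for ln_inv_radofd

print(inv, ln_inv)
```

Under this reading, ln_inv_radofd equals -(ln rad + ln ofd), so it grows as a species becomes rarer, more restricted, or both.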
File: nulldat.csv
Description: data outputs from the three null model simulations described in the manuscript.
Variables
- sort: integer code indicating a unique species record
- scbd.obs: observed SCBD of the species within its dataset
- scbd.rnk: normalised ranking of SCBD on the interval [0,1) (i.e., where ranking 1 means the species made the greatest contribution to beta diversity in that ecological community).
- ave_null_abocc: the average SCBD a species would attain due to both its abundance and occupancy if individuals were randomly distributed, derived from null model simulations described in the manuscript.
- ave_null_ab: the average SCBD a species would attain in that dataset if due only to its abundance, derived from null model simulations described in the manuscript.
- ave_null_occ: the average SCBD a species would attain in that dataset if due only to its occupancy, derived from null model simulations described in the manuscript.
- source: unique code indicating an independent dataset.
- study: code indicating the unique study system in which the dataset indicated by 'source' was observed.
File: dflst_177.RData
Description: a list object containing the 177 data frames, in .RData format (must be loaded into R to be used). The name of each dataset matches the 'filename' field in the metadata.
Code/software
All analyses were run in R (version 4.4.1) using the RStudio GUI (version 2024.04.2).
Analyses can be reproduced using the provided R script file !SI_Rscript_SCBD.R.
Further analyses requested during review are found in the R script file !SI_Rscript_R1.R.
To re-run the analyses, save the data files archived in this repository and the script into a single Windows folder and open the script from that location. Provided R is running, the script will load the necessary data files, re-run the analyses and reproduce the figures in the manuscript. Note that the packages listed in the script must first be installed on the user's computer and loaded.
Access information
Other publicly accessible locations of the data:
1. See also Jeliazkov, A., D. Mijatovic, S. Chantepie, N. Andrew, R. Arlettaz, L. Barbaro, N. Barsoum et al. 2020. A global database for metacommunity ecology, integrating species, traits, environment and space. Scientific Data 7:e6. (CESTES database data paper available at: https://www.nature.com/articles/s41597-019-0344-7)
2. https://figshare.com/collections/Null_model_analysis_of_species_associations_using_abundance_data/3303603 (supporting information for Ulrich, W., and N. J. Gotelli. 2010. Null model analysis of species associations using abundance data. Ecology 91:3384-3397)
Methods
The dataset used for analysis was calculated from 177 different datasets in 117 different study systems, collated from three published sources: (i) the metaCommunity Ecology: Species, Traits, Environment and Space (CESTES) database (Jeliazkov et al. 2020); (ii) Ulrich and Gotelli (2010); and (iii) Deane et al. (2020).
Each sites x species abundance dataset was analysed separately by calculating each species' contribution to beta diversity (SCBD; Legendre and De Caceres 2013), which was the response variable. Raw SCBD scores were converted to normalised SCBD rank by dividing the rank (highest observed SCBD being rank 1) by the number of species in the metacommunity. Thus, scbd.rnk was on the interval [0, 1). Explanatory variables extracted from the raw data were each species' relative abundance (its share of the total number of individuals of all species across all sites) and its occupancy (the number of sites in which the species was observed).
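The rank normalisation can be sketched in Python as below. This follows the wording of the paragraph above literally (the species with the highest SCBD gets rank 1, and ranks are divided by the number of species), which is an assumption about the implementation. Read this way, values fall between 1/S and 1 for S species, so the archived scbd.rnk column may use a different direction or offset; the R script is the authority. The function name and example values are hypothetical, and ties are broken by input order.

```python
def normalised_rank(scbd_values):
    """Rank species by SCBD (highest = rank 1) and divide by species count.

    Assumption: a literal reading of the Methods text; the archived
    scbd.rnk values may follow a different convention.
    """
    n = len(scbd_values)
    # Species indices ordered from largest to smallest SCBD.
    order = sorted(range(n), key=lambda j: scbd_values[j], reverse=True)
    ranks = [0.0] * n
    for position, j in enumerate(order, start=1):
        ranks[j] = position / n
    return ranks


# Hypothetical SCBD values for a three-species metacommunity.
example_scbd = [0.5, 0.1, 0.4]
print(normalised_rank(example_scbd))
```

Dividing by the number of species makes ranks comparable between metacommunities of different richness, which is what allows the 177 datasets to be modelled together.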
Jeliazkov, A., D. Mijatovic, S. Chantepie, N. Andrew, R. Arlettaz, L. Barbaro, N. Barsoum et al. 2020. A global database for metacommunity ecology, integrating species, traits, environment and space. Scientific Data 7:e6.
Ulrich, W., and N. J. Gotelli. 2010. Null model analysis of species associations using abundance data. Ecology 91:3384-3397.
Deane, D. C., P. Nozohourmehrabad, S. S. D. Boyce, and F. L. He. 2020. Quantifying factors for understanding why several small patches host more species than a single large patch. Biological Conservation 249:e108711.
Legendre, P., and M. De Caceres. 2013. Beta diversity as the variance of community data: dissimilarity coefficients and partitioning. Ecology Letters 16:951-963.