Temporal changes in taxon abundances are positively correlated but poorly predicted at the global scale
Data files
Nov 08, 2024 version files 237.27 MB
- Data_and_code_(3).zip (237.26 MB)
- README.md (5.44 KB)
Abstract
Linking changes in taxon abundance to biotic and abiotic drivers over space and time is critical for understanding biodiversity responses to global change. Furthermore, deciphering temporal trends in relationships among taxa, including correlated abundance changes (e.g. synchrony), can facilitate predictions of future shifts. However, the drivers of these correlated changes over large scales are complex and understudied, impeding our ability to predict shifts in ecological communities. We use two global datasets containing abundance time-series (BioTIME) and biotic interactions (GloBI) to quantify correlations among yearly changes in the abundance of pairs of geographically proximal taxa (genus pairs). We use a hierarchical linear model and cross-validation to test the overall magnitude, direction, and predictive accuracy of correlated abundance changes among genera at the global scale. We then test how correlated abundance changes are influenced by latitude, biotic interactions, disturbance, and time-series length while accounting for differences among studies and taxonomic categories. We find that abundance changes between genus pairs are, on average, positively correlated over time, suggesting synchrony at the global scale. Furthermore, we find that abundance changes are more positively correlated with longer time-series, with known biotic interactions, and in disturbed habitats. However, the effects of these ecological drivers alone are relatively weak, with model predictive accuracy increasing approximately two-fold with the inclusion of study identity and taxonomic category. This suggests that while patterns in abundance correlations are shaped by ecological drivers at the global scale, these drivers have limited utility in forecasting changes in abundances among unknown taxa or in the context of future global change.
Our study indicates that including taxonomy and known ecological drivers can improve predictions of biodiversity loss over large spatial and temporal scales, but also that idiosyncrasies of different studies continue to weaken our ability to make global predictions.
README: Temporal changes in taxon abundances are positively correlated but poorly predicted at the global scale
https://doi.org/10.5061/dryad.63xsj3vbc
Description of the data and file structure
This directory stores the scripts and data used in the analyses for "Temporal changes in taxon abundances are positively correlated but poorly predicted at the global scale". There are three subdirectories, each with nested folders.
- output/
This is a folder with sub-directories that organizes the output of scripts--intermediate data steps, figures, and model objects. There are 3 subfolders: figures, models, and prep_data.
- raw biotime data
contains the raw data downloaded from BioTime that are used in this analysis
- scripts/
This folder has subdirectories that organize all the scripts used in this analysis: data curation, cleaning, modelling, and figure creation. It contains 3 subfolders: figures, models, and prep data
Files and variables
File: File_structure.zip
Description:
This directory stores the final scripts used in the data processing and analysis for the working group after manuscript revision and review. There are three subdirectories, each with its own readme with detailed information about what each file is used for and what each script creates.
output/
This is a folder with sub-directories that organizes the output of scripts--intermediate data steps and model objects
models
contains the saved model outputs from running the `lmer_models.R` analysis:
- `lmer_model_final.Rdata` - this is the primary model output
- `lmer_model_kfold.Rdata` - cross-validation output with folds not accounting for studyID or taxonomic category
- `lmer_model_kfoldstudy.Rdata` - cross-validation output with folds for studyID
- `lmer_model_kfoldtax.Rdata` - cross-validation output with folds for taxonomic category
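The difference between the plain folds and the study-level or taxonomic-category folds can be illustrated with a grouped fold assignment, where all rows sharing a group ID are held out together. The sketch below is in Python and is not the authors' R code; the function name `grouped_kfold`, the greedy balancing rule, and the example study IDs are all assumptions made for illustration.

```python
from collections import defaultdict


def grouped_kfold(group_ids, k):
    """Assign each row to one of k folds so that rows sharing a group ID
    (for example a study ID) always land in the same fold."""
    # Count rows per group so that large groups can be placed first.
    sizes = defaultdict(int)
    for g in group_ids:
        sizes[g] += 1

    fold_sizes = [0] * k
    fold_of_group = {}
    # Greedy balancing: the largest remaining group goes to the smallest fold.
    for g in sorted(sizes, key=lambda g: (-sizes[g], str(g))):
        target = fold_sizes.index(min(fold_sizes))
        fold_of_group[g] = target
        fold_sizes[target] += sizes[g]

    return [fold_of_group[g] for g in group_ids]


# Hypothetical study IDs, one per genus-pair row.
study_ids = ["s1", "s1", "s1", "s2", "s2", "s3", "s4", "s4", "s5"]
folds = grouped_kfold(study_ids, k=3)
```

Holding out whole studies (or whole taxonomic categories) tests how well the model predicts groups it has never seen, which is a stricter check than splitting rows at random.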
prep_data
contains the intermediate data that arises from the data cleaning scripts and then gets fed into the modelling script
- `model_data_final.Rdata` - final and cleaned data used to fit the model
- `results.abundance.csv` - output from `results.abundance.data.setup.R`, called in `setup_model data.R`. Contains log proportional pairwise changes in abundance across genera
- `within.study.updated.interactions.020724ENB.csv` - created in `pair_interactions_taxize_all.R` and called in the same script; intermediate output
- `results_abundance_interactions_taxa_032024ENB.RDS` - created in `pair_interactions_taxize_all.R`, called in `setup_model data.R`. Contains genus pair log proportional changes and whether or not each genus in a genus pair has interaction data from GloBI. Final output of `pair_interactions_taxize_all`.
- `worldclim.csv` - output from the `worldclim summaries.R` script, called in `setup_model data.R`. Contains mean annual temperature and precipitation values for each longitude and latitude coordinate in our cleaned BioTIME data
- `disturbance cleaning.csv` - manually cleaned BioTIME disturbance data down to the plot level, called in `setup_model data.R`. NAs in this file indicate absence of information for those cells.
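Because NAs in the disturbance file mean "no information" rather than "no disturbance", it can help to convert them to an explicit missing value on load. The following Python sketch shows one way to do that with the standard library; the column names (`STUDY_ID`, `PLOT`, `DISTURBANCE`) and the sample rows are hypothetical and may not match the real file.

```python
import csv
import io

# Hypothetical excerpt: the real column names and values may differ.
sample = io.StringIO(
    "STUDY_ID,PLOT,DISTURBANCE\n"
    "10,A,fire\n"
    "10,B,NA\n"
    "12,C,NA\n"
)


def read_rows_with_na(handle, na_token="NA"):
    """Read CSV rows into dicts, mapping the NA token to None."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({k: (None if v == na_token else v) for k, v in row.items()})
    return rows


rows = read_rows_with_na(sample)
# Count plots with no disturbance information at all.
missing = sum(1 for r in rows if r["DISTURBANCE"] is None)
```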
raw biotime data
empty folder to follow the licensing requirements of BioTime--please download the BioTime data directly from the BioTime website https://biotime.st-andrews.ac.uk/
scripts/
This folder has subdirectories that organize all the scripts used in this analysis: data curation, cleaning, modelling, and figure creation
Code/software
The `scripts` folder contains the files used in the data processing and analysis for the manuscript. There are three subdirectories:
models
contains 2 scripts used to run the statistical analyses
- `setup_model data` - This file collates all the intermediate data files from the data pre-processing and converts them into the form needed to fit the model (adds metadata, calculates Pearson correlations and z-scores, etc.)
- `lmer_models` - This script runs our primary model, calculates the cross-validation, and runs the null model to verify our model results.
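As a rough illustration of the correlation step, the Python sketch below computes a Pearson correlation between two series of yearly changes and then applies Fisher's z-transformation. This is not the authors' R code, and it assumes that "z-scores" refers to Fisher's z, which the README does not state; the function names and example values are made up.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric series."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length series with at least 3 values")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def fisher_z(r):
    """Fisher's z-transformation (assumed meaning of 'z-scores' here).
    Undefined for r equal to exactly -1 or 1."""
    return math.atanh(r)


# Made-up yearly log proportional changes for two genera in one plot.
genus_a = [0.1, -0.2, 0.3, 0.0, -0.1]
genus_b = [0.2, -0.1, 0.1, 0.1, -0.3]
r = pearson_r(genus_a, genus_b)
z = fisher_z(r)
```

The transformation stretches correlations near -1 and 1, which makes them better behaved as the response in a linear model than raw correlation coefficients.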
prep data/
This folder contains the scripts that take the raw data from BioTIME and clean it, calculate the log proportional changes, and add the GloBI interactions
- `results.abundance.data.setup.R` - filters and cleans the data, calculates log proportional changes and richness for genus pairs
- `pair_interactions_taxize_all.R` - adds the GloBI interactions and taxonomic class, with manual corrections
- `worldclim summaries.R` - pulls the worldclim summaries for each plot in each study. Note that we did not end up using these data in analyses.
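The log proportional change mentioned above can be sketched as log(N_t / N_(t-1)) for each pair of consecutive years. The Python example below is an illustration only, not the authors' R script: the function name, the example counts, and the choice to skip zero or missing abundances are all assumptions.

```python
import math


def log_proportional_changes(abundance_by_year):
    """Return {year: log(N_year / N_previous_year)} for consecutive years."""
    changes = {}
    for year in sorted(abundance_by_year):
        prev = abundance_by_year.get(year - 1)
        curr = abundance_by_year[year]
        # Assumption: skip steps with a missing previous year or a
        # non-positive abundance, where the log ratio is undefined.
        if prev is None or prev <= 0 or curr <= 0:
            continue
        changes[year] = math.log(curr / prev)
    return changes


# Made-up yearly abundances for one genus in one plot (2004 is missing).
counts = {2001: 10, 2002: 20, 2003: 5, 2005: 5}
changes = log_proportional_changes(counts)
```

Working with year-to-year changes rather than raw abundances means a doubling and a halving are symmetric around zero, and the long-term trend of each series is largely removed.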
figures/
- `Manuscript_figures.R` which contains all the code for creating the main figures used in the manuscript, the inserts for figure 2, the supplementary figures, and for calculating the results.
- `generate_table_attributions.RMD` which tabulates and organizes each of the studies that we used for citation purposes. This is a supplementary table in our manuscript.
Access information
Publicly accessible locations of the data:
Data was derived from the following sources:
Methods
This dataset was compiled by downloading and curating existing data from BioTIME and GloBI. The BioTIME data were filtered (see Figure 2 of the manuscript) by excluding biomass, marine, and aquatic surveys and aggregating abundance to genus level per plot. We subset the data to include only time series that contain 10+ consecutive overlapping years. For each genus, we calculated the log proportional change in abundance for each time step to remove temporal autocorrelation. We used GloBI to identify whether there are known interactions between each genus pair.
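The 10+ consecutive overlapping years criterion can be sketched as finding the longest run of consecutive years that two genera share. The Python example below is one possible reading of that filter and is not taken from the authors' scripts; the function name, the constant `MIN_YEARS`, and the example year ranges are assumptions.

```python
def longest_consecutive_overlap(years_a, years_b):
    """Length of the longest run of consecutive years present in both series."""
    shared = sorted(set(years_a) & set(years_b))
    best = run = 0
    prev = None
    for y in shared:
        # Extend the current run if this year directly follows the last one.
        run = run + 1 if prev is not None and y == prev + 1 else 1
        best = max(best, run)
        prev = y
    return best


MIN_YEARS = 10  # threshold stated in the Methods text

# Made-up sampling years for two genera in the same plot.
genus_a_years = range(1990, 2005)                                 # 1990..2004
genus_b_years = list(range(1993, 2000)) + list(range(2001, 2012))  # gap in 2000
overlap = longest_consecutive_overlap(genus_a_years, genus_b_years)
keep_pair = overlap >= MIN_YEARS
```

In this example the gap in 2000 splits the shared years into runs of 7 and 4, so the pair would be dropped under this reading of the filter.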