Serpentine plant community surveys (3 years) following a dispersal manipulation at McLaughlin Natural Reserve
Data files
Sep 09, 2025 version files (94.64 KB):
- data_clean.zip (20.11 KB)
- data_raw.zip (63.95 KB)
- README.md (10.59 KB)
Abstract
While investigating how historical contingencies manifest in ecological communities is common, research has yet to uncover how species distributions and abundances may be impacted by ecological processes interacting across spatial and temporal scales. We assessed the influence of a historical dispersal event on community assembly by simulating five scales of dispersal for 240 serpentine annual plant communities that experienced a large shift from drought to high rainfall conditions over three years. We collected, aggregated, and redistributed the aboveground seed bank from 30 sites across five nested spatial scales (~1 m, 5 m, 100 m, 5 km, and 10 km), and annually censused communities over a three-year period to identify change in community structure among dispersal scales and over time. We assessed the change in community composition and diversity that occurred over the study period by comparing contrasts of Bray-Curtis dissimilarity and net changes in species richness between a baseline control plot and each experimental dispersal treatment plot. We also determined how species turnover through time was affected by dispersal scale, including how the number of established, sink, and intermittent species found in experimental plots was impacted by our past dispersal manipulation. Finally, we examined species’ differences in response to environmental fluctuations (i.e., stress-tolerant vs. stress-intolerant) and whether dispersal enhances these differences to increase turnover in dominant species groups. Our one-time dispersal manipulation diversified local seed banks, thereby providing insurance against temporal variability and supporting community re-assembly toward a new compositional state when drought was lifted. We also found evidence that various forms of temporal lags directed community responses to environmental fluctuations, preventing rare species extirpations and providing subordinate species discrete windows of time to supplement their seed banks. This experiment underscores the importance of dispersal and dormancy for diversity maintenance in the face of a future climate in which the degree and magnitude of fluctuations are uncertain.
We have submitted the raw (data_raw.zip) and clean (data_clean.zip) data files and R scripts (Rmarkdown files) needed to perform the analyses described in our article: 'Experimentally enhancing dispersal reveals the outsized importance of transient dynamics in a fluctuating environment'. The article investigates the assembly trajectories of communities following a dispersal manipulation under the background of a changing climate (from drought to non-drought).
Data collection
In the summer of 2013, 30 serpentine grassland sites were chosen from the northwestern and southeastern ends of McLaughlin Natural Reserve in northern California. Each site was then divided into eight 0.75 m x 0.75 m plots that were separated by 1 m of bare ground and arranged in a 2 x 4 block, for a total of 240 plots (8 plots x 30 sites). Sites were chosen such that they could be hierarchically grouped at five spatial scales: at the level of a single plot (1 m), a block of plots at a single site (5 m), a group of sites (100 m), half of the reserve (5 km), and the whole reserve (10 km). After the winter species had senesced and most of the summer annuals had set seed, all standing vegetation and the loose layers of the seed bank were harvested using garden shears and a gas-powered leaf vacuum (Stihl BG86). Prior to processing, the collected material was left outdoors in paper bags to heat stratify for approximately six weeks.
In late August of 2013, each of the eight plots per site was randomly assigned to receive one of five dispersal treatments or one of three control treatments. Treatment plots simulated different spatial scales of dispersal, corresponding to our nested plot design, by pooling, homogenizing, and redistributing the collected material at five scales: from a single plot (1 m), multiple plots at a single site (5 m), five sites in a group (100 m), 15 sites within half of the reserve (5 km), and all 30 sites from across the entire reserve (10 km). Because of the pooling manipulation, the richness and composition of the species pool that each plot received differed across treatments: more species were introduced as dispersal scale increased, mimicking the effect of natural dispersal on species pools. Control plots were then created as follows: “C1” was vacuumed without replacement, to identify species that were left behind following vacuuming, “C2” was left unaltered to assess unintended effects of the seed collection procedure on diversity, and “C3” was vacuumed, homogenized, and redistributed to examine effects caused by the vacuuming procedure itself (compared to “C1”).
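The plot-level design described above (eight treatments, each used once per site, across 30 sites) can be sketched in a few lines. This is only an illustration in Python, not the authors' procedure: the treatment labels are borrowed from the colour-code list in this README, while the seed value, the plot numbering, and the function name assign_treatments are assumptions made for the example.

```python
import random

# Five dispersal scales plus three controls; one of each per site.
TREATMENTS = ["1m", "5m", "100m", "5km", "10km", "C1", "C2", "C3"]


def assign_treatments(n_sites=30, seed=2013):
    """Return {site: {plot: treatment}} with each treatment used once per site.

    The seed and the 1-8 plot numbering are illustrative assumptions.
    """
    rng = random.Random(seed)
    design = {}
    for site in range(1, n_sites + 1):
        shuffled = TREATMENTS[:]          # copy so the master list is untouched
        rng.shuffle(shuffled)             # random order within this site
        design[site] = {plot: trt for plot, trt in enumerate(shuffled, start=1)}
    return design


design = assign_treatments()
n_plots = sum(len(plots) for plots in design.values())
print(n_plots)  # 240 (8 plots x 30 sites)
```

Shuffling within each site, rather than across all 240 plots, is what guarantees that every site holds exactly one plot of each treatment.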
Description of the data & file structures
The datasets included in this repository are as follows:
Located in data_raw
- 2014_survey_RMG.csv, 2015_survey_RMG.csv, 2016_survey_RMG.csv: community survey data for 2014, 2015, 2016 (including both annual and perennial species)
- species_info.csv: list of known species that exist in the area and information about their growth forms, native status, and life history strategy.
- clim_decade.csv: monthly climate data for the years 2014-2016
- disp_mode_RMG.csv: database of species habitat affinities in McLaughlin Natural Reserve from Szojka and Germain 2024. Dispersing across habitat boundaries: Uncovering the demographic fates of populations in unsuitable habitat. Ecosphere 15:e4814. https://doi.org/10.1002/ecs2.4814
Located in data_clean
- affinities_vis.csv, beta_vis.csv, c2_cover_vis.csv, count_cld.csv, count_vis.csv, cover_vis.csv, decomp_vis.csv, jacc_beta_vis.csv, net_rich_vis.csv, pred_burn_vis.csv, rate_cld.csv, rate_vis.csv, rich_cld_yr.csv, rich_cld.csv, rich_vis_yr.csv, rich_vis.csv, temp_beta_vis.csv: model outputs generated in 03stats.Rmd that were used for plotting in 02data_vis.Rmd, where "_cld" indicates that the dataset contains statistical significance letters and "_vis" indicates that the dataset contains model outputs for plotting. All of these data frames can be generated using 03stats.Rmd, but are included here for ease of use.
Scripts
01data_clean.Rmd, 02data_vis.Rmd, & 03stats.Rmd: R Markdown documents containing the scripts for data cleaning and wrangling (01data_clean.Rmd), figure creation (02data_vis.Rmd), and statistical analyses (03stats.Rmd) for our article. 01data_clean.Rmd creates the primary data frames needed to run the statistical analyses in 03stats.Rmd and should always be run first (either manually or using the source_rmd function found in 02data_vis.Rmd and 03stats.Rmd).
Survey Data
Dataset overview:
The survey data were collected twice per year in 2014, 2015, and 2016 at McLaughlin Natural Reserve in northern California: once in early spring (April/May) and again in late summer (August) to confirm species identities. Surveys were conducted at the plot level, and species % cover was recorded. Species were categorized as either edge species (negative % cover values) or not (positive % cover values). Each plot can be identified by its site location (site 1-30) and dispersal distance/treatment, which were originally recorded according to the colour-coded organization system outlined below:
Dispersal treatment codes:
Black = C1 (vacuum w/out replacement);
Pink = C2 (unaltered);
Blue = C3 (vacuum w/out movement);
Green = 1m;
Yellow = 5m;
Orange = 100m;
Purple = 5km;
Red = 10km
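For readers who want treatment labels rather than colours, the code list above translates directly into a lookup table. The sketch below is illustrative Python, not part of the authors' R scripts. It assumes the treatment column stores the colour names as text; the exact spelling and capitalisation in the files have not been checked here, so the lookup is made case-insensitive. The names decode_treatment and is_control are hypothetical.

```python
# Colour code -> treatment label, copied from the list above.
COLOUR_TO_TREATMENT = {
    "black": "C1",    # vacuum w/out replacement
    "pink": "C2",     # unaltered
    "blue": "C3",     # vacuum w/out movement
    "green": "1m",
    "yellow": "5m",
    "orange": "100m",
    "purple": "5km",
    "red": "10km",
}

CONTROLS = {"C1", "C2", "C3"}


def decode_treatment(colour):
    """Translate a colour code into its treatment label (case-insensitive)."""
    key = colour.strip().lower()
    if key not in COLOUR_TO_TREATMENT:
        raise ValueError(f"Unknown treatment colour: {colour!r}")
    return COLOUR_TO_TREATMENT[key]


def is_control(treatment):
    """True for the three control treatments, False for dispersal scales."""
    return treatment in CONTROLS


print(decode_treatment("Orange"))            # 100m
print(is_control(decode_treatment("Pink")))  # True
```

Raising an error on an unrecognised colour, instead of returning a default, makes data-entry typos visible early.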
Data description: 2014_survey_RMG.csv.
The csv file contains the following columns:
site: identification number of experimental sites (1-30);
plot: identification number of experimental plots located within sites (1-8);
treatment: dispersal treatment assigned to plots (see colour code above);
species: abbreviated code representing identified species (see 'Species data' for codes);
cover: percentage of plot occupied by species (negative values = edge species, see above description)
Data description: 2015_survey_RMG.csv.
The csv file contains the following columns:
site: identification number of experimental sites (1-30);
plot: identification number of experimental plots located within sites (1-8);
treatment: dispersal treatment assigned to plots (see colour code above);
species: abbreviated code representing identified species (see 'Species data' for codes);
cover: percentage of plot occupied by species (negative values = edge species, see above description);
dataset: the time of year species were identified (see above description)
Data description: 2016_survey_RMG.csv.
The csv file contains the following columns:
site: identification number of experimental sites (1-30);
plot: identification number of experimental plots located within sites (1-8);
treatment: dispersal treatment assigned to plots (see colour code above);
species: abbreviated code representing identified species (see 'Species data' for codes);
cover: percentage of plot occupied by species (negative values = edge species, see above description)
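Two details of the survey files matter when combining years: only the 2015 file has a dataset column, and negative cover values mark edge species. One possible way to stack the files while handling both is sketched below in Python. The authors' own cleaning is in 01data_clean.Rmd and may differ. The inline sample rows, the species codes SPA and SPB, and the value "spring" are invented for illustration, and the sketch only flags edge records without interpreting the magnitude of a negative value.

```python
import csv
import io

# Hypothetical extracts standing in for the real CSV files.
SAMPLE_2014 = "site,plot,treatment,species,cover\n1,1,Green,SPA,12\n1,1,Green,SPB,-1\n"
SAMPLE_2015 = "site,plot,treatment,species,cover,dataset\n1,1,Green,SPA,20,spring\n"


def read_survey(text, year):
    """Parse one survey file and add a year column and an edge-species flag."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        cover = float(rec["cover"])
        rows.append({
            "year": year,
            "site": int(rec["site"]),
            "plot": int(rec["plot"]),
            "treatment": rec["treatment"],
            "species": rec["species"],
            "cover": cover,                 # raw value, sign preserved
            "is_edge": cover < 0,           # negative cover = edge species
            "dataset": rec.get("dataset"),  # None where the column is absent
        })
    return rows


surveys = read_survey(SAMPLE_2014, 2014) + read_survey(SAMPLE_2015, 2015)
edge_records = [r for r in surveys if r["is_edge"]]
print(len(surveys), len(edge_records))  # 3 1
```

With the real files, the same function could be applied to text read from each CSV in data_raw, passing the survey year explicitly since it is not stored as a column.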
Species Data
Dataset overview:
The species list was collated from a local ID book, based on the species observed in the reserve over the study period. Information regarding growth form, regional status, and family was compiled using the CalFlora Plant Database (www.calflora.org).
Data description: species_info.csv.
The csv file contains the following columns:
colnames(species.df.pa): abbreviated code representing identified species;
growth.form: plant life history strategy where A = annual & P = perennial;
latin.name: species latin name;
status: regional species status where n = native & e = exotic;
family: general plant classification where g = grass & f = forb
Climate Data
Dataset overview:
Climate data was retrieved from the Western Regional Climate Center online database (https://wrcc.dri.edu), accessed through the UC Davis natural reserves website. The climate data was collected at the RAWS Knoxville Creek Weather Station near the McLaughlin field station. The data was generated in a monthly summary time series report from January 2012 to December 2022.
Data description: clim_decade.csv.
The csv file contains the following columns:
year: calendar year;
month: calendar month encoded numerically (e.g., 1 = January);
av_dailymax_temp_F: average maximum daily air temperature (degrees Fahrenheit);
av_dailymin_temp_F: average minimum daily air temperature (degrees Fahrenheit);
dailyav_temp_F: average daily air temperature (degrees Fahrenheit);
precip_in: total monthly precipitation (inches)
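Because the climate columns are in imperial units, users working in metric may want to convert them. A minimal Python sketch follows, using the standard conversions (°C = (°F - 32) x 5/9; 1 in = 25.4 mm). The input column names come from the list above; the output names ending in _C and _mm, the function names, and the example row values are assumptions for illustration only.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def inches_to_mm(precip_in):
    """Convert inches to millimetres (1 in = 25.4 mm exactly)."""
    return precip_in * 25.4


def to_metric(row):
    """Convert one clim_decade.csv record (dict of strings) to metric units."""
    return {
        "year": int(row["year"]),
        "month": int(row["month"]),
        "av_dailymax_temp_C": fahrenheit_to_celsius(float(row["av_dailymax_temp_F"])),
        "av_dailymin_temp_C": fahrenheit_to_celsius(float(row["av_dailymin_temp_F"])),
        "dailyav_temp_C": fahrenheit_to_celsius(float(row["dailyav_temp_F"])),
        "precip_mm": inches_to_mm(float(row["precip_in"])),
    }


# Invented values, not taken from the dataset.
example = {
    "year": "2014", "month": "1",
    "av_dailymax_temp_F": "59.0", "av_dailymin_temp_F": "41.0",
    "dailyav_temp_F": "50.0", "precip_in": "2.0",
}
metric = to_metric(example)
```

The sketch assumes every field is present and numeric; any missing-value markers in the real file would need to be handled before conversion.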
Habitat affinities
Dataset overview:
Species habitat affinities were assigned by Szojka and Germain (2024) using a 7-year historical dataset of species' cover in serpentine and non-serpentine sites at McLaughlin Natural Reserve (80 sites total, courtesy of S. Harrison). For more information on how habitat affinities were assigned, see 'Assignment of species affinities' in the Supplementary Information associated with our article. Please contact S. Harrison with inquiries about the raw dataset that was used to construct species habitat affinities.
Data description: disp_mode_RMG.csv.
The csv file contains the following columns:
species: abbreviated code representing identified species;
status_sh: habitat assignment where m = matrix (non-serpentine), p = patch (serpentine), g = generalist (both serpentine & non-serpentine), and r = rare (habitat unassigned)
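Since disp_mode_RMG.csv and the survey files share the species code, habitat affinities can be attached to survey records with a simple lookup. The Python sketch below illustrates one way to do this; it is not the authors' method, which is implemented in R. The species codes SPA, SPB, and SPC, the function name attach_affinity, and the added habitat field are hypothetical.

```python
# Code -> description, copied from the status_sh definition above.
HABITAT_LABELS = {
    "m": "matrix (non-serpentine)",
    "p": "patch (serpentine)",
    "g": "generalist (both serpentine & non-serpentine)",
    "r": "rare (habitat unassigned)",
}


def attach_affinity(survey_rows, affinity_by_species):
    """Add status_sh and a readable habitat label to each survey record.

    Species missing from the affinity table get None for both fields.
    """
    out = []
    for row in survey_rows:
        code = affinity_by_species.get(row["species"])
        out.append({**row, "status_sh": code, "habitat": HABITAT_LABELS.get(code)})
    return out


# Invented example data.
affinities = {"SPA": "p", "SPB": "m"}
records = [{"species": "SPA"}, {"species": "SPB"}, {"species": "SPC"}]
joined = attach_affinity(records, affinities)
print(joined[0]["habitat"])  # patch (serpentine)
```

Keeping unmatched species as None, rather than dropping them, preserves the full survey record so that gaps in the affinity table remain visible.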
The raw data files have all been cleaned and wrangled to create the data frames used for generating the figures and statistical analyses in our article 'Experimentally enhancing dispersal reveals the outsized importance of transient dynamics in a fluctuating environment' (see 01data_clean.Rmd). All figures for publication can be found in 02data_vis.Rmd, which requires all raw data files, cleaned data files, and 01data_clean.Rmd to run successfully. Note: all plots generated for publication in 02data_vis.Rmd were tidied for visual clarity and aesthetics only (e.g., orientation and placement of plots/legends/statistical significance indicators, addition of cartoon icons, etc.) using Affinity Designer. Full descriptions of the data frames used in each analysis are given in 03stats.Rmd. To run the code in 03stats.Rmd, all raw data files and 01data_clean.Rmd are required.
