Data from: Reconstructing 120 years of climate change impacts on Joshua tree flowering

Cite this dataset

Yoder, Jeremy (2024). Data from: Reconstructing 120 years of climate change impacts on Joshua tree flowering [Dataset]. Dryad. https://doi.org/10.5061/dryad.9kd51c5rr

Abstract

Quantifying how global change impacts wild populations remains challenging, especially for species poorly represented by systematic datasets. Here, we infer climate change effects on masting by Joshua trees (Yucca brevifolia and Y. jaegeriana), keystone perennials of the Mojave Desert, from 15 years of crowdsourced observations. We annotated phenophase in 10,212 geo-referenced images of Joshua trees on the iNaturalist crowdsourcing platform, and used them to train machine learning models predicting flowering from annual weather records. Hindcasting to 1900 with a trained model successfully recovers flowering events in independent historical records, and reveals slightly rising frequency of conditions supporting flowering since the early 20th Century. This reflects increased variation in annual precipitation, which drives masting events in wet years — but also increasing temperatures and drought stress, which may have net negative impacts on recruitment. Our findings reaffirm the value of crowdsourcing for understanding climate change impacts on biodiversity.

README: Studying Joshua tree flowering with crowd-sourced observations

Readme updated 13 June 2024.

Project description

This repo contains code to (1) use the iNaturalist API to download species observations based on phenology annotations, by modifying code from the rinat package, and (2) model the relationship between flowering and weather using spatially interpolated records from PRISM and Bayesian additive regression tree methods implemented in dbarts with utilities from embarcadero.

The code is provided as supporting data for

Yoder JB, AK Andrade, LA Defalco, TC Esque, CJ Carlson, DF Shryock, R Yeager, and CI Smith. 2024. Reconstructing 120 years of climate change impacts on Joshua tree flowering. Ecology Letters.

Contents

Subfolder names and contents:

  • data --- created programmatically to store raw data, as output by the inat_phenology_download.R script, PRISM climate layers, and validation records; not retained in this repository
  • protocol_manual --- a PDF document describing our protocol for adding phenology annotations to the iNaturalist database, with supporting Markdown and image files, suitable for compilation using Pandoc
  • scripts --- all project scripts, all in R
    • get_inat.R --- script to load rinat and modify the get_inat function to allow searches on phenology state annotation.
    • inat_phenology_download.R --- script to use get_inat.R to download phenology-annotated observations using the iNaturalist API
    • PRISM_data-management.R --- downloads monthly climate data by year, summarizes to quarterly, normalizes to 1981-2010, and crops to the Mojave extent to build a repository for downstream work
    • inat_phenology_data-management.R --- organization of data output from inat_phenology_download.R, and pairing with PRISM data with functionality from the prism package.
    • phenology_modeling.R --- modeling annualized, rasterized observations of flowering (or no flowering) predicted with weather data using Bayesian additive regression tree (BART) methods
    • phenology_prediction.R --- fits random intercept (RI) models based on the best fit from phenology_modeling.R, with year as the random effect, and uses them to predict flowering from PRISM data for 1900-present
    • historic_flowering_analysis.R --- analysis of the historic flowering predictions from phenology_prediction.R
    • validation_records_analysis.R --- compares historic flowering predictions to independent validation records
  • output --- processed data products, modeling results, analysis and figures. See below for annotation of columns in data product files
    • flowering_obs_rasterized.csv --- flowering observations, rendered binary (flowering/no) and rasterized to the 4km PRISM resolution, output from inat_phenology_data-management.R
    • flowering_obs_climate.csv --- rasterized binary flowering observations merged with predictors generated from PRISM weather data, and annotated for Joshua tree species (eastern Yucca jaegeriana or western Yucca brevifolia), output from inat_phenology_data-management.R --- this is the basis for model training
    • expert_obs_validations.csv and news_reports_prFL.tsv --- independent records validating model predictions
    • predicted flowering from the hindcast models for 4-km grid cells in the range of Joshua tree:
      • historic_flowering_reconst_jotr.csv --- for non-RI and RI models trained on data from eastern and western Joshua trees;
      • historic_flowering_reconst_YUJA.csv and historic_flowering_reconst_YUBR.csv --- for models trained on either eastern (YUJA) or western (YUBR) data alone;
    • jotr_reconstructed_flowering_years.csv and jotr_flowering_predictors_change.csv --- reconstructed flowering years in early (1900-1929) and recent (1990-2019) 30-year periods, changes in flowering years between those periods, and changes in weather predictors of flowering between those periods, in each grid cell covered by the historic flowering reconstructions;
    • BART --- saved BART models and prediction layers
      • spartials --- raster-stack data layers (paired .gri and .grd files) giving spatial partial effects of the six model predictors for each year 1900 to 2023
      • predictions --- raster-stack data layers (paired .gri and .grd files) giving predicted probability of flowering masked to the range of the two Joshua tree species, or unmasked (_nomask), for each year of 1900 to 2023, for different models as follows:
        • jotr_BART --- "base" model, trained on data from both species
        • jotr_BART_RI --- RI (random intercept) model, trained on data from both species
        • YUJA_BART --- base model, trained on data from YUJA (Yucca jaegeriana, eastern Joshua tree) alone
        • YUBR_BART --- base model, trained on data from YUBR (Yucca brevifolia, western Joshua tree) alone
      • models --- R data format (.rds) files for trained models, the variable importance analysis (bart.varim.Jotr.rds), and year-year leave-one-out analyses for the "base" and RI models trained on data from both species (year-year-LOO.csv and RI_year-year-LOO.csv), all products of the phenology_modeling.R script, which explains their specific contents
    • figures --- figures output by analysis of data in the output folder; all are in the paper or SI, with captions provided in those documents
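To make "rendered binary and rasterized" concrete for flowering_obs_rasterized.csv, here is a minimal base-R sketch. It is not the code in inat_phenology_data-management.R, which may work differently. The helper name rasterize_flowering and the toy observations are hypothetical. The sketch assumes a regular longitude/latitude grid with 2.5 arc-minute cells (the nominal spacing of PRISM "4km" layers) aligned to the origin, and it assumes that a cell-year counts as flowering if any observation in it shows flowering.

```r
# Hypothetical helper: snap point observations to a regular grid and
# collapse them to one binary flowering record per grid cell and year.
# ASSUMPTIONS: 2.5 arc-minute cells aligned to (0, 0); a cell-year is
# scored TRUE if ANY observation in it shows flowering.
rasterize_flowering <- function(obs, cell = 2.5 / 60) {
  # Replace each coordinate with the center of the cell containing it
  obs$lon <- (floor(obs$lon / cell) + 0.5) * cell
  obs$lat <- (floor(obs$lat / cell) + 0.5) * cell
  # One row per cell-year; any() is TRUE if at least one record flowered
  out <- aggregate(flr ~ lon + lat + year, data = obs, FUN = any)
  out[order(out$year, out$lon, out$lat), ]
}

# Made-up observations: two in the same cell in 2019, one elsewhere,
# and one in the first cell again in 2020
obs <- data.frame(
  lon  = c(-116.001, -116.002, -115.510, -116.001),
  lat  = c(  34.001,   34.002,   35.200,   34.001),
  year = c(2019, 2019, 2019, 2020),
  flr  = c(FALSE, TRUE, FALSE, FALSE)
)
cells <- rasterize_flowering(obs)
print(cells)
```

The two 2019 observations that share a cell collapse into a single flowering record, so the result has three rows. A raster package would normally handle the grid alignment; the arithmetic here only illustrates the idea.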

Usage

Some of these steps can be skipped once data is downloaded/organized, but to perform the full analysis for the first time, use the scripts in this order:

  1. First use inat_phenology_download.R to download all of the iNaturalist observations with flowering status annotated (this script sources get_inat.R);
  2. Then use PRISM_data-management.R to download spatially interpolated weather data at 4km resolution and process it into quarterly aggregates, then into the composite ;
  3. Then, use inat_phenology_data-management.R to match iNat observations to weather results for the years leading up to each observation;
  4. Finally, use phenology_modeling.R to build a BART model predicting flowering status from the weather data stored in output/flowering_obs_climate_normed.csv; then use phenology_prediction.R with the resulting model to predict flowering in years for which we have weather data but no iNaturalist observations;
  5. Analysis of the predicted/hindcast flowering is conducted with code in historic_flowering_analysis.R, and the predictions are compared to independent validation records in validation_records_analysis.R.
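Step 3 pairs each observation with weather from the flowering year and the years before it. The base-R sketch below shows one way the Y0, Y1, Y2, Y0Y1, and Y1Y2 predictors documented for flowering_obs_climate_subsp.csv could be derived for a single location and weather variable. It is an illustration, not the project's code. The helper name lagged_predictors is hypothetical, the precipitation values are made up, and building column names by pasting the variable name onto the suffix (for example pptY0Y1) is an assumption about the naming scheme. The annual values stand in for whatever aggregate the scripts actually use.

```r
# Hypothetical helper: lagged predictors for one location and one variable.
# `annual` is a numeric vector of annual values named by year.
# ASSUMPTION: output names are the variable name pasted onto the suffix.
lagged_predictors <- function(annual, obs_year, var = "ppt") {
  yrs <- as.character(obs_year - 0:2)   # flowering year, then 1 and 2 years prior
  missing_yrs <- setdiff(yrs, names(annual))
  if (length(missing_yrs) > 0) {
    stop("missing weather for: ", paste(missing_yrs, collapse = ", "))
  }
  y <- unname(annual[yrs])
  out <- c(Y0 = y[1], Y1 = y[2], Y2 = y[3],
           Y0Y1 = y[1] - y[2],          # difference, Y0 - Y1
           Y1Y2 = y[2] - y[3])          # difference, Y1 - Y2
  names(out) <- paste0(var, names(out))
  out
}

# Made-up annual precipitation totals, in mm
ppt <- c("2017" = 120, "2018" = 80, "2019" = 210)
preds <- lagged_predictors(ppt, obs_year = 2019)
print(preds)
```

With these toy values, a wet year following a dry one gives a large positive pptY0Y1, which is the kind of year-to-year contrast the difference predictors are meant to capture.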

Contents of data product files in output

  • flowering_obs_rasterized_subsp.csv --- binary records of flowering activity by flowering year and location, rasterized to the 4km grid of PRISM data layers
    • lon: longitude, in decimal degrees
    • lat: latitude, in decimal degrees
    • type: Joshua tree species, either YUJA (Yucca jaegeriana, eastern Joshua tree) or YUBR (Yucca brevifolia, western Joshua tree)
    • year: flowering year observed
    • flr: whether or not evidence of flowering (buds, flowers, or fruit) is recorded for the given location and year
  • flowering_obs_climate_subsp.csv --- records as in flowering_obs_rasterized_subsp.csv with weather predictor values for the same locations, derived from the PRISM database. Columns in common with flowering_obs_rasterized_subsp.csv are as given above; weather predictor columns use the following naming conventions:
    • ppt: precipitation, in mm
    • vpdmin and vpdmax: annual minimum and maximum vapor pressure deficit, in hPa
    • tmin and tmax: annual minimum and maximum temperature, in degrees Celsius
    • Y0: predictor value for the year in which flowering activity is observed
    • Y1: predictor value for the year one year prior to observation
    • Y2: predictor value for the year two years prior to observation
    • Y0Y1: predictor value difference, Y0 - Y1
    • Y1Y2: predictor value difference, Y1 - Y2
  • derived_predictors_jtrange.csv --- values for the set of weather predictors in jotr_reconstructed_flowering_years.csv, for each 4km grid cell in the range of the two Joshua tree species, in each flowering year from 1900 to 2022; column names are as in other data product files
  • historic_flowering_reconst_jotr.csv, historic_flowering_reconst_YUJA.csv, and historic_flowering_reconst_YUBR.csv --- modeled historic flowering activity for grid cells within the range of both Joshua tree species (_jotr) or of each species individually (_YUJA and _YUBR):
    • lon: longitude, in decimal degrees
    • lat: latitude, in decimal degrees
    • year: flowering year predicted
    • prFL: probability of flowering predicted by the "base" model (without a RI effect); note that the best-power cutoff for binary classification is prFL >= 0.26
    • RI.prFL: probability of flowering predicted by the RI model; best-power cutoff for binary classification is RI.prFL >= 0.25
  • jotr_reconstructed_flowering_years.csv --- flowering years predicted by the "base" and RI models in each 4km grid cell across the range of both Joshua tree species, over the specified time periods
    • lon: longitude, in decimal degrees
    • lat: latitude, in decimal degrees
    • flyrs_all: total flowering years predicted by the base model, 1900-2022
    • flyrs_1900_1929: total flowering years predicted by the base model, 1900-1929
    • flyrs_1990_2019: total flowering years predicted by the base model, 1990-2019
    • flyrs_RI_all: total flowering years predicted by the RI model, 1900-2022
    • flyrs_RI_1900_1929: total flowering years predicted by the RI model, 1900-1929
    • flyrs_RI_1990_2019: total flowering years predicted by the RI model, 1990-2019
    • flyrs_change: difference in flowering years between the recent (1990-2019) and early (1900-1929) periods, as predicted by the base model; positive values indicate more flowering years in the recent period
    • flyrs_RI_change: difference in flowering years between the recent (1990-2019) and early (1900-1929) periods, as predicted by the RI model; positive values indicate more flowering years in the recent period
  • jotr_flowering_predictors_change.csv --- recent (1990-2019) versus early (1900-1929) flowering years predicted by the base and RI models, compared to change in median weather predictor values for the same locations and periods
    • lon: longitude, in decimal degrees
    • lat: latitude, in decimal degrees
    • timeframe: either early (from_1900_1929) or recent (from_1990_2019)
    • flyrs: number of flowering years predicted for the given location and timeframe
    • ri.model: whether flowering years are as predicted by the base (FALSE) or RI (TRUE) model
    • predictor: weather predictor identity from the set in jotr_reconstructed_flowering_years.csv, formatted for figure production
    • pred_value: median value for the given weather predictor in the given location and timeframe
  • expert_obs_validations.csv --- formal records of flowering activity derived from field notes, herbarium records, and other published datasets, for use in model validation
    • lon: longitude, in decimal degrees
    • lat: latitude, in decimal degrees
    • location: locality identification or description given by the data source
    • type: Joshua tree species, either YUJA (Yucca jaegeriana, eastern Joshua tree) or YUBR (Yucca brevifolia, western Joshua tree)
    • year: flowering year of observation
    • obs_by: data source
    • obs_flowers, obs_fruit, obs_moths, and obs_no_flowers: whether or not the source records observations of flowers, fruits, pollinating yucca moths, or no evidence of flowering
    • flr: binary classification of source observations into evidence of flowering (flowers or fruits recorded) or no flowering
    • jotr_prFlr, jotr_RI.prFlr, YUBR_prFlr, and YUJA_prFlr: probability of flowering for the given location and year, predicted by the base model or RI model trained on data from both species (jotr); or non-RI models trained on data from each species individually (YUJA and YUBR)
    • jotr_Flr, jotr_RI.Flr, YUBR_Flr, and YUJA_Flr: binary classifications of flowering activity for the given location and year, from predictions by the base model or RI model trained on data from both species (jotr); or non-RI models trained on data from each species individually (YUJA and YUBR)
  • news_reports_prFL.tsv --- summary information from historic newspaper accounts of flowering activity
    • year: flowering year of the account
    • location: location described in the account
    • rpt_by: newspaper source for the account
    • rpt_date: date of publication for the account, yyyy-mm-dd format
    • flr: whether intense flowering is reported (TRUE) or poor flowering is reported (FALSE)
    • qt: excerpt of article text supporting the classification in flr
    • prFL.loc: probability of flowering predicted by the base model in the flowering year described by the report, averaged within a polygon for the specified location
    • prFL.all: probability of flowering predicted by the base model in the flowering year described by the report, averaged over the full range of the two Joshua tree species
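As a hedged example of working with the reconstruction files, the base-R sketch below applies the best-power cutoff noted above for the base model (prFL >= 0.26) and counts predicted flowering years per grid cell in the early and recent periods. It then takes their difference, analogous to flyrs_1900_1929, flyrs_1990_2019, and flyrs_change. A toy data frame with made-up values stands in for historic_flowering_reconst_jotr.csv. The helper name count_flowering_years is hypothetical, and historic_flowering_analysis.R may compute these quantities differently.

```r
# Toy stand-in for historic_flowering_reconst_jotr.csv (values are made up).
# With the real file, something like:
#   recon <- read.csv("output/historic_flowering_reconst_jotr.csv")
recon <- data.frame(
  lon  = rep(c(-116.02, -115.52), each = 4),
  lat  = rep(c(34.02, 35.23), each = 4),
  year = rep(c(1905, 1910, 1995, 2005), times = 2),
  prFL = c(0.30, 0.10, 0.40, 0.26,
           0.05, 0.27, 0.10, 0.20)
)

# Hypothetical helper: predicted flowering years per grid cell in a span of
# years, scoring a cell-year as flowering when the probability meets the cutoff.
count_flowering_years <- function(d, from, to, col = "prFL", cutoff = 0.26) {
  d <- d[d$year >= from & d$year <= to, ]
  d$flowering <- d[[col]] >= cutoff
  out <- aggregate(flowering ~ lon + lat, data = d, FUN = sum)
  names(out)[3] <- paste("flyrs", from, to, sep = "_")
  out
}

early  <- count_flowering_years(recon, 1900, 1929)
recent <- count_flowering_years(recon, 1990, 2019)
flyrs  <- merge(early, recent, by = c("lon", "lat"))
# Positive values mean more predicted flowering years in the recent period
flyrs$flyrs_change <- flyrs$flyrs_1990_2019 - flyrs$flyrs_1900_1929
print(flyrs)
```

For the RI model, the same helper could be called with col = "RI.prFL" and cutoff = 0.25, matching the cutoff given for that column.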

Funding

National Science Foundation, Award: 2001190, DEB

National Science Foundation, Award: 2001180, DEB

California State University, Northridge