Assessing seasonal richness of active flowers throughout UC Reserve sites in the 20th Century
Abstract
Plant species are well documented to alter both the timing and duration of their flowering in response to changing climate. Plant species often exhibit different magnitudes or directions of phenological responses to climate changes from each other. These shifts may have cumulative effects on the diversity of species in flower throughout a given flowering season, resulting in periods of high or low species richness of actively flowering community members that differ from those that occurred under historical conditions.
In this study we model the effects of warming throughout the past century on the daily species richness of actively flowering species by developing species-specific phenoclimate models for 1,848 plant species documented to inhabit 16 well documented plant communities across California. These communities encompassed a variety of distinct vegetation types, ranging from coastal marshes and grasslands to chaparral shrublands and mountainous conifer forests.
By examining consistent patterns in the resultant modeled community-level flowering displays, we demonstrate that recent warming is likely to have consistently shortened the period in which many species flower concurrently, and that the bloom season has advanced by nearly 5 days on average. Accordingly, within every flora, recent warming was predicted to increase the daily species richness of active flowers early in the local growing season, with corresponding reductions in species richness of active flowers later in the growing season. Notably, patterns of change in community-level bloom displays were driven primarily by differences among species in the timing of flowering onset, as termination dates tended to advance in unison with onset dates, resulting in minor changes to flowering duration among species.
This README.txt file was generated on 2024-08-01 by Isaac Park
GENERAL INFORMATION
1. Title of Dataset: Community-Level flowering phenology patterns among University of California Reserves
2. Author Information: Isaac W. Park, Department of Biology, Georgia Southern University; Tadeo Ramirez-Parada, Department of Ecology, Evolution, and Marine Biology, University of California Santa Barbara; Susan J. Mazer, Department of Ecology, Evolution, and Marine Biology, University of California Santa Barbara
3. Date of data collection: 10/10/2023
4. Geographic location of data collection: California, USA, models trained on data from throughout North America
5. Development of these datasets and the associated Python code was supported by the National Science Foundation
SHARING/ACCESS INFORMATION
1. Licenses/restrictions placed on the data: Public Domain
2. Links to publications that cite or use the data:
3. Links to other publicly accessible locations of the data:
4. Links/relationships to ancillary data sets: Raw observation locations used to underpin simulations in this data can be acquired from USANPN.org. Climate data is available through https://www.prism.oregonstate.edu/. Observational data used by this study is available through data supplements associated with this Dryad submission as Herbarium_AllData_Core_noDups.csv. This dataset represents raw herbarium specimens collected from all contributing herbaria, with additional data added through the processing described in the attached software (as hosted at https://doi.org/10.25349/D9WP6S), and after duplicate specimens have been removed. Notably, the added data include the corrected year and DOY of each collection as well as binary presence/absence for all phenological phases of interest. These data include all identification fields needed to re-acquire original specimen records from contributing herbaria. While additional data fields were added (as listed above), no data columns present in the raw data (as downloaded from contributing herbaria) were edited in any way. This dataset should be placed within the Data/HerbData folder when using the code associated with this depository, but is stored separately from other data files due to differences in licensing.
Number of variables: 365
Number of cases/rows: 2314813
Variable List: Variable names for columns 7:39 are Darwin Core terms. We provide the definition for each term in the README document. These definitions come directly from the Darwin Core website (https://dwc.tdwg.org/list/#2-use-of-terms). Note: All columns of the original Darwin Core data were preserved, even if supplanted by modified data columns (most notably those fields associated with the taxonomic identification of each specimen or the DOY and year of its collection). In such cases, we recommend ignoring the original Darwin Core data fields in favor of the modified data fields, listed below.
DOY_Rect: Rectified day of year (DOY) on which the specimen was recorded to have been collected
Year_Rect: Rectified year in which the specimen was recorded to have been collected
bud: determination of whether the specimen was recorded as being in bud. Values of 1 indicate presence of buds, values of 0 indicate no documentation of buds.
flower: determination of whether the specimen was recorded as being in flower. Values of 1 indicate presence of flowers, values of 0 indicate no documentation of flowers.
fruit: determination of whether the specimen was recorded as being in fruit. Values of 1 indicate presence of fruits, values of 0 indicate no documentation of fruits.
strobilus: determination of whether the specimen was recorded as being in strobilus. Values of 1 indicate presence of strobili, values of 0 indicate no documentation of strobili.
cone: determination of whether the specimen was recorded as bearing cones. Values of 1 indicate presence of seed or pollen cones, values of 0 indicate no documentation of cones.
fertile: determination of whether the specimen was recorded as being fertile. Values of 1 indicate specimen was collected when fertile (most commonly this occurs in graminoids, as well as some spore-bearing plants), values of 0 indicate no record of the specimen being fertile when collected.
phenocolumn_sum: marker field indicating that one or more of the aforementioned phenological statuses was positive.
Accepted_name: Genus, species, and subspecies/variety (if applicable) of specimen according to a standardized taxonomic schema using the Taxonomic Name Resolution Service (tnrs.biendata.org).
Accepted_species: Genus and species of specimen according to a standardized taxonomic schema using the Taxonomic Name Resolution Service (tnrs.biendata.org).
Accepted_name_author: Author of species identification of specimen's species according to a standardized taxonomic schema using the Taxonomic Name Resolution Service (tnrs.biendata.org).
Accepted_name_rank: taxonomic rank to which specimen was identified according to a standardized taxonomic schema using the Taxonomic Name Resolution Service (tnrs.biendata.org).
Name_matched_accepted_family: family to which specimen was identified according to a standardized taxonomic schema using the Taxonomic Name Resolution Service (tnrs.biendata.org).
5. Was data derived from another source? Yes. Herbarium data was derived from previously processed herbarium data produced by Park et al. and archived at https://doi.org/10.25349/D9WP6S. Climate data is available through https://climatena.ca/
6. Recommended citation for this dataset: Park (2024), Community flowering phenology of UC Reserves, Dryad, Dataset
DATA & FILE OVERVIEW
1. File List:
Universal Key to common data fields:
Climate Data: all data fields recording climate data are derived from ClimateNA output, and follow ClimateNA rubrics for naming and abbreviation of climate data (see https://ClimateNA.ca). In all cases, climate data field names consist of a text string (described below) identifying the aspect of climate being recorded, followed by a suffix identifying either the month being described (a two-digit number) or the season being described (_wt for January-March, _sp for April-June, _sm for July-September, or _at for October-December). Some parameters represent annual conditions and therefore carry no suffix (described below). Note that all climate data produced by ClimateNA were included for completeness, although only temperature and precipitation were used in this analysis.
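The naming convention described above can be decoded programmatically. The following is a minimal sketch, assuming field names follow the patterns listed in this key; the function name split_climate_field is hypothetical and is not part of the code associated with this dataset.

```python
import re

# Seasonal suffixes as described in the key above.
SEASONS = {
    "wt": "January-March",
    "sp": "April-June",
    "sm": "July-September",
    "at": "October-December",
}

def split_climate_field(name):
    """Split a ClimateNA-style field name into (variable, period).

    period is 'annual', a season label, or a month number (1-12).
    Heuristic only: a trailing 01-12 is read as a month, so annual
    names such as DD_18 or DD1040 are returned unchanged.
    """
    season = re.fullmatch(r"(.+)_(wt|sp|sm|at)", name)
    if season:
        return season.group(1), SEASONS[season.group(2)]
    month = re.fullmatch(r"(.+?)_?(0[1-9]|1[0-2])", name)
    if month:
        return month.group(1), int(month.group(2))
    return name, "annual"
```

Because degree-day names such as DD_18 end in digits, the month pattern is restricted to 01 through 12 rather than any two digits.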
Annual variables:
MAT mean annual temperature (°C),
MWMT mean warmest month temperature (°C),
MCMT mean coldest month temperature (°C),
TD temperature difference between MWMT and MCMT, or continentality (°C),
MAP mean annual precipitation (mm),
MSP May to September precipitation (mm),
AHM annual heat-moisture index ((MAT+10)/(MAP/1000))
SHM summer heat-moisture index ((MWMT)/(MSP/1000))
DD_0 (DD<0) degree-days below 0°C, chilling degree-days
DD5 (DD>5) degree-days above 5°C, growing degree-days
DD_18 (DD<18) degree-days below 18°C, heating degree-days
DD18 (DD>18) degree-days above 18°C, cooling degree-days
NFFD the number of frost-free days
FFP frost-free period
bFFP the day of the year on which FFP begins
eFFP the day of the year on which FFP ends
PAS precipitation as snow (mm). For individual years, it covers the period between August in the previous year and July in the current year.
EMT extreme minimum temperature over 30 years
EXT extreme maximum temperature over 30 years
Eref Hargreaves reference evaporation (mm)
CMD Hargreaves climatic moisture deficit (mm)
MAR mean annual solar radiation (MJ m‐2 d‐1 )
RH mean annual relative humidity (%)
CMI Hogg’s climate moisture index (mm)
DD1040 (10<DD<40) degree-days above 10°C and below 40°C
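The two heat-moisture indices in the list above are simple ratios of temperature to precipitation. As a minimal sketch of the formulas given for AHM and SHM (the function names and the example values are illustrative, not taken from the dataset):

```python
def annual_heat_moisture(mat_c, map_mm):
    """AHM = (MAT + 10) / (MAP / 1000); undefined when MAP is zero."""
    return (mat_c + 10.0) / (map_mm / 1000.0)

def summer_heat_moisture(mwmt_c, msp_mm):
    """SHM = MWMT / (MSP / 1000); undefined when MSP is zero."""
    return mwmt_c / (msp_mm / 1000.0)

# Made-up site: MAT 15 °C, MAP 500 mm, MWMT 22 °C, MSP 40 mm
ahm = annual_heat_moisture(15.0, 500.0)
shm = summer_heat_moisture(22.0, 40.0)
```

Larger values of either index indicate warmer and drier conditions.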
Seasonal variables:
Tave_wt winter mean temperature (°C)
Tave_sp spring mean temperature (°C)
Tave_sm summer mean temperature (°C)
Tave_at autumn mean temperature (°C)
Tmax_wt winter mean maximum temperature (°C)
Tmax_sp spring mean maximum temperature (°C)
Tmax_sm summer mean maximum temperature (°C)
Tmax_at autumn mean maximum temperature (°C)
Tmin_wt winter mean minimum temperature (°C)
Tmin_sp spring mean minimum temperature (°C)
Tmin_sm summer mean minimum temperature (°C)
Tmin_at autumn mean minimum temperature (°C)
PPT_wt winter precipitation (mm)
PPT_sp spring precipitation (mm)
PPT_sm summer precipitation (mm)
PPT_at autumn precipitation (mm)
RAD_wt winter solar radiation (MJ m‐2 d‐1 )
RAD_sp spring solar radiation (MJ m‐2 d‐1 )
RAD_sm summer solar radiation (MJ m‐2 d‐1 )
RAD_at autumn solar radiation (MJ m‐2 d‐1 )
Derived seasonal variables:
DD_0_wt winter degree-days below 0°C
DD_0_sp spring degree-days below 0°C
DD_0_sm summer degree-days below 0°C
DD_0_at autumn degree-days below 0°C
DD5_wt winter degree-days above 5°C
DD5_sp spring degree-days above 5°C
DD5_sm summer degree-days above 5°C
DD5_at autumn degree-days above 5°C
DD_18_wt winter degree-days below 18°C
DD_18_sp spring degree-days below 18°C
DD_18_sm summer degree-days below 18°C
DD_18_at autumn degree-days below 18°C
DD18_wt winter degree-days above 18°C
DD18_sp spring degree-days above 18°C
DD18_sm summer degree-days above 18°C
DD18_at autumn degree-days above 18°C
NFFD_wt winter number of frost-free days
NFFD_sp spring number of frost-free days
NFFD_sm summer number of frost-free days
NFFD_at autumn number of frost-free days
PAS_wt winter precipitation as snow (mm)
PAS_sp spring precipitation as snow (mm)
PAS_sm summer precipitation as snow (mm)
PAS_at autumn precipitation as snow (mm)
Eref_wt winter Hargreaves reference evaporation (mm)
Eref_sp spring Hargreaves reference evaporation (mm)
Eref_sm summer Hargreaves reference evaporation (mm)
Eref_at autumn Hargreaves reference evaporation (mm)
CMD_wt winter Hargreaves climatic moisture deficit (mm)
CMD_sp spring Hargreaves climatic moisture deficit (mm)
CMD_sm summer Hargreaves climatic moisture deficit (mm)
CMD_at autumn Hargreaves climatic moisture deficit (mm)
RH_wt winter relative humidity (%)
RH_sp spring relative humidity (%)
RH_sm summer relative humidity (%)
RH_at autumn relative humidity (%)
CMI_wt winter Hogg’s climate moisture index (mm)
CMI_sp spring Hogg’s climate moisture index (mm)
CMI_sm summer Hogg’s climate moisture index (mm)
CMI_at autumn Hogg’s climate moisture index (mm)
Monthly variables:
Tave01 – Tave12 January - December mean temperatures (°C)
TMX01 – TMX12 January - December maximum mean temperatures (°C)
TMN01 – TMN12 January - December minimum mean temperatures (°C)
PPT01 – PPT12 January - December precipitation (mm)
RAD01 – RAD12 January - December solar radiation (MJ m‐2 d‐1 )
Derived monthly variables:
DD_0_01 – DD_0_12 January - December degree-days below 0°C
DD5_01 – DD5_12 January - December degree-days above 5°C
DD_18_01 – DD_18_12 January - December degree-days below 18°C
DD18_01 – DD18_12 January - December degree-days above 18°C
NFFD01 – NFFD12 January - December number of frost-free days
PAS01 – PAS12 January – December precipitation as snow (mm)
Eref01 – Eref12 January – December Hargreaves reference evaporation (mm)
CMD01 – CMD12 January – December Hargreaves climatic moisture deficit (mm)
RH01 – RH12 January – December relative humidity (%)
CMI01 – CMI12 January – December Hogg’s climate moisture index (mm)
Note - unless otherwise specified, missing values are left blank in all datasets. In data fields recording climate data or elevation, missing values may instead be represented by -9999.
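When reading any of the climate files, both blank cells and the -9999 code need to be treated as missing before computing summaries. A minimal sketch using only the Python standard library follows; the helper name to_float_or_none and the inline sample are illustrative stand-ins, not part of the dataset or its code.

```python
import csv
import io

MISSING_CODE = -9999.0  # sentinel described in the note above

def to_float_or_none(value):
    """Convert a climate/elevation cell to float; blanks and -9999 become None."""
    text = value.strip()
    if text == "":
        return None
    number = float(text)
    return None if number == MISSING_CODE else number

# Tiny made-up extract standing in for a real climate file
sample = io.StringIO(
    "id1,MAT,MAP,Elevation\n"
    "SiteA,15.2,480,-9999\n"
    "SiteB,,510,-9999\n"
)
rows = [
    {key: (val if key == "id1" else to_float_or_none(val)) for key, val in row.items()}
    for row in csv.DictReader(sample)
]
```

Leaving -9999 in place as a number would silently distort any mean or regression computed from these fields.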
Data.zip (File): zip file of all remaining datasets used in this study. All datasets included within this zip file are listed below.
Data/tnrs_result_Reserves.csv (File): List of species known to inhabit UC reserve sites as listed in raw herbarium data, and after taxonomic standardization using TNRS
Number of variables: 2
Number of cases/rows: 3280
Variable List: The majority of column names are QA columns referencing the match quality of names rectified through TNRS, and are not relevant to further data processing. Key fields are as follows:
Name_Submitted: unique taxonomic identifier for specimen(s) derived from herbarium data
Accepted_name: Rectified taxon name produced by TNRS, to finest ID possible
Accepted_species: Rectified taxon name produced by TNRS, genus and species only
Missing data codes:''
Specialized formats or other abbreviations used: NA
Data/Reserve_Perims_wgs1984.shp (File): Shapefile of UC reserve perimeters. Shapefiles are vector-based geographic information system (GIS) data in a format developed by the company Esri. They can be opened and used in any GIS software and in R or Python. A shapefile consists of multiple file types beyond the .shp (specifically, .cpg, .dbf, .prj, .sbn, and .sbx). The user only interacts directly with the .shp file, but the other files need to be in the same directory.
Number of variables: 4
Number of cases/rows: 44
Variable List:
Shape_Leng: Length of shape
Shape_Area: Area of shape
Name: Name of Reserve Site
Campus: Campus that administrates the reserve
Data/ClimateNA_Annual_Climate.csv (File): annual climate conditions associated with unique locations & years of collection for herbarium specimens of species known to inhabit UC reserve sites.
Number of variables: 269
number of cases/rows: 754828
Variable List: The majority of column names are derived from ClimateNA output, and follow ClimateNA rubrics for naming and abbreviation of climate data (see universal key above, or https://ClimateNA.ca). Additional data fields are as follows:
Year_Rect: Year for which climate data was estimated
decimalLatitude: decimal latitude of site for which climate data was estimated
decimalLongitude: decimal longitude of site for which climate data was estimated
Data/ClimateNA_1911_1940_ClimateNormals.csv (File): 1911-1940 normal climate conditions associated with unique locations & years of collection for herbarium specimens of species known to inhabit UC reserve sites.
Number of variables: 268
number of cases/rows: 642890
Variable List: The majority of column names are derived from ClimateNA output, and follow ClimateNA rubrics for naming and abbreviation of climate data (see universal key above, or https://ClimateNA.ca). Data pertaining to locations in which climate data was not available were marked with -9999 in climate fields. Other missing values were blank. Additional data fields are as follows:
decimalLatitude: decimal latitude of site for which climate data was estimated
decimalLongitude: decimal longitude of site for which climate data was estimated
Data/Herb_Data_Ann_ClimateNA.csv (File): Annual climate data pertaining to the location and year of collection of each specimen used in this study, as well as the species of each specimen and the day of year (DOY) and year of collection.
Number of variables: 30
number of cases/rows: 1710401
Variable List: The majority of column names are derived from ClimateNA output, and follow ClimateNA rubrics for naming and abbreviation of climate data (see universal key above, or https://ClimateNA.ca). Missing values are indicated by -9999. Additional data fields are as follows:
Accepted_species: Species of specimen under consideration (standardized using TNRS nomenclature)
DOY_Rect: the Day of Year on which that collection occurred (from 1 to 365)
Year_Rect: Year for which climate data was estimated
decimalLatitude: decimal latitude of site for which climate data was estimated
decimalLongitude: decimal longitude of site for which climate data was estimated
Data/reserve_plant_list.csv (File): Data listing each species that is found within each UC reserve site. Note that some values (such as ssp./var) are not relevant to all species and remain blank, while others such as habitat or bloom time may not always have available information and therefore remain blank.
Number of variables: 15
number of cases/rows: 10125
Variable List:
Reserve: Name of reserve site
Genus_Species: Genus and species of plant taxon
Family: family of plant taxon
Genus: Genus of plant taxon
Species: species of plant taxon
ssp./var. designation: subspecies or variety designation (if appropriate)
ssp./var. name: subspecies or variety name (if appropriate)
Division: Division of taxon
Synonym: synonym(s) of taxa (if applicable)
Common Name: Common Name of taxon (if applicable)
Uncertainties in species ID, noted in original flora: Uncertainties in species ID, noted in original flora
Native/Exotic: status as native or exotic of taxon
Habitat (approx): estimated habitat occupied by taxon
Bloom Time (approx): estimated approximate bloom time of taxon
Data/Species_by_reserve_herbdata_100min_full_LatLon_forClimateNA_Normal_1911_1940MSY.csv (File): 1911-1940 normal climate conditions associated with centroid of each UC reserve site
Number of variables: 270
number of cases/rows: 31
Variable List: The majority of column names are derived from ClimateNA output, and follow ClimateNA rubrics for naming and abbreviation of climate data (see universal key above, or https://ClimateNA.ca). Additional data fields are as follows:
id1: records UC reserve site corresponding to climate data
id2: unique ID for each row
Latitude: decimal latitude of site for which climate data was estimated
Longitude: decimal longitude of site for which climate data was estimated
Elevation: used in climateNA climate extraction - contains null data in all cases, recorded as -9999
Data/Species_by_reserve_herbdata_100min_full_LatLon_forClimateNA_Normal_1951_1980MSY.csv (File): 1951-1980 normal climate conditions associated with centroid of each UC reserve site
Number of variables: 270
number of cases/rows: 31
Variable List: The majority of column names are derived from ClimateNA output, and follow ClimateNA rubrics for naming and abbreviation of climate data (see universal key above, or https://ClimateNA.ca). Additional data fields are as follows:
id1: records UC reserve site corresponding to climate data
id2: unique ID for each row
Latitude: decimal latitude of site for which climate data was estimated
Longitude: decimal longitude of site for which climate data was estimated
Elevation: used in climateNA climate extraction - contains null data in all cases, recorded as -9999
Data/Species_by_reserve_herbdata_100min_full_LatLon_forClimateNA_Normal_1991_2020MSY.csv (File): 1991-2020 normal climate conditions associated with centroid of each UC reserve site
Number of variables: 270
number of cases/rows: 31
Variable List: The majority of column names are derived from ClimateNA output, and follow ClimateNA rubrics for naming and abbreviation of climate data (see universal key above, or https://ClimateNA.ca). Additional data fields are as follows:
id1: records UC reserve site corresponding to climate data
id2: unique ID for each row
Latitude: decimal latitude of site for which climate data was estimated
Longitude: decimal longitude of site for which climate data was estimated
Elevation: used in climateNA climate extraction - contains null data in all cases, recorded as -9999
Data/Species_by_reserve_herbdata_100min_full_LatLon_forClimateNA_13GCMs_ensemble_ssp126_2041MSY.csv (File): 2041-2070 projected normal climate conditions associated with centroid of each UC reserve site
Number of variables: 270
number of cases/rows: 31
Variable List: The majority of column names are derived from ClimateNA output, and follow ClimateNA rubrics for naming and abbreviation of climate data (see universal key above, or https://ClimateNA.ca). Additional data fields are as follows:
id1: records UC reserve site corresponding to climate data
id2: unique ID for each row
Latitude: decimal latitude of site for which climate data was estimated
Longitude: decimal longitude of site for which climate data was estimated
Elevation: used in climateNA climate extraction - contains null data in all cases, recorded as -9999
Data/Species_by_reserve_herbdata_100min_full_LatLon_forClimateNA_13GCMs_ensemble_ssp126_2071MSY.csv (File): 2071-2099 projected normal climate conditions associated with centroid of each UC reserve site
Number of variables: 270
number of cases/rows: 31
Variable List: The majority of column names are derived from ClimateNA output, and follow ClimateNA rubrics for naming and abbreviation of climate data (see universal key above, or https://ClimateNA.ca). Additional data fields are as follows:
id1: records UC reserve site corresponding to climate data
id2: unique ID for each row
Latitude: decimal latitude of site for which climate data was estimated
Longitude: decimal longitude of site for which climate data was estimated
Elevation: used in climateNA climate extraction - contains null data in all cases, recorded as -9999
Data/Reserve_area_ha.csv (File): Area in ha of each reserve site.
Number of variables: 2
number of cases/rows: 31
Variable List:
Reserve: Name of reserve site
area_ha: area of site in hectares
Data/Elev/site_clim_diffs.csv (File): Climate conditions and change between 1911-1940 vs 1991-2020 conditions at each UC reserve site.
Number of variables: 18
number of cases/rows: 31
Variable List:
id1: Name of reserve site
id2: marker
Latitude: decimal latitude of UC Reserve site
Longitude: decimal longitude of UC Reserve site
Elevation: null field produced by climateNA
MCMT_1911: mean temperature of the coldest month, 1911-1940 normal
MAT_1911: mean annual temperature, 1911-1940 normal
MWMT_1911: mean temperature of the warmest month, 1911-1940 normal
MCMT_1991: mean temperature of the coldest month, 1991-2020 normal
MAT_1991: mean annual temperature, 1991-2020 normal
MWMT_1991: mean temperature of the warmest month, 1991-2020 normal
MCMT_Change_1911_1991: change in temperature of the coldest month, 1911-1940 normal vs. 1991-2020 normal
MAT_Change_1911_1991: change in mean annual temperature, 1911-1940 normal vs. 1991-2020 normal
MWMT_Change_1911_1991: change in mean temperature of the warmest month, 1911-1940 normal vs. 1991-2020 normal
MAP_Change_1911_1991: change in mean annual precipitation, 1911-1940 normal vs. 1991-2020 normal
Reserve: Name of reserve site
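The change fields in this file can be cross-checked against the paired normals for MCMT, MAT, and MWMT (no MAP_1911 or MAP_1991 fields are listed, so MAP_Change_1911_1991 cannot be checked this way). The sketch below assumes the change is stored as the 1991-2020 value minus the 1911-1940 value; that sign convention is an assumption, and the function name and example row are illustrative.

```python
def change_is_consistent(row, variable, tolerance=0.01):
    """Check <variable>_Change_1911_1991 against the two normals.

    Assumes change = 1991-2020 normal minus 1911-1940 normal.
    """
    expected = float(row[f"{variable}_1991"]) - float(row[f"{variable}_1911"])
    stored = float(row[f"{variable}_Change_1911_1991"])
    return abs(stored - expected) <= tolerance

# Made-up row, not taken from the dataset
example_row = {"MAT_1911": "14.1", "MAT_1991": "15.3", "MAT_Change_1911_1991": "1.2"}
consistent = change_is_consistent(example_row, "MAT")
```

If the check fails for every site with the sign flipped, the file most likely stores the earlier value minus the later one instead.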
Data/Predictions/Reserves_NormalAnomPreds_PredsDOY_Rect.csv (File): Predicted dates of onset and termination by each species under different climate scenarios.
Number of variables: 8
number of cases/rows: 37155
Variable List:
Accepted_name: Genus and species as matched to a standardized taxonomic schema using the taxonomic name resolution service (tnrs.biendata.org).
Reserve: Name of UC reserve site
Latitude: decimal latitude of UC Reserve site
Longitude: decimal longitude of UC Reserve site
Time Period: Time period being predicted - in all cases, represents 30 year normals in which the first year covered by that normal is recorded (e.g., a value of 1911 indicates climate normal for 1911-1940)
RQ_Pred_0.1_DOY_Rect: Predicted DOY by which the earliest 10% of individuals of a species will have begun flowering at a given location/time period
RQ_Pred_0.5_DOY_Rect: Predicted DOY by which the earliest 50% of individuals of a species will have begun flowering at a given location/time period
RQ_Pred_0.9_DOY_Rect: Predicted DOY by which the earliest 90% of individuals of a species will have begun flowering at a given location/time period
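Elsewhere in this README the 0.1 and 0.9 quantiles are treated as flowering onset and termination, so flowering duration and shifts between time periods can be derived from this file. A minimal sketch under that interpretation follows; the function name and the example rows are made up for illustration.

```python
def flowering_duration(row):
    """Days between predicted onset (0.1 quantile) and termination (0.9 quantile)."""
    onset = float(row["RQ_Pred_0.1_DOY_Rect"])
    termination = float(row["RQ_Pred_0.9_DOY_Rect"])
    return termination - onset

# Made-up prediction rows for one species at one reserve
historical = {"Time Period": "1911", "RQ_Pred_0.1_DOY_Rect": "102.0", "RQ_Pred_0.9_DOY_Rect": "161.5"}
recent = {"Time Period": "1991", "RQ_Pred_0.1_DOY_Rect": "97.5", "RQ_Pred_0.9_DOY_Rect": "157.0"}

# Negative values mean earlier flowering in the recent period
onset_shift = float(recent["RQ_Pred_0.1_DOY_Rect"]) - float(historical["RQ_Pred_0.1_DOY_Rect"])
duration_change = flowering_duration(recent) - flowering_duration(historical)
```

In this made-up case onset and termination advance by the same amount, so duration is unchanged, which is the kind of pattern the abstract describes.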
Data/Predictions/Reserves_NormalAnomPreds_RQ_SelectedParams_Simulated_Distribution_SlopeComp_DOY_Rect.csv (File): Model parameters for quantile regression models corresponding to flowering onset, median flowering date, and flowering termination date of each species modeled in this study.
Number of variables: 6
number of cases/rows: 21,417
Variable List:
Accepted_name: Rectified taxon name produced by TNRS, to finest ID possible
parameter: climate parameter used in model (corresponding to CLIMATENA climate parameter names as described in universal key above)
month: season of climate parameter being used in modeling (1 equates to Jan-March, 2 equates to April-June, 3 equates to July-September, 4 equates to October-December)
quantile: The quantile corresponding to the described slopes (0.1 equates to flowering onset date, 0.5 to median date, and 0.9 to termination date)
AIC: AIC of model including selected parameter
Slope: slope of parameter to be included in model (as days/unit, using units corresponding to the selected climate parameter, as described in universal key above)
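To relate a row of this file to a climate column, the season code in the month field can be translated into the suffixes used in the universal key. The sketch below assumes that the parameter field holds the base variable name without a suffix (for example Tave); that is an assumption about the file contents, and both function names are hypothetical.

```python
# Season codes from the 'month' field mapped to universal-key suffixes
SEASON_SUFFIX = {1: "_wt", 2: "_sp", 3: "_sm", 4: "_at"}

def climate_column(parameter, month):
    """Build a seasonal column name, e.g. ('Tave', 2) -> 'Tave_sp'."""
    return f"{parameter}{SEASON_SUFFIX[int(month)]}"

def implied_shift(slope_days_per_unit, climate_change):
    """Shift in days implied by one slope for a given change in its climate parameter."""
    return slope_days_per_unit * climate_change

column = climate_column("Tave", 2)
shift = implied_shift(-3.0, 1.5)  # made-up slope (days/°C) and warming (°C)
```

A negative shift corresponds to an earlier date for the quantile in question (onset, median, or termination).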
Data/Community_outputs/Binary_Dailies_anoms.csv (File): Records predictions of whether a species is expected to be in flower on each day of the year (DOY 1 to 365) at a given reserve site in which it is known to grow.
Number of variables: 374
number of cases/rows: 37155
Variable List:
Accepted_name: Rectified taxon name produced by TNRS, to finest ID possible
Reserve: UC reserve site under consideration
Latitude: Latitude
Longitude: Longitude
Time_Period: time period under consideration (representing 30-year normals in all cases, recorded as the first year within the 30-year period under consideration)
RQ_Pred_0.1_DOY_Rect: Predicted Day of Year (DOY) on which a species will reach the date of its flowering onset at a given location and time period
RQ_Pred_0.5_DOY_Rect: Predicted Day of Year (DOY) on which a species will reach peak flowering at a given location and time period
RQ_Pred_0.9_DOY_Rect: Predicted Day of Year (DOY) on which a species will reach the date of its flowering termination at a given location and time period
DOY_Binary_**: columns beginning DOY_Binary_ record whether a specific plant species is expected to be in flower at a given location under specified climate conditions. The day of the year represented by each data field is specified by the number appended to the end of each column name.
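The binary daily fields can be thought of as a window between predicted onset and termination. The following is a minimal sketch of that idea, not the authors' implementation: it assumes a species is in flower on every day from onset through termination inclusive, that dates are rounded to whole days, and that flowering does not wrap across the calendar year.

```python
def daily_flowering(onset_doy, termination_doy, n_days=365):
    """Return 0/1 flags for DOY 1..n_days.

    Assumptions: inclusive window from rounded onset to rounded
    termination, clipped to the calendar year, no wrap-around.
    """
    start = max(1, round(onset_doy))
    end = min(n_days, round(termination_doy))
    return [1 if start <= doy <= end else 0 for doy in range(1, n_days + 1)]

# Made-up onset and termination dates for one species
flags = daily_flowering(100.2, 160.7)
```

Index 0 of the returned list corresponds to DOY 1, so the flag for a given day is flags[doy - 1].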
Data/Community_outputs/DailySum_anoms.csv (File): Daily sum of the number of species predicted to be in flower on each day of the year (DOY) from 1 to 365 at each site and time period.
Number of variables: 366
number of cases/rows: 37155
Variable List:
Accepted_name: list of all species potentially in flower
DOY_Binary_***: columns beginning DOY_Binary_ record the number of species expected to be in flower on a given day of year (DOY) at a given site and time period. The day of the year represented by each data field is specified by the number appended to the end of each column name.
Data/Community_outputs/DailyProp_fl_anoms.csv (File): Daily proportion of total species known to inhabit a site that are predicted to be in flower on each day of year (DOY) from 1 to 365.
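Daily species richness of active flowers is the column-wise sum of the per-species binary flags within each site and time period. A minimal sketch follows, assuming column names of the form DOY_Binary_1 through DOY_Binary_365 (the exact numbering format is an assumption) and using a three-day toy example.

```python
from collections import defaultdict

def daily_sums(binary_rows, n_days=365):
    """Sum 0/1 flowering flags across species within each (Reserve, Time_Period)."""
    totals = defaultdict(lambda: [0] * n_days)
    for row in binary_rows:
        key = (row["Reserve"], row["Time_Period"])
        for doy in range(1, n_days + 1):
            totals[key][doy - 1] += int(row[f"DOY_Binary_{doy}"])
    return dict(totals)

# Two made-up species at one site, three days only
toy_rows = [
    {"Reserve": "SiteA", "Time_Period": "1911", "DOY_Binary_1": "0", "DOY_Binary_2": "1", "DOY_Binary_3": "1"},
    {"Reserve": "SiteA", "Time_Period": "1911", "DOY_Binary_1": "1", "DOY_Binary_2": "1", "DOY_Binary_3": "0"},
]
sums = daily_sums(toy_rows, n_days=3)
```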
Number of variables: 366
number of cases/rows: 155
Variable List:
Reserve: UC reserve site under consideration
Time_Period: time period under consideration (representing 30-year normals in all cases, recorded as the first year within the 30-year period under consideration)
Latitude: Latitude
Longitude: Longitude
Accepted_name: List of names of all species at a given site
Total_Species: Total number of species documented to inhabit a given site
DOY_Prop_fl***: Daily proportion of total species known to inhabit a site that are predicted to be in flower on a given day. The day of the year represented by each data field is specified by the number appended to the end of each column name.
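The proportion fields follow from dividing the daily count of flowering species by Total_Species. A minimal sketch (the function name and example values are illustrative):

```python
def daily_proportions(daily_counts, total_species):
    """Divide each daily count of flowering species by the site's species total."""
    if total_species <= 0:
        raise ValueError("total_species must be positive")
    return [count / total_species for count in daily_counts]

# Made-up counts for four days at a site with four documented species
props = daily_proportions([1, 2, 1, 0], total_species=4)
```

Expressing richness as a proportion makes sites with very different species totals directly comparable.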
Data/Community_outputs/DailyProp_fl_anoms_smoothed.csv (File): Average daily proportion of total species known to inhabit a site that are predicted to be in flower across a seven-day window centered on each day of year (DOY) from 1 to 365.
Number of variables: 366
number of cases/rows: 155
Variable List:
Reserve: UC reserve site under consideration
Time_Period: time period under consideration (representing 30-year normals in all cases, recorded as the first year within the 30-year period under consideration)
Latitude: Latitude
Longitude: Longitude
Accepted_name: List of names of all species at a given site
Total_Species: Total number of species documented to inhabit a given site
DOY_Prop_fl***: Average daily proportion of total species known to inhabit a site that are predicted to be in flower across a seven-day window centered on the listed Day of year (DOY). The day of the year represented by each data field is specified by the number appended to the end of each column name.
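A seven-day centered window averages each day with the three days before and after it. How the first and last three days of the year were handled is not stated here; the sketch below simply truncates the window at the ends of the series, which is an assumption.

```python
def smooth_centered(values, window=7):
    """Centered moving average with the window truncated at both ends of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Made-up daily proportions for one week
week = smooth_centered([0.0, 0.0, 0.0, 0.7, 0.0, 0.0, 0.0])
```

An alternative would be to wrap the window around the year boundary, which matters only for sites with flowering in late December or early January.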
Data/Community_outputs/DailyProp_fl_long_anoms.csv (File): Average daily proportion of total species known to inhabit a site that are predicted to be in flower across a seven-day window centered on each day of year (DOY) from 1 to 365. Records the same information as Data/Community_outputs/DailyProp_fl_anoms_smoothed.csv, but is restructured to include fewer columns and more rows.
Number of variables: 7
number of cases/rows: 56575
Variable List:
Reserve: UC reserve site under consideration
Time_Period: time period under consideration (representing 30-year normals in all cases, recorded as the first year within the 30-year period under consideration)
Total_Species: Total number of species documented to inhabit a given site
Latitude: Latitude
Longitude: Longitude
DOY: Day of Year (from 1 to 365)
DOY_Prop_Sm: Average daily proportion of total species known to inhabit a site that are predicted to be in flower across a seven-day window centered on the Day of year (DOY) listed in the DOY column.
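The long file is a reshaping of the smoothed wide file: 155 site-by-period rows times 365 days gives the 56575 rows listed above. A minimal sketch of that reshaping follows, assuming wide column names consist of the prefix DOY_Prop_fl followed directly by the day number (an assumption about the exact column format).

```python
ID_FIELDS = ["Reserve", "Time_Period", "Total_Species", "Latitude", "Longitude"]

def wide_to_long(wide_rows, prefix="DOY_Prop_fl", n_days=365):
    """Turn one row per site/period into one row per site/period/day."""
    long_rows = []
    for row in wide_rows:
        for doy in range(1, n_days + 1):
            record = {field: row[field] for field in ID_FIELDS}
            record["DOY"] = doy
            record["DOY_Prop_Sm"] = row[f"{prefix}{doy}"]
            long_rows.append(record)
    return long_rows

# Made-up wide row with two days only
toy_wide = [{
    "Reserve": "SiteA", "Time_Period": "1911", "Total_Species": 4,
    "Latitude": 34.4, "Longitude": -119.8,
    "DOY_Prop_fl1": 0.25, "DOY_Prop_fl2": 0.5,
}]
toy_long = wide_to_long(toy_wide, n_days=2)
```

The long layout is usually more convenient for plotting flowering curves by site and time period.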
Data/Paired_Comp/onset_comp.csv (File): Records statistical comparisons (t-tests conducted using the function ttest_rel) of the date of flowering onset across all taxa in each site under historical (1911-1940 normal) vs. either mid-century (1951-1980), contemporary (1991-2020), near-future (2041-2070) or mid-future (2071-2100) conditions.
Number of variables: 8
number of cases/rows: 12
Variable List:
Time_Period: time period under consideration (representing 30-year normals in all cases, recorded as the first year within the 30-year period under consideration)
Parameter: delineates what parameter is being compared: either the day on which the earliest 20% of taxa at each site are expected to have begun flowering (DOY_20_onset), the day by which the earliest 50% of species are expected to have begun flowering (DOY_50_onset) or the date by which all but the latest 25% of taxa are expected to have begun flowering (DOY_75_onset)
Historical_Mean: Historical mean date of flowering onset across all taxa
Mean_Difference: Mean difference between historical (1911-1940) value and value corresponding to time period of interest
p_value: p_value of t_test
df: degrees of freedom for analysis
coef: Coefficient of regression
reg_p: p_value of regression
Data/Paired_Comp/term_comp.csv (File): Records statistical comparisons (t-tests conducted using the function ttest_rel) of the date of flowering termination across all taxa in each site under historical (1911-1940 normal) vs. either mid-century (1951-1980), contemporary (1991-2020), near-future (2041-2070) or mid-future (2071-2100) conditions.
Number of variables: 8
number of cases/rows: 12
Variable List:
Time_Period: time period under consideration (representing 30-year normals in all cases, recorded as the first year within the 30-year period under consideration)
Parameter: delineates what parameter is being compared: either the day on which the earliest 20% of taxa at each site are expected to have ceased flowering (DOY_20_term), the day by which the earliest 50% of species are expected to have ceased flowering (DOY_50_term) or the date by which all but the latest 25% of taxa are expected to have ceased flowering (DOY_75_term)
Historical_Mean: Historical mean date of flowering termination across all taxa
Mean_Difference: Mean difference between historical (1911-1940) value and value corresponding to time period of interest
p_value: p_value of t_test
df: degrees of freedom for analysis
coef: Coefficient of regression
reg_p: p_value of regression
Data/Paired_Comp/dur_comp.csv (File): Records statistical comparisons (t-tests conducted using the function ttest_rel) of the duration of time across which different proportions of the species at each site are in flower under historical (1911-1940 normal) vs. either mid-century (1951-1980), contemporary (1991-2020), near-future (2041-2070) or mid-future (2071-2100) conditions.
Number of variables: 8
number of cases/rows: 12
Variable List:
Time_Period: time period under consideration (representing 30-year normals in all cases, recorded as the first year within the 30-year period under consideration)
Parameter: delineates what parameter is being compared: either the duration over which the number of taxa in flower is at least 20% of the historical maximum at each site (DOY_20_dur), the duration over which the number of taxa in flower is at least 50% of the historical maximum (DOY_50_dur), or the duration over which the number of taxa in flower is at least 75% of the historical maximum (DOY_75_dur)
Historical_Mean: Historical duration according to each metric across all sites
Mean_Difference: Mean difference between historical (1911-1940) value and value corresponding to time period of interest
p_value: p_value of t_test
df: degrees of freedom for analysis
coef: Coefficient of regression
reg_p: p_value of regression
Data/Paired_Comp/max_prop_comp.csv (File): Records statistical comparisons (t-tests conducted using the function ttest_rel) of the maximum proportion of species simultaneously in flower in each site under historical (1911-1940 normal) vs. either mid-century (1951-1980), contemporary (1991-2020), near-future (2041-2070) or mid-future (2071-2100) conditions.
Number of variables: 8
number of cases/rows: 12
Variable List:
Time_Period: time period under consideration (representing 30-year normals in all cases, recorded as the first year within the 30-year period under consideration)
Parameter: delineates what parameter is being compared: in all cases, this is the maximum proportion of species in flower over any 7-day window, relative to 1911-1940 predicted maximum number of species in flower
Historical_Mean: Historical mean of the maximum proportion of species in flower across all sites
Mean_Difference: Mean difference between historical (1911-1940) value and value corresponding to time period of interest
p_value: p_value of t_test
df: degrees of freedom for analysis
coef: Coefficient of regression
reg_p: p_value of regression
Data/Paired_Comp/skew_comp.csv (File): Records statistical comparisons (t-tests conducted using the function ttest_rel) of the degree of skew in the seasonal distribution of species richness of flowering taxa in each site under historical (1911-1940 normal) vs. either mid-century (1951-1980), contemporary (1991-2020), near-future (2041-2070) or mid-future (2071-2100) conditions.
Number of variables: 8
number of cases/rows: 12
Variable List:
Time_Period: time period under consideration (representing 30-year normals in all cases, recorded as the first year within the 30-year period under consideration)
Parameter: delineates what parameter is being compared: in all cases, this is the degree of skew in the seasonal distribution of species richness of flowers, relative to 1911-1940 degree of skew.
Historical_Mean: Mean historical degree of skew across all sites
Mean_Difference: Mean difference between historical (1911-1940) value and value corresponding to time period of interest
p_value: p_value of t_test
df: degrees of freedom for analysis
coef: Coefficient of regression
reg_p: p_value of regression
Data/HerbData (Folder): Empty folder. In order to run the scripts included in this repository, the supplementary file Herbarium_AllData_Core_noDups.csv should be placed in this folder. This dataset is included as supplementary data rather than as part of the main dryad repository due to different licensing requirements.
METHODOLOGICAL INFORMATION
Methods for processing the data:
Python and R scripts should be run in the following order:
0_Reserves_plantlist_herb_extraction.ipynb: isolates relevant herbarium observations, extracts specimens of species that are known to inhabit UC Reserve sites
01_Reserves_Clim_Norm_Assembly-Anoms.ipynb: identifies those species with sufficient observations to be used in this analysis, merges those records with annual and normal climate data for use in phenoclimate modeling
02_Herb_Predict_Reserve_Norm_Anom_normals2.rmd: runs phenoclimate models using herbarium data, predicts phenological onset & termination dates of each species within each UC reserve in which it occurs under historical, contemporary, and projected future conditions
03_Display_Assessment-anoms.ipynb: Converts species-level phenological predictions into community-level assessments of the daily species richness of actively flowering species
04_Display_analysis-anoms.ipynb: Visualizes community-level daily species richness of actively flowering species under historical and contemporary conditions, as well as the magnitude of changes in daily species richness of actively flowering species between historical and contemporary conditions at each UC Reserve site.
05_Thresholds_anom.ipynb: Conducts statistical analyses of whether systemic changes have occurred in attributes of community-level bloom display between historical and contemporary climate conditions.
06_Species_patterns_anoms.ipynb: Visualizes and conducts statistical analyses of patterns in species-level changes in flowering onset dates or flowering duration among species known to inhabit UC Reserve sites.
3. Instrument- or software-specific information needed to interpret the data:
Python code was run using Python version 3.9.7 and requires installation of the packages listed in the file pheno.yml included with this data. R code was run using Rstudio version 4.0.0 and requires installation of the following packages:
data.table v1.14.2
plyr v1.8.7
MCMCpack v1.6-2
quantreg v5.88
DataCombine v0.2.21
The University of California Reserve sites used in this study consisted of all UC reserves for which plant lists were available, and for which the local floras included at least 100 angiosperm taxa. Plant lists used in this study were posted on-line by each reserve and assembled by Brian Haggerty and Susan J. Mazer (https://ucnrs.org/a-flora-for-the-nrs/z). To ensure that each site represented a community of plants located within a small enough area that they might reasonably be considered sympatric or be accessible to shared pollinators, we further excluded all reserve sites covering areas exceeding 3,000 ha, or in which the range of elevations within the site exceeded 250 meters. The remaining sites included 16 distinct locations distributed across California, including coastal, inland, and mountainous sites, and represented a range of ecoregions and vegetation classes.
Herbarium Data
Records of flowering phenology used in this study were drawn from 9,216,145 digital specimen records acquired from the digital archives of 440 herbaria throughout North America (Park et al. 2023). In order to ensure the quality of the dataset under examination, we excluded from the data set analyzed here: all specimens not recorded as being in flower at the time of collection; duplicate specimens of a given species (i.e., specimens collected on the same DOY (day of year), in the same year, and at the same location) and specimens for which the date of collection, the latitude and longitude of the collection site, or the species name was not available. As taxonomic nomenclature has varied both over time and regionally across the historical record represented by this dataset, we were concerned that changes in taxonomic nomenclature generated ambiguity in the species’ identifications represented in the data. To rectify this, we standardized the taxonomic nomenclature used to describe each specimen using taxonomic identifications provided by the Taxonomic Name Resolution Service iPlant Collaborative, Version 4.0 (Boyle et al., 2013, Accessed: 30 August 2021; https://tnrs.biendata.org/), which matched outdated or ambiguous identifications to a standardized taxonomic nomenclature.
To ensure that a sufficient number of specimens were observed for each species to model the timing of the bloom display for that species, we also eliminated all species represented by fewer than 100 unique records.
We then identified all species that were reported within at least one of the University of California Reserves based on the species lists mentioned above (https://ucnrs.org/plant-list/). Finally, we eliminated all specimen records of species not present in the species list of at least one reserve. The remaining data encompassed 1,908,706 flowering specimens representing 1,848 species in 234 angiosperm families distributed across North America. Detailed methods and code used to prepare this dataset for the analyses presented here can be accessed via DRYAD at https://doi.org/10.5061/dryad.0gb5mkm9d.
Climate data
Climate conditions in the year and location of each collection were then estimated using the ClimateNA v7.21 software package available at http://tinyurl.com/ClimateNA (Hamann et al. 2013). Climatic parameters evaluated here consisted of mean temperatures of the Winter (January-March), Spring (April-June), Summer (July-September) and Fall (October-December) months. Similarly, long-term normal conditions (represented by average Winter, Spring, Summer, and Fall temperatures at each location over the years 1911-1940 and 1991-2020) were estimated at the location from which each specimen was collected, as well as at the centroid of each UC Natural Reserve, using similar methods. To separate the effects of phenotypic differences in flowering phenology among populations occupying different collection sites that experienced different long-term climate conditions from the effects of plastic responses in flowering phenology associated with year-to-year variation in climate, we calculated for each climate parameter the annual anomaly associated with each digital specimen. That is, for each specimen represented in the digital data, we calculated the difference between the annual conditions in the year and location of specimen collection and the 1911-1940 normal conditions at that location. Positive values for this difference indicate that a given specimen was collected in a warmer-than-average year. Sites within the UC Natural Reserve System that were used in this study ranged in mean annual temperature from 4.3°C to 16.7°C (Table 1). Note that, for this study, the phenological sensitivity of each species to interannual climate variability was considered to be constant throughout the range of each species.
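The anomaly calculation described above amounts to a per-season subtraction. The sketch below is illustrative only: the function name, dictionary layout, and temperature values are hypothetical and are not drawn from the dataset.

```python
SEASONS = ("Winter", "Spring", "Summer", "Fall")


def seasonal_anomalies(annual, normal):
    """Annual seasonal temperature minus the 1911-1940 normal at the same location."""
    return {season: annual[season] - normal[season] for season in SEASONS}


# Hypothetical seasonal mean temperatures (degrees C) for one specimen
annual_in_collection_year = {"Winter": 9.8, "Spring": 15.1, "Summer": 21.0, "Fall": 12.4}
normal_1911_1940 = {"Winter": 9.0, "Spring": 15.5, "Summer": 20.2, "Fall": 12.4}

anoms = seasonal_anomalies(annual_in_collection_year, normal_1911_1940)
```

In this example the Winter anomaly is positive (a warmer-than-normal winter) and the Spring anomaly is negative.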
Modeling plant phenology
We developed species-specific phenological models for each species from the available digital specimen records. Using the available data for each species, we regressed the observed DOYs against local historical climate normals (represented by 1911-1940 January - March, April-June, July-September, or October-December mean temperatures) and anomalies (represented by annual departures from 1911-1940 January - March, April-June, July-September, or October-December mean temperatures) using quantile regression at the 10th percentile (representing population-level flowering onset) and 90th percentile (representing population-level flowering termination) for each species. As Winter, Spring, Summer, and Fall temperatures are often highly collinear, each model was allowed to select temperature normals and anomalies from only the single season that best explained observed phenological variation (based on Akaike information criterion of models created using temperatures from each season).
These methods have previously been demonstrated to accurately predict the dates of flowering onset and termination for species’ population-level bloom displays in response to a given set of climate conditions using simulated natural history collections data (Park et al. 2024). The predicted duration of each species’ bloom display at each location in which it occurred under each climate scenario was then estimated as the difference in DOY between the predicted dates of population flowering onset and termination. To avoid conflating phenological shifts that resulted from plastic responses of individual plants with population-level phenotypic differences in phenology that occur along climate gradients (which can be generated by both plasticity and evolutionary change), both seasonal temperature normals and anomalies were evaluated for inclusion in each model. The partial regression coefficients associated with temperature normals are interpreted as a combination of the phenotypic differences in phenological timing among populations inhabiting locations with differing climatic conditions and the plastic responses to spatial variation in local temperature, while the partial regression coefficients associated with the climate anomalies are interpreted as solely plastic responses to inter-annual variation in local temperature.
Assessing patterns of change in species-level flowering time and duration
Mean changes in species-level flowering times relative to historical (1911-1940 normal) climate conditions across the UC reserve sites examined in this study were assessed by calculating the predicted onset (10th percentile) and termination (90th percentile) of each species’ flowering within each site in which it occurred under both historical (1911-1940) and contemporary (1991-2020) normal climate conditions. Changes between historical and contemporary DOYs of flowering onset, DOYs of flowering termination, as well as in the duration of the local flowering period, were then assessed for each species in each UC Reserve site using pairwise means comparisons.
Predicting historical and contemporary flowering periods within reserves
Using the resulting phenoclimate models, the DOYs of flowering onset and termination were then predicted for every species documented to occur within each site using the climate conditions located at the centroid of that site using ClimateNA v7.21. Typical flowering times during the early 20th century were represented by predictions of flowering onset and termination by each species under 1911-1940 normal temperatures at each site in which it occurred, while typical flowering times during recent decades were represented by flowering times of each species under 1991-2020 normal temperatures. All sites examined in this study experienced warming between these two time periods, with increases in mean annual temperature (MAT) ranging from 0.8°C to 1.4°C.
Using the predicted DOYs of flowering onset and termination for each species within each reserve in which it occurred, we then calculated the number of species predicted to be in flower throughout the (broadly defined) flowering season under historical (1911-1940) and modern (1991-2020) temperatures at each site. To compensate for differences in taxonomic diversity among locations, relative species richness of the taxa expected to be flowering at each site and on each DOY was then calculated as the number of species predicted to be in flower on that DOY divided by the total number of species documented to inhabit that site.
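The conversion from species-level predictions to relative daily richness can be sketched as follows. This is a simplified illustration, not the code in 03_Display_Assessment-anoms.ipynb: the function name is hypothetical, and it assumes that each flowering period falls within a single calendar year (onset no later than termination).

```python
def daily_relative_richness(predictions, total_species, n_days=365):
    """predictions: list of (onset_doy, termination_doy) pairs, one per species.

    Returns, for DOY 1..n_days, the fraction of the site's documented species
    predicted to be in flower on that day.
    """
    counts = [0] * n_days
    for onset, termination in predictions:
        first = max(1, round(onset))
        last = min(n_days, round(termination))
        for doy in range(first, last + 1):
            counts[doy - 1] += 1
    return [count / total_species for count in counts]


# Three hypothetical species at a site with four documented species
predictions = [(60, 120), (90, 150), (100, 110)]
richness = daily_relative_richness(predictions, total_species=4)
```

On DOY 100 all three modeled species overlap, so richness[99] is 0.75; on DOY 130 only one is in flower, giving 0.25.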
Evaluating changes in peak species richness of the bloom season
Peak species richness of active flowers within each site under historical and modern conditions was measured as mean number of species predicted to be in flower within a UC Reserve site and under a given climate scenario during the 15-day period in which floral species richness (the number of active, co-flowering species) was predicted to be the highest (note that the timing of this period could differ between sites as well as between historical and contemporary climate scenarios). We then evaluated whether differences between modern (1991-2020) and historical (1911-1940) climate conditions resulted in systemic changes to peak daily species richness of flowers across the UC reserve sites using pairwise t-tests. We further evaluated whether the greater temperature changes over the study period were associated with greater shifts in peak daily richness of flowers by regressing the observed shifts in peak daily species richness of flowers within each UC reserve against the observed change in mean annual temperature within that site.
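The peak-richness metric and the paired comparison across sites can be sketched as below. This is an illustration under stated assumptions: the function names and site values are hypothetical, and the t statistic is computed by hand with the standard library. The same statistic, together with a p-value, is what scipy.stats.ttest_rel(modern, historical) reports.

```python
import math
import statistics


def peak_richness(daily, window=15):
    """Mean daily richness over the `window`-day period with the highest total richness."""
    starts = range(len(daily) - window + 1)
    best = max(starts, key=lambda s: sum(daily[s:s + window]))
    return sum(daily[best:best + window]) / window


def paired_t(historical, modern):
    """Paired t statistic and degrees of freedom for modern minus historical."""
    diffs = [m - h for h, m in zip(historical, modern)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1


# Toy richness curve: a 15-day plateau of 5 species in flower
toy_peak = peak_richness([0] * 10 + [5] * 15 + [0] * 10)

# Hypothetical peak richness at four sites under each climate scenario
historical_peaks = [40.0, 55.0, 32.0, 61.0]
modern_peaks = [38.0, 54.0, 29.0, 60.0]
t_stat, df = paired_t(historical_peaks, modern_peaks)
```

A negative t statistic here indicates lower peak richness under modern conditions in these made-up values; it says nothing about the actual results.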
Evaluating changes in timing and duration of the bloom season
The bloom season was defined in this study as the period of time between the first day of the year on which species richness of active flowers exceeded 20% of the local historical maximum, and the last day of the year on which species richness of active flowers exceeded 20% of the local historical maximum. The duration of the bloom season (i.e., that portion of the year in which the majority of flowering occurs) was measured as the number of days during which species richness of blooming taxa remained above 20% of the historical maximum for that site. This duration was calculated separately for each site under both historical and modern climate conditions. We then tested whether systemic changes to the onset date, termination date, and duration of the bloom season had occurred in response to recent climate changes by comparing each metric under historical and modern climate conditions using pairwise t-tests. We also evaluated whether greater site-specific temperature changes over the study period were associated with greater shifts in onset date, termination date, and duration of the bloom season by regressing the observed change of each of these metrics within each UC reserve against the observed change in mean annual temperature at that site. In order to evaluate whether recent climate change had impacted the duration of high species richness of flowers across these sites, we also tested whether shifts in the duration of the peak bloom season (defined as the number of days in which each site exhibited >75% of its historical maximum richness) had occurred in response to recent warming using similar methods.
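The threshold-based bloom season metrics can be sketched as follows. This is a minimal illustration with a hypothetical function name and a ten-day toy series; it is not the code in 05_Thresholds_anom.ipynb.

```python
def bloom_season(daily, historical_max, fraction=0.20):
    """Onset, termination, and duration of the period in which daily richness
    exceeds `fraction` of the site's historical maximum richness."""
    threshold = fraction * historical_max
    days_above = [doy for doy, r in enumerate(daily, start=1) if r > threshold]
    if not days_above:
        return None
    return {
        "onset": days_above[0],          # first DOY above the threshold
        "termination": days_above[-1],   # last DOY above the threshold
        "duration": len(days_above),     # number of days above the threshold
    }


# Toy richness curve (number of species in flower on DOY 1..10)
toy = [0, 1, 3, 6, 10, 8, 4, 3, 1, 0]
season_20 = bloom_season(toy, historical_max=10, fraction=0.20)
season_75 = bloom_season(toy, historical_max=10, fraction=0.75)
```

With the 20% threshold the toy season runs from DOY 3 to DOY 8 (6 days); with the 75% threshold used for the peak bloom season it covers only DOY 5 and 6.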
Climate warming might also cause changes in the patterns of daily species richness of actively flowering species throughout the bloom season that would not be captured through examinations of changes in the duration, timing, or peak richness of active flowers within the bloom season at each site. For example, warming conditions could result in decreased daily species richness of flowers during some portions of the season. Such changes could be evaluated, however, by determining the magnitude of daily changes in species richness of active flowers that occurred within each site as a response to warming conditions. Thus, to evaluate whether larger magnitudes of warming were associated with greater changes in daily species richness of blooming plants throughout the growing season, for each reserve we calculated the mean absolute departures (or mean absolute errors) between daily historical (1911-1940 normal) and contemporary (1991-2020 normal) species richness of flowering plants throughout the historical bloom season of the flora at that reserve (defined as the period between the DOY on which that flora’s relative species richness of active flowers reached 20% of its maximum under historical conditions, and the DOY on which it subsequently fell below 20%). Mean absolute departures measure the degree to which warming changed the richness of species in flower within each site. Accordingly, this metric can be thought of as a general estimate of the mean magnitude of change in daily richness of flowering species that occurred throughout the growing season in response to climate warming, irrespective of whether such changes involved increases or decreases in daily species richness of active flowers.
In order to determine whether greater warming was associated with greater overall change in patterns of daily species richness of flowers, we then regressed the mean absolute departures in relative daily species richness among sites against the observed differences in mean annual temperature that had been observed between historical (1911-1940) and contemporary (1991-2020) time periods at each site.
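The mean absolute departure for a single reserve can be sketched as below. This is an illustration, not the authors' code: the function name and values are hypothetical, and the historical bloom season is approximated as the span from the first to the last day on which historical richness is at least 20% of its maximum, which matches the definition above only when the seasonal curve has a single peak.

```python
def mean_absolute_departure(historical, contemporary, fraction=0.20):
    """Mean |contemporary - historical| daily richness over the historical bloom season."""
    threshold = fraction * max(historical)
    in_season = [i for i, r in enumerate(historical) if r >= threshold]
    start, end = in_season[0], in_season[-1]
    diffs = [abs(contemporary[i] - historical[i]) for i in range(start, end + 1)]
    return sum(diffs) / len(diffs)


# Hypothetical relative daily richness over a 7-day toy season
historical = [0.0, 0.2, 0.4, 0.5, 0.4, 0.2, 0.0]
contemporary = [0.0, 0.3, 0.5, 0.4, 0.3, 0.1, 0.0]
mad = mean_absolute_departure(historical, contemporary)
```

In this toy case richness rises early in the season and falls late in the season, and both kinds of change contribute equally to the departure of about 0.1.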
Evaluating climate-driven shifts to skewness in seasonal distributions of floral diversity
As individual species collectively shift their flowering phenology in response to changing climate conditions, this may result not only in changes to the timing and diversity of floral resource availability, but also to the shape of the distribution of floral resources across the growing season. Thus, we calculated the skew of the distribution of the species richness of floral resources throughout the growing season under both historical and modern conditions, and then used pairwise t-tests to determine whether phenological responses to modern (1991-2020) conditions exhibited significantly different skew from floral displays under historical (1911-1940) conditions.
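One plausible way to compute such a skew is to treat daily richness as weights on the day of year and take the standardized third moment. The text above does not specify the estimator that was used, so the sketch below is an assumption, and the function name is hypothetical.

```python
def seasonal_skew(daily):
    """Weighted skewness of DOY, using daily richness as the weights.

    Positive values indicate a longer late-season tail; negative values
    indicate a longer early-season tail.
    """
    total = sum(daily)
    doys = range(1, len(daily) + 1)
    mean = sum(d * w for d, w in zip(doys, daily)) / total
    var = sum(w * (d - mean) ** 2 for d, w in zip(doys, daily)) / total
    third = sum(w * (d - mean) ** 3 for d, w in zip(doys, daily)) / total
    return third / var ** 1.5


symmetric = seasonal_skew([1, 2, 3, 2, 1])        # symmetric toy season
late_tail = seasonal_skew([5, 4, 3, 2, 1, 0, 0])  # early peak, long late tail
```

A shift of richness toward the start of the season, with a thinner late-season tail remaining, would appear as an increase in this value.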
Evaluating systemic shifts in species-level phenological timings and durations
Changes in the timing, duration, and peak richness of the community-level flowering display could be caused not only by differences among taxa in the timing of their flowering but also by systematic reductions or increases in their flowering durations. For example, earlier onset of the bloom season could be explained by advances in the timing of a small percentage of species, or by systematic advances across the majority of taxa occupying a given site. Similarly, declines in daily species richness could be explained by increased separation in the timing of flowering onset among taxa in response to warming conditions, or by consistent reductions in flowering duration. Thus, in order to determine whether the changes we observed in community-level flowering metrics were due to systematic changes in the timing of flowering or the duration of flowering, we used pairwise t-tests to test whether the plant species examined in this study had collectively exhibited significant shifts between historical (1911-1940) and modern (1991-2020) conditions in flowering onset, peak flowering, flowering termination, or flowering duration.
