Data from: Incorporating local information to predict thermal stress for diverse species
Data files
Feb 12, 2025 version files 56.51 MB
- Coho_spatialdistrib_abovedams.csv (1.16 MB)
- Coho_spatialdistrib.csv (16.01 MB)
- README.md (4.20 KB)
- STL_spatialdistrib_abovedams.csv (18.42 MB)
- STL_spatialdistrib.csv (20.91 MB)
Abstract
Pacific salmonids are incredibly diverse and critical for both ecosystems and human consumers. Although salmon conservation recognizes the importance of diversity for viability, most previous studies have oversimplified phenology and life history diversity that dictate local environmental exposure and influence responses to climate change. I combined subpopulation-level spatial distributions and phenologies with monthly stream temperature to explore modern and future patterns in freshwater thermal stress for 449 subpopulations across 21 coho salmon (Oncorhynchus kisutch) and steelhead trout (O. mykiss) management units, 14 of which are listed as either Threatened or Endangered under the United States Endangered Species Act. Under modern conditions, 37% of coho and 90% of steelhead subpopulations were exposed to thermally stressful conditions. Under a simple 2°C climate warming scenario, 91% of subpopulations and the majority of subpopulations in 20 of 21 management units would be thermally stressed during at least one life stage. For diverse species like salmon, incorporating local-scale phenology, spatial, and climatic information is imperative to identify subpopulations that will be negatively impacted by future climate warming without mitigating actions.
https://doi.org/10.5061/dryad.x0k6djht4
Description of the data and file structure
This README file is associated with the data and analyses from the publication "Incorporating local information to predict thermal stress for diverse species" by FitzGerald, A. M., 2025, Canadian Journal of Fisheries & Aquatic Sciences. The coho salmon and steelhead trout spatial distribution datasets are csv files (n = 4), R scripts are provided to replicate data (n = 2), and supplemental results are provided (n = 2).
Each csv dataset has 51 columns, detailed below. The files are:
- "Coho_spatialdistrib.csv" (35,374 rows, including headers)
- "Coho_spatialdistrib_abovedams.csv" (2,546 rows, including headers)
- "STL_spatialdistrib.csv" (50,984 rows, including headers)
- "STL_spatialdistrib_abovedams.csv" (39,989 rows, including headers)
Each row represents a single occurrence within a stream reach. Metadata for the columns is as follows:
- 'OccurrenceUniqueID' is a unique identification number.
- 'Latitude' and 'Longitude' are in a WGS-84 geographic coordinate system.
- 'ESU' (Evolutionarily significant unit, for coho salmon) or 'DPS' (Distinct population segment, for steelhead trout) represents the management unit.
- 'Status' is the United States federal listing status under the Endangered Species Act.
- 'ESU_Stream' is the name of the phenology group. This is how spatial location was linked with phenology.
- 'RunType' is a generic descriptor of the seasonal freshwater entry timing for salmonid populations. Run types were either obtained from the original data source or defined by the author based on spatial or temporal information.
- Columns 8:22 ('OBSPRED_ID' to 'BFI') are taken directly from the stream temperature modeling (FitzGerald et al. 2021 Global Change Biology) and were spatially joined to each occurrence point. Specifics for those columns may be found in that paper.
- Columns 23:34 ('Water_Jan' to 'Water_Dec') are the estimated stream temperatures (degrees C) for each month from the stream temperature model, spatially joined to each occurrence point. These temperatures were used to calculate thermal exposure.
- Columns 35:44 ('Peak_Arr_mo' to 'Peak_Outmig_mo') show the peak month of each life stage and/or the duration of that life stage, obtained from the phenology database or calculated based on thermal exposure, when applicable (see details in paper).
- Columns 45:51 ('ARRIVALpk' to 'REARINGcore') are the stream temperatures for each life stage for each occurrence, based on the phenology database. Here, 'pk' means the peak month of that life stage, and 'core' means the core duration of that life stage, from the peak of the first life stage to the peak of the subsequent life stage (see details in paper).
The R script "FitzGerald_2025_IncubationEstimates.R" compares estimates of coho incubation using the Effective Value model (see Sparks et al. 2019 CJFAS) and the iterative approach used in this study. This script was originally provided by an anonymous reviewer and adapted by AMF. It imports the data file "Coho_spatialdistrib.csv" to compare methods. The second R script, "FitzGerald_2025_STL.R", replicates the steelhead trout results, figures, and tables. To replicate the coho salmon results, read in the coho datasets in place of the steelhead datasets, and factor by coho ESUs (groups) rather than steelhead DPSs.
The two supplemental files are named "cjfas-2024-0189suppla.docx" and "cjfas-2024-0189supplb.docx". They provide the supplemental tables ("cjfas-2024-0189suppla") and figures ("cjfas-2024-0189supplb") that are referenced in the manuscript.
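The column ranges described above can be turned into a small lookup for anyone working with the csv files outside of R. The following is a minimal Python sketch, not part of the published scripts. The group labels (such as "monthly_temperature") and the function name are hypothetical; only the 1-based column ranges and the 51-column count come from this README.

```python
# 1-based, inclusive column ranges as stated in the README text.
# The dictionary keys are illustrative labels, not names used in the data files.
COLUMN_GROUPS = {
    "stream_model_covariates": (8, 22),  # 'OBSPRED_ID' to 'BFI'
    "monthly_temperature": (23, 34),     # 'Water_Jan' to 'Water_Dec'
    "phenology": (35, 44),               # 'Peak_Arr_mo' to 'Peak_Outmig_mo'
    "life_stage_temperature": (45, 51),  # 'ARRIVALpk' to 'REARINGcore'
}


def split_header(header):
    """Map each group label to the column names that fall in its range."""
    if len(header) != 51:
        raise ValueError(f"expected 51 columns, found {len(header)}")
    # Convert the 1-based inclusive range to a 0-based Python slice.
    return {
        label: header[start - 1:end]
        for label, (start, end) in COLUMN_GROUPS.items()
    }
```

The header row of any of the four files could be read with Python's `csv` module (for example, `next(csv.reader(f))` on an open file) and passed to `split_header`. The length check is a quick way to confirm that a file matches the 51-column layout before relying on positional ranges.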
Sharing/Access information
Data was derived from the following sources
- FitzGerald, A. M., John, S. N., Apgar, T. M., Mantua, N. J., & Martin, B. T. (2021). Quantifying thermal exposure for migratory riverine species: Phenology of Chinook salmon populations predicts thermal stress. Global Change Biology, 27(3), 536-549. https://onlinelibrary.wiley.com/doi/full/10.1111/gcb.15450
Spatial distribution datasets
Coho (n = 102,288; July 2020) and steelhead (i.e., the anadromous form of O. mykiss) observations (n = 154,724; August 2021) in California, Oregon, Washington, Idaho, and Montana were extracted from six point-observation and distribution data sources. I removed any observations that were outside of a DPS (based on DPS distribution shapefiles, obtained from http://www.westcoast.fisheries.noaa.gov/maps_data/Species_Maps_Data.html) to avoid fish stocking and transplants. I next vetted the observations, removing those that had a georeferenced accuracy > 500 m, were more than 500 m from the stream network, were observed prior to 1993 (to match the temporal purview of the stream temperature modeling; see below), or were duplicated. This resulted in a single observation per stream segment. Observations in Idaho or Montana were removed because the stream temperature predictions did not include those states (FitzGerald et al. 2021), but I mention them above for completeness. Next, each remaining observation was defined by freshwater life stage and joined with the appropriate phenology. In uncommon cases, observations with an unspecified life-history type and without a date were present in a location with more than one life-history type, so I duplicated the spatial distribution in those areas so that I had observations from each life-history type.
Spawning and redd observations were extracted, and I refer to these observations in-text as “modern”. Because I evaluated a single observation per stream km due to a lack of standardization across survey methods, these modern observations correspond to each subpopulation’s spawning spatial distribution. Because data collection was not standardized, I treated these observations as presence data rather than as true estimates of density or abundance. Although presence data can be biased based on sampling effort and detection probability, Pacific salmon are some of the most well-studied, longest-studied, and most recognizable species by both researchers and citizens, such that the presence data should be treated as true presences.
To define historically accessible reaches above dams, I first extracted all stream reaches in the “historical watershed: anthropogenically blocked” boundary for each DPS (NOAA Fisheries Species Ranges - Salmon and Steelhead [West Coast Region], https://www.fisheries.noaa.gov/resource/map/species-ranges-salmon-and-steelhead-west-coast-region). I made two updates to this “blocked” boundary layer. First, habitat in the Elwha River watershed was removed because this area is no longer blocked by dams. Second, habitat upstream of Nicasio Dam in the Lagunitas watershed was added, despite not being included in the “blocked” boundary layer. Dams have blocked a total of approximately 10,220 and 56,507 stream km in coho and steelhead watersheds, respectively. Some of these reaches were likely inaccessible to salmonids due to environmental barriers, so I next eliminated reaches upstream of steep gradients (coho: slope > 5% gradient; steelhead: slope > 12% gradient [Burnett et al., 2003; Agrawal, 2005]) or upstream of natural barriers like waterfalls (GIS Data Sets - StreamNet). Reaches upstream of anthropogenic barriers like dams or culverts were included with the idea that fish passage could be incorporated in the future. Next, I removed reaches with flow conditions that likely prevent spawning for most individuals (mean annual flow < 2.11 cfs [0.06 cms]; Burnett et al., 2003; Agrawal, 2005). Coho and steelhead, unlike Chinook salmon, can readily exploit intermittent or ephemeral streams, which were included in this analysis. I then defined each subpopulation’s “above dams” habitat. In other words, if a dam were removed, I defined which subpopulation(s) would occupy that habitat. Finally, I assumed that each potential subpopulation above dams would exhibit the same phenology as its modern subpopulation. I refer to this potential habitat above dams as “above dams”. For coho, I analyzed 2,281 km above dams for 21 subpopulations across 4 DPSs. For steelhead, I analyzed 32,741 km above dams for 103 subpopulations across 11 DPSs.
Temporal (phenology) distribution database
I reviewed metadata and primary sources on coho and steelhead to develop a phenology database for subpopulations in California, Oregon, and Washington. Specifically, I defined the timing (range, duration, and peak) for adult arrival on the spawning grounds, spawning, and smolt outmigration for 210 coho subpopulations across 7 DPSs and 451 steelhead subpopulations across 14 DPSs. Note that some subpopulations with phenology data did not have associated observations below dams, but I list all subpopulations with phenology data for completeness. From this phenology database of adult arrival, spawning, and smolt outmigration, I was able to calculate phenology for other life stages: pre-spawn holding, incubation, emergence, and natal rearing. In general, the window of each life stage was defined between the peak months of the life stage immediately preceding and following it. In other words, the window of the holding period was defined from peak arrival to peak spawning, the incubation period was defined from peak spawning to peak emergence (see below), and the rearing period was defined from peak emergence to peak smolt outmigration. Although some individuals may occur outside of these windows (e.g., early arrivals prior to peak arrival or late spawners after peak spawning), this definition encompasses most individuals. Note that “pre-spawn holding” for subpopulations spawning a few days after arriving to the spawning grounds resulted in identical arrival and spawning months and a holding duration of 0.
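The window definition above (from the peak month of one life stage to the peak month of the next) involves month arithmetic that wraps past December. The following minimal Python sketch illustrates that arithmetic. The function names are hypothetical, and two assumptions are made that the text does not state: the window is treated as inclusive of both peak months, and each window is assumed to be shorter than 12 months (which would not hold for multi-year rearing).

```python
def months_between(start_month, end_month):
    """Months elapsed from start_month to end_month (both 1-12), wrapping past December.

    Assumes the window is shorter than 12 months.
    """
    return (end_month - start_month) % 12


def window_months(start_month, end_month):
    """Calendar months spanned by the window, including both peak months (an assumption)."""
    n = months_between(start_month, end_month)
    return [(start_month - 1 + k) % 12 + 1 for k in range(n + 1)]
```

For example, a subpopulation with peak arrival in November (11) and peak spawning in January (1) has a holding duration of 2 months spanning November, December, and January. When arrival and spawning peak in the same month, the duration is 0, which matches the holding duration of 0 noted above.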
Emergence timing was not directly observed for most subpopulations. I therefore calculated emergence timing based on relationships between thermal exposure and development time. For coho, I calculated emergence timing from spawn timing using the Bělehrádek (1930) model (Alderdice and Velsen 1978, Crisp 1981, Beacham and Murray 1990, Spence 1995) (Eq. 1):
Eq. 1 y = a / (T - c)^b
where y is the predicted number of days from fertilization to emergence, T is the mean incubation temperature, and a, b, and c are constants, parameterized as a = 1164.45, b = 1.08, and c = -2.21 for coho emergence by Spence (1995; n = 94, R² = 0.976). Because temperature changes seasonally and y is not known in advance, T, the mean temperature over the entire incubation period, is also not known directly. To estimate y without knowing T, I applied a coarse iterative approach. First, I calculated y by assuming that T was represented by the temperature of the peak spawning month (n = 1 month). If the emergence estimate (y converted to months) was greater than n, I assumed that incubation must have spanned at least two consecutive months and recalculated y using the mean T of those two months (n = 2). If y was still greater than n, I extended the window to three consecutive months (n = 3), and so on. I repeated this process until the emergence estimate (y in months) equaled the number of months used to calculate T (i.e., y = n). This approach is somewhat similar to the Effective Value model developed by Sparks et al. (2019) for sockeye salmon and implemented by Adelfio et al. (2024) for other salmonids, and both models predicted similar incubation lengths in my dataset.
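The iterative approach can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's R implementation. The constants a, b, and c are those given above for Spence (1995). The conversion from days to months (an average month of 30.44 days, rounded up), the stopping rule (stop once the estimate in months no longer exceeds n), and the function names are all assumptions introduced here.

```python
import math

# Coho emergence constants from Spence (1995), as given in the text.
A, B, C = 1164.45, 1.08, -2.21
DAYS_PER_MONTH = 30.44  # assumption: average month length used for the conversion


def belehradek_days(mean_temp, a=A, b=B, c=C):
    """Eq. 1: predicted days from fertilization to emergence, y = a / (T - c)^b."""
    if mean_temp <= c:
        raise ValueError("mean temperature must exceed c")
    return a / (mean_temp - c) ** b


def incubation_months(monthly_temps, spawn_month, max_months=12):
    """Extend the incubation window month by month until the estimate fits inside it.

    monthly_temps: 12 mean monthly temperatures, January first.
    spawn_month: peak spawning month (1-12).
    Returns (n, mean_temp) for the first window of n months whose predicted
    incubation length, rounded up to whole months, does not exceed n.
    """
    for n in range(1, max_months + 1):
        # 0-based indices of the n consecutive months starting at peak spawning.
        idx = [(spawn_month - 1 + k) % 12 for k in range(n)]
        mean_temp = sum(monthly_temps[i] for i in idx) / n
        y_months = math.ceil(belehradek_days(mean_temp) / DAYS_PER_MONTH)
        if y_months <= n:  # assumption: "<=" rather than "==" guarantees a stop
            return n, mean_temp
    raise ValueError("incubation estimate did not converge within max_months")
```

As a rough check, Eq. 1 at a constant 10°C predicts about 78 days, so this sketch returns a three-month window for `incubation_months([10.0] * 12, spawn_month=11)`. With seasonally varying temperatures, the estimate can drop below n when a warm month enters the window, which is why the sketch stops at the first n that is large enough rather than requiring exact equality.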
For steelhead, I developed a model to calculate the length of incubation based on temperature. I included data from a variety of subpopulations, spanning incubation temperatures from 2-18°C (Embody 1934, Wales 1941, Reynolds et al. 1990, Turner et al. 2007, Quinn 2018). Note that most of these observations (n = 34) were from laboratory experiments, usually under constant incubation temperatures with oxygen saturation, and that higher temperatures were associated with higher embryonic mortality (Turner et al. 2007). I fit a generalized additive model (GAM) with the form (Eq. 2):
Eq. 2 d ~ s(T)
where d is the number of days it took to hatch at T, the average incubation temperature (°C), and s is a smooth function. The model fit the data well, with adjusted R² = 0.989 (Figure 3B). I then estimated the length of incubation for mean T during incubation using the iterative approach described above for coho. Emergence for steelhead was then presumed to be 2–8 weeks post-hatching. For both coho and steelhead emergence models, I converted the number of days from peak spawn timing to emergence into months to match the stream temperature resolution (see below). I did not calculate emergence timing if average incubation temperature was reported to be < 0.6°C or > 17°C, as these temperatures result in 100% mortality of embryos (Tang et al. 1987). These approaches, which are based on mean monthly temperatures, are coarser than the Effective Value model, which uses daily temperatures; inputting daily temperatures would result in more accurate estimates of emergence timing. Because I used mean temperature over the estimated incubation length and converted to a coarse temporal resolution (monthly), the emergence estimate is approximate.
Spatiotemporal distributions linked to stream temperature
I linked each subpopulation’s modern spatiotemporal distribution and above dams habitat with spatially and temporally continuous temperature to estimate thermal exposure. I obtained the mean monthly stream temperature from 2002–2011 (Scenario 2 sensu Isaak et al., 2017), hereafter referred to as “modern” temperatures, from FitzGerald et al. (2021). In brief, stream temperature was predicted using a spatial stream network (SSN) model (Isaak et al., 2017) applied to the National Stream Internet (NSI) network (Nagel et al., 2015) at a ~1 km reach resolution for each month of the year. The SSN statistical regression model is ideal for streams because it accounts for spatial autocorrelation due to stream-network structure (Peterson and Ver Hoef, 2010; Ver Hoef and Peterson, 2010; Isaak et al., 2014; Isaak et al., 2017). The model fit was then used to predict mean monthly stream temperature for 465,775 river km spanning most of California, Oregon, and Washington. Predicted error for the out-of-sample testing dataset supported the use of the prediction dataset (mean RMSE = 1.351°C; mean MAPE = 0.915°C; r2 = 0.928). Additional stream temperature modeling details and results can be found in FitzGerald et al. (2021).
Based on previous research in the western U.S., streams in this study area are projected to warm by approximately 1°C by ~2040 and 2°C by ~2080 (Isaak et al. 2017). I applied these future stream temperature increases to every river km in order to simulate future stream temperatures and evaluate thermal stress under two simple, yet informative, climate change scenarios (FitzGerald et al. 2021).
Analyses
I compared thermal exposure to federal salmonid thermal criteria (thresholds). In brief, these criteria, which are specific to each life stage, were developed to protect salmonids in freshwater. Thermal stress at any 1 km stream segment was defined when a life stage exceeded its thermal criterion: adults holding prior to spawning > 16°C, spawning, incubation, or emergence > 13°C, and rearing > 16°C. These thresholds do not denote upper thermal limits but rather the temperatures above which performance declines.
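The criteria above, combined with the uniform +1°C and +2°C warming scenarios, reduce to a simple comparison per reach and life stage. The following minimal Python sketch shows that comparison. The threshold values come from the text; the dictionary keys, the function name, and the `warming` parameter are hypothetical names introduced for illustration.

```python
# Life-stage thermal criteria (degrees C) as stated in the text.
CRITERIA = {
    "holding": 16.0,
    "spawning": 13.0,
    "incubation": 13.0,
    "emergence": 13.0,
    "rearing": 16.0,
}


def is_stressed(life_stage, temp, warming=0.0):
    """True if the (optionally warmed) temperature exceeds the life-stage criterion.

    warming: uniform increase in degrees C applied to the modern temperature,
    e.g. 1.0 or 2.0 for the two future scenarios.
    """
    return temp + warming > CRITERIA[life_stage]
```

For example, a reach at 15.5°C during pre-spawn holding is not stressed under modern conditions but is under the +1°C scenario. The comparison is strict, following the "> 16°C" and "> 13°C" wording, so a reach exactly at a criterion is not counted as stressed.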
First, I quantified mean monthly thermal exposure on the spawning grounds for each subpopulation for each life stage – from arrival through outmigration – to portray how thermal exposure changes throughout the freshwater life cycle. I next assessed thermal stress by quantifying the proportion of each spatial distribution (i.e., the proportion of stream length from each subpopulation) that exceeded thermal criteria in the warmest month of exposure during pre-spawn holding, incubation, and rearing (note that the warmest month for these life stages does not include fringe months that occur outside of the peaks for each life stage). I evaluated thermal exposure and thermal stress under modern temperatures and the two future climate change scenarios for both modern observations and historical spatial distributions above dams. I compared modern thermal stress levels below vs. above dams using Welch’s two-sample t-tests and quantified the magnitude of thermal stress increases with stream warming. For subpopulations with a minimum of 10 stream km, a subpopulation was considered thermally stressed if ≥ 25% of that subpopulation’s habitat was stressed for at least one life stage. While defining thermal stress above this threshold was somewhat arbitrary, the majority of Pacific salmon populations require significant improvements in survival to meet their recovery goals, such that any additional stress that impacts abundance may reduce recovery likelihood. Finally, I ran a series of simple and multiple linear regression models for each life stage to determine if thermal stress at modern observations was better predicted by latitude, elevation, arrival month, length of holding period, or any combination of these factors. These factors are suspected to influence thermal exposure and stress in other salmonids. Before fitting the models, I transformed and scaled each factor and confirmed that no factor pairs were strongly correlated. Models were compared using AIC.
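The subpopulation-level rule described above (at least 10 stream km, and ≥ 25% of habitat stressed for at least one life stage) can be expressed as follows. This is a minimal Python sketch with hypothetical function and argument names. It assumes that each reach has a known length in km and a warmest-month temperature for the life stage in question.

```python
def proportion_stressed(reach_lengths_km, reach_temps, criterion):
    """Proportion of total stream length whose temperature exceeds the criterion."""
    total = sum(reach_lengths_km)
    stressed = sum(
        length for length, temp in zip(reach_lengths_km, reach_temps)
        if temp > criterion
    )
    return stressed / total


def subpopulation_stressed(proportions_by_stage, total_km, min_km=10.0, threshold=0.25):
    """Apply the 10 km minimum and the 25% rule across life stages.

    Returns None for subpopulations below the minimum length (not classified),
    otherwise True if any life stage meets or exceeds the threshold.
    """
    if total_km < min_km:
        return None
    return any(p >= threshold for p in proportions_by_stage.values())
```

For instance, four 1-km reaches at 12, 17, 15, and 18°C give a proportion of 0.5 against a 16°C criterion. Returning `None` for subpopulations under 10 km is a design choice made here to keep unclassified subpopulations distinct from unstressed ones.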
All analyses and models were run in R v.3.6.2.
- FitzGerald, Alyssa (2025). Data from: Incorporating local information to predict thermal stress for diverse species. Zenodo. https://doi.org/10.5281/zenodo.14728671
- FitzGerald, Alyssa (2025). Data from: Incorporating local information to predict thermal stress for diverse species. Zenodo. https://doi.org/10.5281/zenodo.14728670
- FitzGerald, Alyssa M (2025). Incorporating local information to predict thermal stress for diverse species. Canadian Journal of Fisheries and Aquatic Sciences. https://doi.org/10.1139/cjfas-2024-0189
