Data from: Reconstructing salmon growth trajectories through biochronologies across a highly variable growthscape
Data files
Apr 08, 2025 version files 1.31 MB
-
exog_intercept_20210430.csv
147.06 KB
-
increments.csv
1.15 MB
-
README.md
10.70 KB
Abstract
Measuring the growth of migratory fish across habitats is difficult because field observations only provide a snapshot of their lives; yet, understanding which habitats provide better growth opportunities is crucial for their conservation. We experimentally enclosed individually tagged juvenile Chinook Salmon (Oncorhynchus tshawytscha) in habitats with known differences in growth potential to evaluate four different models (Dahl-Lea, Fraser-Lee, Biological Intercept, Modified Fry) used to back-calculate size-at-age from otoliths. We found that otolith-derived fish size reconstructions were most accurate using the Biological Intercept or Modified Fry model, though bias remains for slow-growing fish. This tool was then used in a case study to reconstruct the mosaic of inter- and intra-habitat growth opportunities available to fishes, providing a useful framework for assessing and monitoring fish responses to habitat restoration and a changing environment.
https://doi.org/10.5061/dryad.9zw3r22pt
This data repository includes two CSV datasets, exog_intercept_20210430.csv and increments.csv; three R scripts, 01_data_prep.R, 02_oto_model_comparison.R, and 03_oto_model_application.R; six figures; and three tables. All missing data are represented as NA.
Description of the data and file structure
Data Files
exog_intercept_20210430.csv
Description: a large dataset of juvenile fish fork lengths and otolith measurements from various projects, including some outside the scope of this study. The file is in wide format: each individual is represented by a single row.
Variables:
-
lab_id: unique identifier assigned to individual fish.
-
project_code: project-specific identifiers: SBP (Sutter Bypass), DM (Delta monitoring), TT (Tidal Parr), SPR (spring-run), and WR (winter-run). Only SBP fish are included in our analysis.
-
collect_date: date of fish capture, Month / Day / Year format.
-
year: collection year in which the fish were captured (as opposed to Water Year).
-
run: denotes the salmon run to which each individual belongs (Fall, Winter, or Spring). Please note that late-fall and fall fish are both in the “Fall” category.
-
site_id: unique site identifier, though not exclusive to fish within the scope of this project. Generally consistent with the acronyms in the “Location” column of increments.csv for caged fish. Relevant acronyms are described below in the increments.csv variable descriptions.
-
location: location where the fish was captured (more specific than the “region” variable). Does not include acronyms. Consistent with “Region” in increments.csv.
-
region: location where the fish was captured within the watershed (more general than the “location” variable).
-
cwt_no: coded wire tag number associated with an individual fish.
-
fork_length: wet fork length (mm) of fish at capture.
-
sex: sex of the captured fish.
-
adipose_fin: denotes whether the adipose fin is present or absent. “Absent” represents a fish that had its adipose fin removed and is of hatchery origin, “Present” represents a fish that has an adipose fin and may be wild, and “Unknown” represents a fish with either unrecorded or uncertain adipose fin presence.
-
age_class: denotes whether fish is a juvenile or an adult.
-
exog_dist: distance (um) of the otolith’s most posterior (back-most), dorsal (upper-most) primordia to the exogenous feeding increment.
-
edge_dist: distance (um) from otolith’s most posterior, dorsal primordia to its edge. Also referred to as the otolith radius.
-
Notes: section to write notes about specific samples; not standardized.
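As a rough illustration of how the variables above might be read, the following Python sketch parses a small in-memory excerpt, maps the NA marker to a missing value, and keeps only SBP fish, as described for project_code. The column names come from the list above; the sample rows and the helper names (SAMPLE, to_float, load_sbp_rows) are invented for this example, and the snippet is not part of the repository's R scripts.

```python
import csv
import io

# Hypothetical three-row excerpt standing in for exog_intercept_20210430.csv.
# The values are invented for illustration only.
SAMPLE = """lab_id,project_code,fork_length,exog_dist,edge_dist
A001,SBP,52,210.5,480.2
A002,DM,61,NA,530.0
A003,SBP,NA,205.1,455.7
"""


def to_float(value):
    """Convert a CSV cell to float, mapping the 'NA' marker to None."""
    return None if value in ("NA", "") else float(value)


def load_sbp_rows(handle):
    """Return rows with project_code == 'SBP', with numeric columns parsed."""
    rows = []
    for row in csv.DictReader(handle):
        if row["project_code"] != "SBP":
            continue  # other projects are outside the scope of the analysis
        for column in ("fork_length", "exog_dist", "edge_dist"):
            row[column] = to_float(row[column])
        rows.append(row)
    return rows


sbp_rows = load_sbp_rows(io.StringIO(SAMPLE))
```

To read the real file, the same function could be given an open file handle, for example `open("exog_intercept_20210430.csv", newline="")`, assuming the file sits in the working directory.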
increments.csv
Description: back-calculated fork lengths, weights, capture dates, locations, and increment data derived from otoliths (processed using ImagePro Premier) for experimentally enclosed and wild-caught fish. The file is in long format: an individual fish may be represented by multiple rows.
Variables:
-
Sample_ID: unique identifier assigned to individual fish.
-
Tag: denotes where important markings are on the fish otolith. Includes fish hatch (HATCH), exogenous feeding zone (EXOG), and the edge of the otolith (EDGE). All other cells are NA (normal increment, no special significance) or CHECK (an area of potential importance for feeding in the fish’s life).
-
Inc_no: increment number, denoting the fish’s age in days since the onset of exogenous feeding.
-
Inc_distance: distance (um) from otolith’s most posterior, dorsal primordia to the specific increment represented by the row.
-
FL_mm: the final fork length measurement (mm) of the individual fish, or fork length at capture.
-
Between_inc_dist: distance (um) between two adjacent increments, calculated as the increment distance in the same row minus the prior increment distance.
-
Experiment: denotes whether the fish had been experimentally enclosed (Cage) or wild-caught (Wild).
-
Genetics_specific: results of genetic testing for a subset of fish, showing their stream of origin.
-
Location: consists of 23 different sampling location names. Acronyms include SBKR (Rice field north of Kirkville Road), BCGR (Butte Creek near Sanborn Slough), BCLR (Butte Creek at Laux Road), BCMR (Butte Creek at Mallard Ranch), BSMR (Mallard Ranch), BSSS (Sanborn Slough), FRLF (Lundberg Pier), RGSAC (River Garden Farms), RIC (Sutter National Wildlife Refuge Inlet Canal), SACCL (Colusa Landing), SACTIS (Tisdale Weir), SACWL (Ward’s Landing), SBFR (Bean field), SBLF (Lundberg Farms, West Borrow), SBSNWR (Sutter National Wildlife Refuge, East Borrow), SBWS (Willow Slough), SRCCL (Sacramento River at Colusa Landing), SRSCCW (Sacramento Side Channel at Colusa Weir), XSSAC (Conaway Pump). Other names (non-acronym) include the Boat channel at Sanborn Slough, East margin of Sutter Bypass, Lundberg farms phase I field, and North wetland at Mallard Ranch.
-
Date: date of fish capture, Month / Day / Year format.
-
Wt: wet weight (g) of the individual fish at capture.
-
Type: denotes what habitat type the fish was experimentally enclosed in (Agriculture, Channel, or Wetland). Wild fish are simply denoted as “Wild”.
-
Region: location where the fish was captured. Consistent with “location” in exog_intercept_20210430.csv.
-
Year: denotes an individual fish’s collection year (for every individual, this also matches the Water Year).
-
Fish_ID: unique identifier assigned to individual fish. Consistent with lab_id in exog_intercept_20210430.csv.
-
backinc: number of increments (days) between the row’s increment and the fish’s capture date. Also known as the back-calculated increments.
-
backdate: calculated from the “backinc” and “Date” variables. Denotes the date (Month / Day / Year) on which the row’s increment was formed.
-
Edge: distance (um) from otolith’s most posterior, dorsal primordia to its edge. Also referred to as the otolith radius.
-
Name: the specific location where a fish was experimentally enclosed, or where a wild fish was captured.
-
Report_type: similar to the “Type” variable; additionally specifies the channel type. Options include Agriculture, Canal channel, River channel, Wetland, or NA when data are unavailable. Please note that Report_ID LBC2 has Report_type Canal channel, though its Type is Agriculture.
-
Report_ID: denotes the IDs of different locations. Each consists of a three-letter identifier and a one-digit number.
-
lat: latitude of enclosure, or of wild-caught fish’s capture.
-
long: longitude of enclosure, or of wild-caught fish’s capture.
-
FL_interim: denotes the measured fork length (mm) for experimentally enclosed fish. Four measurements were taken in total, with the last being the fork length at capture.
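Two of the variables above are derived from others: Between_inc_dist from Inc_distance, and backdate from Date and backinc. The Python sketch below shows one way these derivations could be reproduced. It assumes one increment per day, as the Inc_no and backinc descriptions imply, and assumes the first increment of a fish has no prior value; how the dataset itself fills that first row is not stated here. The function names are hypothetical.

```python
from datetime import date, timedelta


def between_increment_distances(inc_distances):
    """Distance (um) between each increment and the one before it.

    The first increment has no predecessor, so None is used as a placeholder
    (an assumption; the source file may treat this row differently).
    """
    if not inc_distances:
        return []
    result = [None]
    for previous, current in zip(inc_distances, inc_distances[1:]):
        result.append(current - previous)
    return result


def backdate(capture_date, backinc):
    """Date an increment formed, assuming one increment is laid down per day."""
    return capture_date - timedelta(days=backinc)


# Invented example values for a single fish.
distances = between_increment_distances([200.0, 203.5, 207.5])
formed_on = backdate(date(2019, 3, 15), 10)
```

With these invented inputs, `distances` holds the placeholder followed by the two between-increment distances, and `formed_on` is the date ten days before the capture date.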
R Scripts
01_data_prep.R
Section of code that loads the two datasets (exog_intercept_20210430.csv and increments.csv) and creates additional files in .csv format (otofish, sample_overview, study_overview, oto_flcomp). It also removes fish that were excluded from the analysis or that belong to other studies outside the scope of this paper. Requires the tidyverse, RFishBC, here, and stats packages.
02_oto_model_comparison.R
Section of code that compares the Dahl-Lea, Fraser-Lee, Biological Intercept, and Modified Fry models. Incorporates the caged fish data and builds the linear mixed-effects models to determine which model performs best. Requires the tidyverse, lme4, modelsummary, broom.mixed, and mlmhelpr packages.
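For orientation, three of the four back-calculation models can be written in a few lines. The Python sketch below uses the commonly published textbook forms of the Dahl-Lea, Fraser-Lee, and Biological Intercept equations; it is not taken from 02_oto_model_comparison.R or from RFishBC, and the parameterization used in the analysis may differ. The Modified Fry model is omitted because it requires additional fitted parameters. All function and argument names are illustrative.

```python
def dahl_lea(s_i, s_c, l_c):
    """Dahl-Lea: fish length is directly proportional to otolith radius.

    s_i: otolith radius at increment i (um)
    s_c: otolith radius at capture (um)
    l_c: fork length at capture (mm)
    """
    return l_c * s_i / s_c


def fraser_lee(s_i, s_c, l_c, c):
    """Fraser-Lee: proportional growth above a length intercept c (mm).

    c is usually taken from a regression of fish length on otolith radius.
    """
    return c + (l_c - c) * s_i / s_c


def biological_intercept(s_i, s_c, l_c, s_0, l_0):
    """Biological Intercept: line through (s_c, l_c) and a biologically
    determined point (s_0, l_0), such as size at the start of feeding."""
    return l_c + (s_i - s_c) * (l_c - l_0) / (s_c - s_0)


# Invented example: a 60 mm fish with a 600 um otolith radius at capture,
# back-calculated to the increment where the radius was 300 um.
dl = dahl_lea(300.0, 600.0, 60.0)
fl = fraser_lee(300.0, 600.0, 60.0, c=20.0)
bi = biological_intercept(300.0, 600.0, 60.0, s_0=200.0, l_0=30.0)
```

In all three forms, setting s_i equal to s_c returns the fork length at capture, so the models only differ for increments earlier than the capture date.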
03_oto_model_application.R
Section of code that applies the best-fitting model to wild fish otoliths. Creates a heat map of the estimated growth rates. Requires the tidyverse, viridis, and ggforce packages.
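One simple way to turn a series of back-calculated fork lengths into growth rates is to difference consecutive values. The Python sketch below does this under the assumption that each increment represents one day, so the differences are in mm per day. This is an illustration only; 03_oto_model_application.R may define or smooth growth rates differently, and the function name is hypothetical.

```python
def daily_growth_rates(fork_lengths_mm):
    """Growth (mm/day) between consecutive daily back-calculated fork lengths.

    Assumes the input is ordered from oldest to most recent increment and
    that one increment corresponds to one day.
    """
    return [
        later - earlier
        for earlier, later in zip(fork_lengths_mm, fork_lengths_mm[1:])
    ]


# Invented example: three consecutive daily back-calculated fork lengths (mm).
rates = daily_growth_rates([40.0, 40.5, 41.5])
```

A series of n lengths yields n - 1 rates, with the last value reflecting the most recent growth before capture.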
Tables and figures
Figure 1. Overview of the study region in California’s Central Valley, USA. The study site consisted of locations in the Sutter Bypass and Butte Creek. Cage locations are shown as blue squares with black centers and wild fish sampling sites are shown as yellow crosses.
Figure 2. Otolith radius to fork-length relationship for Chinook Salmon from the California Central Valley. Gray data points are from the otolith database at the Center for Watershed Sciences, UC Davis.
Figure 3. Comparison of reconstructed growth rate trajectories from the DAL, F-L, BI and MF models with repeatedly measured fish sizes (n=4, black dots) from the enclosure experiment. Each graph represents an individual, differentiated by fish ID (at the top).
Figure 4. Observed enclosure fish lengths vs predicted fork length for the repeated measures experiment contrasting the four different otolith back-calculation models (Dahl-Lea, Fraser-Lee, Biological Intercept, Modified Fry). Dashed line represents the 1:1 line between predicted and observed growth. Sample point 1 corresponds to the start of the experiment, and Sample point 3 is the last sampling point for which fish size was calculated. The last sampling point (Sample point 4) is when collection occurred; at that point model estimates equal the observed size, so it is not shown.
Figure 5. Percent error (predicted growth rate - observed growth rate) for fish in the enclosure experiment. Each circle represents an individual’s mean growth rate from exogenous feeding to lethal sampling/end of experiment.
Figure 6. Modified Fry (MF) reconstructed daily growth for individual wild-caught juvenile Chinook Salmon by year and capture location. Growth rates farthest to the right for each individual reflect the most recent growth at time and location of capture.
Table 1. Sample collection information for caged and wild-caught fish. Includes the region in which a sample was collected, the water year in which it was collected, the type of enclosure (Agriculture, Channel, or Wetland) or whether it was wild-caught (Wild), the location of the enclosure (Latitude, Longitude) or NA denoting a wild-caught fish, and the number of fish used in this research (n).
Table 2. Variables from the fork length conversion equations, and subsequent values used for analysis.
Table 3. Results from the linear mixed-effects models comparing longitudinal field fish size observations with the four otolith back-calculation models. Different criteria are included to show how the model results may change depending on the metric. Note that AIC, BIC, and log likelihood are very similar tests.
Code/Software
This code was created and run through RStudio (version 4.4.2). Back-calculation of otolith increments was performed by the RFishBC R package, version 0.2.4.9000.