Data from: A 16-year green sturgeon population survey: Investigating river discharge, species identification, sampling technology, survey extent, and possible spawning cyclic dominance
Data files
Jun 17, 2026 version files 106.82 MB
- count_flow_analysis.zip (103.03 MB)
- population_size_data.csv (3.78 MB)
- README.md (3.01 KB)
- species_ids_key.txt (621 B)
- species_ids.csv (5.49 KB)
Abstract
Monitoring and understanding the population structure of anadromous fishes is vital to their conservation. The southern distinct population segment of green sturgeon (Acipenser medirostris) is a threatened long-lived anadromous fish, for which we have insufficient information about fundamental aspects of population structure. To gather these data, we began monitoring the population in 2010 with annual surveys using acoustic imagery, and recently have updated the field and analytical methods. In this work we 1) examine the transition between the two methods using both Dual-frequency Identification Sonar (DIDSON) and consumer-grade side-scan sonar products, 2) use underwater videography to test methodological assumptions about sturgeon species identification, 3) confirm the spatial extent of the spawning grounds with side-scan sonar and publicly available acoustic telemetry data, and 4) update the population estimates. We find the following: there is a consistent detectability relationship between the two acoustic imagery methods, the species we are detecting is indeed green sturgeon rather than a sympatric congener, and the extent of the historical survey covers the entire spawning ground. Through our population estimate update, we find potential evidence for a 4-year spawning cyclic dominance where one spawning line is significantly larger than the others. The larger line is expected to spawn again in 2026. We believe that this information will be useful to those attempting to recover this species.
Dataset DOI: 10.5061/dryad.bk3j9kdqw
Description of the data and file structure
Files and variables
See further details in the Methods section of this dataset
File: count_flow_analysis.zip
Description: The R project of the flow vs count analysis.
This project was run in R (version 4.5.2) using RStudio (2025.09.2 Build 418).
It uses the packages:
tidyverse 2.0.0
furrr 0.3.1
minpack.lm 1.2-4
viridis 0.6.5
here 1.0.2
cder 0.3-1
The main project file is flow_analysis.Rproj
Folder: scripts
The scripts folder contains the scripts to run the analysis. Running the run_all.R script in the scripts folder will run all other scripts in the correct order.
run_all.R calls in order:
1) load_libraries_ect.R
2) read_telemetry_data.R
3) calc_coor_with_flow.R (this calls run_bays_lm.R and calc_daily_avg_gauge_data.R)
4) telem_and_flow_coor.R (this calls run_bays_lm.R and calc_daily_avg_gauge_data.R)
Folder: inputs
This folder contains the inputs for the analysis.
parameters.rds: a large set of life history parameters used in the current analysis and produced by this previous analysis: DOI: 10.7291/D10Q2M
telemetry_data.rds: the green sturgeon telemetry data from PATH https://fishdb.wfcb.ucdavis.edu/
Folder: outputs
The data in this folder are generated while the scripts run. Two files are currently in the folder. Both can be regenerated by the scripts if they are told to query the online databases. As packaged, the scripts simply read these files rather than querying the online databases.
gauge_data.rds is the relevant flow data for use in comparison to fish counted in the survey.
telem_gauge_data.rds is the relevant flow data for use in comparison to fish detected by telemetry receivers.
File: species_ids_key.txt
The key for the columns and variables used in the file species_ids.csv.
File: species_ids.csv
Description: The species IDs of the fish seen on underwater footage
Variables
- date: date of GoPro deployment
- file: file name
- time_stamp: time in the video at which the fish was seen
- basis: the basis on which the species ID was made (see species_ids_key.txt for more information)
- DSC: the dorsal scute count
- LSC: the lateral scute count
- VSC: the ventral scute count
- species: species ID
- notes: notes
File: population_size_data.csv
Description: The estimated counts from each detection type, both corrected and uncorrected
Variables
- year: year of survey
- run: a simulation run ID
- census: number of the census (1st or 2nd)
- estimate: estimate of green sturgeon
- type: method and level of correction of estimate
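As an illustration of how the five columns listed above can be used, the following is a minimal Python sketch that groups the simulation runs by year, census, and type and reports the median estimate. The sample rows, the `type` labels, and the function name `summarize` are hypothetical; the real file's values and labels are not reproduced here, and it is assumed that `estimate` is always numeric.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical sample rows with the documented column names.
# The `type` labels below are placeholders, not the labels used in the real file.
SAMPLE = """year,run,census,estimate,type
2023,1,1,100,example_type_a
2023,2,1,120,example_type_a
2023,1,2,90,example_type_a
2023,1,1,150,example_type_b
2023,2,1,170,example_type_b
"""


def summarize(handle):
    """Return {(year, census, type): (median estimate, number of runs)}."""
    groups = defaultdict(list)
    for row in csv.DictReader(handle):
        # census is kept as text because its exact coding is not assumed here.
        key = (int(row["year"]), row["census"], row["type"])
        groups[key].append(float(row["estimate"]))
    return {k: (statistics.median(v), len(v)) for k, v in groups.items()}


summary = summarize(io.StringIO(SAMPLE))
for key in sorted(summary):
    print(key, summary[key])
```

To run this against the real file, replace `io.StringIO(SAMPLE)` with an opened handle, for example `open("population_size_data.csv", newline="")`.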
Code/software
Any CSV reader
RStudio 2023.06.1
R language 4.2
species_ids.csv: To visually reconfirm the sturgeon species in our side-scan sonar detections, we deployed underwater camera rigs at three spawning locations that historically have high numbers of spawners and appropriate flow conditions to deploy and recover the camera rigs. These cameras enabled us to see whether the signals on the side-scan sonar were green or white sturgeon. The camera rigs used 25 lb. weights as a platform to mount two waterproof action cameras (GoPro, Inc.). We epoxied housing bases to the weights and attached two buoys on a 50 ft. line to the platform. We deployed the cameras four times with two cameras per deployment and a minimum soak time of 1 hour. A single researcher annotated fish detections in the video, and then two researchers independently classified each sighting as a green sturgeon, a white sturgeon, or unclear based on a set of criteria.
count_flow_analysis.zip: To examine whether flow affected the number of detected spawners, we checked for correlations between flow and both the DIDSON (2010-2021) and side-scan sonar (2020-2024) counts. We used the adjusted DIDSON numbers (see below) because the adjustment does not affect the correlation results but makes the two methods comparable in this analysis.
All analyses were conducted in R and used numerous packages (tidyverse, mcmcplots, rjags, patchwork, furrr, viridis, cder). We downloaded publicly available water gauge data from the California Data Exchange Center (www.cdec.water.ca.gov). The Verona gauge at river kilometer 128 is the most downstream Sacramento River gauge with minimal tidal influence, and is thus representative of the general river conditions that green sturgeon encounter as they begin their upriver migration.
We only considered flow data between March 1st and May 30th to limit the analysis to the time surrounding the bulk of the spawner migration. We averaged the flow data over 1 to 30 day windows and thus produced a total of 2,700 flow values (i.e., flows averaged over a 1 to 30 day window, with the window starting between March 1st and May 30th). We then constructed a Bayesian linear model between each of these 2,700 flow values and the number of spawners detected for both the DIDSON and side-scan sonar surveys. We checked for correlation using the Bayesian R². These 2,700 models used 3 chains with 4000 adaptation steps, 4000 burn-in steps, a thinning level of 12, and 1000 saved samples.
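The windowed flow averages described above can be sketched in Python as follows. This is an illustrative sketch, not the project's R code: the function name `window_means`, the synthetic flow series, and the assumption that each window runs forward from its start day are all assumptions made here.

```python
from datetime import date, timedelta


def window_means(daily_flow, start_days, max_window=30):
    """Map (start_day, window_length) -> mean flow over that window.

    daily_flow: dict of date -> flow value; it must cover every day needed.
    Windows are assumed to run forward from start_day (an assumption).
    """
    out = {}
    for start in start_days:
        total = 0.0
        for length in range(1, max_window + 1):
            # Extend the running sum by one day, then average over the window.
            total += daily_flow[start + timedelta(days=length - 1)]
            out[(start, length)] = total / length
    return out


# Synthetic example: flow rises by 1 unit per day from 1 March.
first = date(2020, 3, 1)
flow = {first + timedelta(days=i): 100.0 + i for i in range(150)}
starts = [first + timedelta(days=i) for i in range(5)]
means = window_means(flow, starts, max_window=3)
print(len(means), means[(first, 3)])
```

Each (start day, window length) pair yields one candidate flow index, so the number of indices is the number of start days multiplied by the number of window lengths.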
To further differentiate whether a potential relationship with flow was due to a detectability/gear issue or an actual effect on fish behavior, we looked at the relationship between the same flow index and telemetry data. We took all the publicly available telemetry data from the Biotelemetry Autonomous and Realtime Database (now the Pacific Aquatic Telemetry Hub (PATH)) from 2006 to 2018. We filtered the available data to obtain detections that were in the spawning ground (above river kilometer 322) and that occurred around the spawning window (March-October). We then took any year in which a fish was tagged or returned to spawn and marked it as a year zero. We used our previously calculated distribution of spawning intervals to calculate the probability of each individual spawning in each of the years following its year zero. Summing the individual fish probabilities of spawning each year gave us an expectation value of tagged spawners for each year. In other words, we calculated the expected number of tagged fish returning to spawn that year. We assumed a 10-year tag life, which is the most common tag lifespan for tracking this species. We then took the number of adults detected spawning each year and divided it by that year’s expectation value. We called this quantity the “Fish Return Index”. We then checked whether this index was correlated with the same flow index we used for the DIDSON and side-scan sonar data.
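A simplified Python sketch of the Fish Return Index calculation follows. It assumes one year zero per fish and a hypothetical spawning-interval distribution (`interval_pmf`); the real analysis uses the previously calculated interval distribution and resets year zero whenever a fish returns to spawn, which this sketch does not reproduce.

```python
def expected_spawners(year_zeros, interval_pmf, tag_life=10):
    """Expected number of tagged spawners per calendar year.

    year_zeros: one year zero per fish (a simplification of the method).
    interval_pmf: {years since year zero: probability of spawning that year}.
    """
    expected = {}
    for y0 in year_zeros:
        for lag, prob in interval_pmf.items():
            if 0 < lag <= tag_life:  # ignore years beyond the assumed tag life
                expected[y0 + lag] = expected.get(y0 + lag, 0.0) + prob
    return expected


def fish_return_index(detected, expected):
    """Detected spawners divided by the expected number, per year."""
    return {yr: detected[yr] / expected[yr]
            for yr in detected if expected.get(yr, 0.0) > 0}


# Hypothetical inputs, for illustration only.
interval_pmf = {2: 0.25, 3: 0.5, 4: 0.25}
year_zeros = [2010, 2010, 2011]
expected = expected_spawners(year_zeros, interval_pmf)
index = fish_return_index({2013: 1, 2014: 2}, expected)
print(expected, index)
```

An index above 1 means more tagged fish were detected on the spawning ground than expected that year, and an index below 1 means fewer.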
population_size_data.csv: Previous publications have reported on and described the DIDSON based green sturgeon survey methods. We will not cover those here, but instead detail the new methods using the side-scan sonar. The survey takes place over four to five consecutive days in the month of May, June, or July.
Over the sequential survey days, we scanned all potential green sturgeon spawning pools (> 5 m deep) within the spawning grounds, making three passes with a transom-mounted side-scan sonar (Humminbird brand) operating at 1.2 MHz. This unit can cover the extent of the spawning pools in a single pass; however, it has an 8° blind spot directly below the transducer. We conducted passes in a downstream direction at a speed between 7.4 and 14.8 km/h (4 to 8 knots) using a 6 m aluminum jet boat suitable for shallow rivers. If evidence of more than five fish existed, we conducted two additional passes to provide better statistical results. As sturgeon are known to congregate less frequently over boulder or bedrock substrate, and it would be difficult to differentiate them from the background with side-scan sonar, we only scanned these pools once to confirm unsuitable substrate (n = 3). In 2023 and 2024 we conducted a second survey, separated from the first by 4 weeks, so that future analyses can attempt to account for variations in how detectability, environmental conditions, and spawn timing interact (e.g., sturgeon may have left, not yet arrived, or be transiting to a spawning pool). The images from each pass were stitched together into georeferenced images using SonarTRX and AutoIT, and fish were manually counted in QGIS. This resulted in data for each pool with repeated counts (one count per pass).
We used an N-mixture model in a Bayesian framework, which fitted the data to a negative binomial distribution and used a beta distribution to estimate detectability:
α ~ normal(2, 1)
β ~ normal(2, 1)
λ ~ uniform(1, 300)
δ ~ uniform(0.1, 20)
B_δ = 1/δ
B_μ = B_δ / (B_δ + λ)
N_i ~ nbinomial(B_μ, B_δ)
p_(i,j) ~ beta(α, β)
n_(i,j) ~ binomial(p_(i,j), N_i)
Here α and β are the two parameters of the beta distribution for the detectability per pass p_(i,j); λ and δ are parameters used in an alternative parametrization of the negative binomial distribution in JAGS; B_μ and B_δ (the dispersion parameter) are the parameters of the negative binomial for the number of sturgeon in each pool N_i; and the number of sturgeon detected on each pass n_(i,j) follows a binomial distribution with p_(i,j) as the probability and N_i as the size.
We ran three chains with 5000 adaptation and burn-in steps, and kept 1000 samples with a thinning of 250. We checked convergence graphically and checked performance with a graphical posterior predictive check. We also graphically ensured that the posterior distributions were not simply conforming to the priors. To avoid a potentially flat solving surface and confounded parameters for some years, we used the time-for-space method, where all the data across all years were analyzed in the same model, with the same site in different years treated as different sites.
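To make the parametrization above concrete, the following Python sketch converts λ and δ to the two negative binomial parameters, checks the implied mean and variance, and simulates repeated counts for one pool. It is a forward simulation only, not the fitted JAGS model. The fixed values for λ, δ, α, and β are hypothetical, and the gamma-Poisson mixture is simply a standard way to draw a negative binomial variate with the standard library. With success probability B_μ and size B_δ, the mean is λ and the variance is λ + δλ².

```python
import math
import random


def nb_params(lam, delta):
    """Map (lam, delta) to (B_mu, B_delta) as in the model statement."""
    b_delta = 1.0 / delta
    b_mu = b_delta / (b_delta + lam)
    return b_mu, b_delta


def nb_moments(b_mu, b_delta):
    """Mean and variance of a negative binomial with success
    probability b_mu and size b_delta (counting failures)."""
    mean = b_delta * (1.0 - b_mu) / b_mu
    return mean, mean / b_mu


def draw_poisson(rng, mean):
    """Knuth's method; adequate only for moderate means (well below ~700)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_pool(rng, lam, delta, alpha, beta, n_passes):
    """Draw N_i, then one count per pass, each with its own detectability."""
    b_mu, b_delta = nb_params(lam, delta)
    # Negative binomial as a gamma-Poisson mixture:
    # gamma shape b_delta, gamma scale (1 - b_mu) / b_mu.
    rate = rng.gammavariate(b_delta, (1.0 - b_mu) / b_mu)
    n_true = draw_poisson(rng, rate)
    counts = []
    for _ in range(n_passes):
        p = rng.betavariate(alpha, beta)  # detectability for this pass
        counts.append(sum(rng.random() < p for _ in range(n_true)))
    return n_true, counts


# Hypothetical parameter values, for illustration only.
rng = random.Random(1)
mean, var = nb_moments(*nb_params(lam=20.0, delta=0.5))
n_true, counts = simulate_pool(rng, 20.0, 0.5, alpha=2.0, beta=2.0, n_passes=3)
print(mean, var, n_true, counts)
```

Each simulated pass count can never exceed the pool's true abundance, which is the structure the N-mixture model exploits when it separates abundance from detectability using repeated passes.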
For two years (2020 and 2021) we were able to run the side-scan sonar and DIDSON systems concurrently during the survey (the DIDSON broke and was unrepairable after 2021). From initial observations, we noted that the side-scan sonar’s wider field of view allowed it to observe more fish than the DIDSON. We checked the agreement between the DIDSON and side-scan sonar methods for those two years to see whether there was a consistent ratio between the two methods, which we could then use to adjust the previous DIDSON counts.
We then used telemetry data to estimate the fraction of the fish that entered the spawning grounds in that year which had already left or had not yet arrived within the spatial extent of our survey during the sampling event. We combined this value with our estimate of the number of observed sturgeon to get an estimate of total spawner abundance for that year.
Unfortunately, the telemetry database is based on scientists' voluntary self-reporting of tag data, so the complete detection data are often several years behind. Thus, a correction based on data from the current year (or recent past years) was not feasible at the time of analysis. In addition, several years had very low numbers of tagged fish detected in the river (as low as 9). These low numbers can result in large fluctuations in estimates if only a single fish enters or leaves the spawning grounds during the week of the survey. Finally, previous unpublished work has found little ability to predict the fraction of spawners still in the spawning grounds based on environmental covariates. We therefore used the average fraction of spawners present in the spawning grounds for each day of the year, rather than attempting a year-specific correction. We used the same telemetry data, again restricted to detections in the spawning ground (above river kilometer 322) during the spawning window (March to October). We binned the detections in windows of four days (the time it takes to do a complete survey). We then used the first and last day each tag was above river kilometer 322 as the time window during which the spawner was in the spawning ground. We divided the number of fish in the system in each 4-day bin by the total number of fish that entered the spawning grounds at any point that year. We took the fraction of fish still in the spawning ground each day and averaged it across years, weighting by the total number of spawners detected each year. Based on the time of year we conducted the survey, we then divided the number we detected in our survey by the fraction present to get the total number of spawners each year. For completeness, we ran these new spawner numbers through our previously published model to estimate the total number of adults and the total population.
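The fraction-present correction described above can be sketched in Python as follows. The residency windows, the day-of-year coding, and the function names are hypothetical. A tag is assumed to be present in a 4-day bin if its first-to-last-day window overlaps the bin, which is one reasonable reading of the method rather than a confirmed detail.

```python
def fraction_present(residency, bin_start, bin_days=4):
    """Fraction of one year's tagged spawners present during a bin.

    residency: list of (first_day, last_day) day-of-year pairs, one per tag.
    """
    bin_end = bin_start + bin_days - 1
    present = sum(1 for first, last in residency
                  if first <= bin_end and last >= bin_start)
    return present / len(residency)


def weighted_fraction(residency_by_year, bin_start, bin_days=4):
    """Average the yearly fractions, weighting by tagged spawners per year."""
    total = sum(len(r) for r in residency_by_year.values())
    weighted = sum(len(r) * fraction_present(r, bin_start, bin_days)
                   for r in residency_by_year.values())
    return weighted / total


def corrected_abundance(observed, fraction):
    """Scale the surveyed count up by the fraction of spawners present."""
    return observed / fraction


# Hypothetical residency windows (day of year), for illustration only.
residency_by_year = {
    2015: [(100, 160), (120, 200), (150, 210), (90, 130)],
    2016: [(110, 170), (130, 220)],
}
frac = weighted_fraction(residency_by_year, bin_start=135)
total_spawners = corrected_abundance(200, frac)
print(frac, total_spawners)
```

In this example two of four tags are present in 2015 and both tags are present in 2016, so the weighted fraction is 4/6 and a survey count of 200 scales up to about 300 spawners.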
