Visual and genetic stock identification of a test fishery to forecast Columbia River spring Chinook salmon stocks 2 weeks into the future
Data files
Mar 01, 2024 version files 47.88 MB
Abstract
Modern fisheries management strives to balance opposing goals of protection for weak stocks and opportunity for harvesting healthy stocks. Test fisheries can aid management of anadromous fishes if they can forecast the strength and timing of an annual run with adequate time to allow fisheries planning. Integration of genetic stock identification (GSI) can further maximize utility of test fisheries by resolving run forecasts into weak- and healthy-stock subcomponents. Using five years (2017 – 2022) of Test Fishery data, our study evaluated accuracy, resolution, and lead time of predictions for stock-specific run timing and abundance of Columbia River Spring Chinook Salmon (Oncorhynchus tshawytscha). We determined if this Test Fishery 1) could use visual stock identification (VSI) to forecast at the coarse stock resolution (i.e., classification of “lower” versus “upriver” stocks) upon which current management is based, and 2) could be enhanced with GSI to forecast at higher stock resolution. VSI accurately identified coarse stocks (83.3% GSI concordance) and estimated a proxy for abundance (catch per unit effort, CPUE) of the upriver stock in the Test Fishery that was correlated (R² = 0.90) with Spring Chinook Salmon abundance at Bonneville Dam (Rkm 235). Salmon travel rates (~8.6 Rkm/day) provided predictions with two-week lead time prior to dam passage. Importantly, GSI resolved this predictive ability as finely as the hatchery broodstock level. Lower river stock CPUE in the Test Fishery was correlated with abundance at Willamette Falls (Rkm 196, R² = 0.62) but could not be as finely resolved as achieved for upriver stocks. We described steps to combine VSI and GSI to provide timely in-season information with high stock resolution and a prediction accuracy of ~12.4% mean absolute percentage error to help plan Columbia River mainstem fisheries.
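As a rough illustration of the arithmetic behind the lead-time and accuracy figures in the abstract, the Python sketch below computes a travel time from a river distance and a travel rate, and a mean absolute percentage error (MAPE). Only the ~8.6 Rkm/day travel rate comes from the abstract. The function names, the example distance, and the forecast and observed values are hypothetical placeholders, not study data.

```python
def lead_time_days(distance_rkm, travel_rate_rkm_per_day):
    """Days needed to cover a river distance at a constant travel rate."""
    if travel_rate_rkm_per_day <= 0:
        raise ValueError("travel rate must be positive")
    return distance_rkm / travel_rate_rkm_per_day


def mape(forecast, observed):
    """Mean absolute percentage error, expressed in percent."""
    if len(forecast) != len(observed) or not forecast:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("observed values must be non-zero")
    errors = [abs(f - o) / abs(o) for f, o in zip(forecast, observed)]
    return 100.0 * sum(errors) / len(errors)


TRAVEL_RATE = 8.6  # Rkm/day, as reported in the abstract

# Hypothetical distance between a sampling site and a dam (not from the study).
example_distance_rkm = 120.4
print(lead_time_days(example_distance_rkm, TRAVEL_RATE))

# Hypothetical forecast and observed abundances (not study data).
print(mape([110.0, 90.0], [100.0, 100.0]))
```

At the reported travel rate, a two-week lead time corresponds to a distance of roughly 120 Rkm, which is why that value was chosen for the example.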
README: Visual and genetic stock identification of a test fishery to forecast Columbia River spring Chinook salmon stocks 2 weeks into the future
https://doi.org/10.5061/dryad.xwdbrv1md
This is a data file containing the individual metadata and genotypic data required to analyze the test fishery mixtures from 2017 - 2022 and the mixture data from the Adult Fish Facility at Bonneville Dam for the same time series. The genotypic data were used to perform parentage and genetic stock identification analyses, and the metadata are needed to recreate the weekly strata used in the abundance estimation.
Description of the data and file structure
This is an Excel file containing three tabs: "BONAFF_IndData", "TestFishery_IndData", and "Chinook sorted_genotypes". The "BONAFF_IndData" tab holds the individual metadata for Chinook salmon collected at Bonneville Dam. Its 18 field headings include "Order" (a unique number used to sort the individuals so that they correspond with the genotypes); "Rear" (H, HNC, and W, corresponding respectively to clipped hatchery fish, unclipped hatchery fish, and natural-origin fish as described in the methods); fork length in mm; adult size categories based on fork length as defined in the methods; statistical "Week"; hatchery broodstock assigned by PBT ("GenParentHatchery"); brood year ("BY"); expected GSI reporting group ("GenStock_exp"); observed GSI reporting group ("GenStock_obs"); genetic sex from the sex-marker genotype; "AdClip" (AD for adipose-clipped and AI for adipose-intact); physical tag ("PhysTag"), which indicates visual marks other than adipose clips; management "Period"; "Method", which contains the categories GSI, PBT, failed, or duplicate to indicate whether a fish was successfully genotyped (either GSI assigned or PBT assigned) or failed to genotype for >90% of all loci, and whether it represents a unique genotype; and "WeekNumber", the category of week within management period used to stratify the sample.
The "TestFishery_IndData" tab has field headings similar to those of the Bonneville sample but also includes "Visual Stock" (Lower or Upriver). "WeekNumber" in the Test Fishery data codes AD and AI fish and Lower and Upriver VSI fish into different categories of statistical weeks for stratifying the sample. "NA" indicates that the data are not applicable. It is used when a sample failed to genotype (shown in "Method"), which excludes the individual from analysis, or when an individual was unassigned with PBT and the information that would normally be available for a hatchery fish (e.g., brood year "BY") is not available.
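The following Python sketch shows one way the metadata fields described above could be used to rebuild sample counts per stratum. It assumes the rows of a metadata tab have already been read into a list of dictionaries keyed by field heading. Excluding "duplicate" rows alongside "failed" rows is an assumption made for this example, and the "Period" and "WeekNumber" values shown are invented.

```python
from collections import Counter

# Assumption: rows marked "failed" or "duplicate" are left out of the strata.
EXCLUDED_METHODS = {"failed", "duplicate"}


def usable_records(records):
    """Keep rows that were genotyped successfully and are unique genotypes."""
    return [
        r for r in records
        if str(r["Method"]).strip().lower() not in EXCLUDED_METHODS
    ]


def counts_by_stratum(records):
    """Count usable individuals per (Period, WeekNumber) stratum."""
    return Counter((r["Period"], r["WeekNumber"]) for r in usable_records(records))


# Hypothetical rows; the real values of "Period" and "WeekNumber" may differ.
example_rows = [
    {"Order": 1, "Method": "GSI", "Period": "A", "WeekNumber": 1},
    {"Order": 2, "Method": "PBT", "Period": "A", "WeekNumber": 1},
    {"Order": 3, "Method": "failed", "Period": "A", "WeekNumber": 1},
    {"Order": 4, "Method": "duplicate", "Period": "A", "WeekNumber": 2},
    {"Order": 5, "Method": "GSI", "Period": "A", "WeekNumber": 2},
]

for stratum, n in sorted(counts_by_stratum(example_rows).items()):
    print(stratum, n)
```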
The "Chinook sorted_genotypes" tab has individual genotypes for 343 genetic markers (bi-allelic SNPs) in rows of data corresponding to each unique individual listed in the metadata tabs. "Order", "CollectionName", and "Individual Name" are the same as those listed in the metadata tabs. Each SNP locus has two allele columns, identified by the suffixes "-A1" and "-A2". These data can be analyzed using the methods in the publication.
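As an illustration of the two-column allele layout, the sketch below collapses the "-A1" and "-A2" columns of one genotype row into a per-locus genotype and reports the fraction of loci with missing data. The missing-data codes and the locus names are assumptions made for this example; check the file for the codes it actually uses.

```python
# Assumption: these values mark a missing allele call in the spreadsheet.
MISSING_CODES = {"", "0", "NA", None}

ID_COLUMNS = ("Order", "CollectionName", "Individual Name")


def pair_alleles(row):
    """Map each locus to an (allele1, allele2) tuple, or None if missing."""
    genotypes = {}
    for column, allele_1 in row.items():
        if column in ID_COLUMNS or not column.endswith("-A1"):
            continue
        locus = column[: -len("-A1")]
        allele_2 = row.get(locus + "-A2")
        if allele_1 in MISSING_CODES or allele_2 in MISSING_CODES:
            genotypes[locus] = None
        else:
            genotypes[locus] = (allele_1, allele_2)
    return genotypes


def missing_fraction(genotypes):
    """Fraction of loci without a genotype call."""
    if not genotypes:
        raise ValueError("no loci found")
    return sum(g is None for g in genotypes.values()) / len(genotypes)


# Hypothetical row with two invented locus names.
example_row = {
    "Order": 1, "CollectionName": "example", "Individual Name": "fish_1",
    "locus_1-A1": "A", "locus_1-A2": "G",
    "locus_2-A1": "0", "locus_2-A2": "0",
}
calls = pair_alleles(example_row)
print(calls, missing_fraction(calls))
```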
Methods
Tissue samples were dried on Whatman filter paper, and DNA was extracted using the methods described by Hess et al. (2013) before applying genotyping-in-thousands by sequencing (GT-seq) custom amplicon protocols (Campbell et al. 2015) on an Illumina sequencer. The primers for all GT-seq loci were published previously and are publicly available (Janowitz-Koch et al. 2019). Genotypes of all individuals were organized using the R package EFGLmh (https://github.com/delomast/EFGLmh/) to create the input formats required by all analytical programs used in this study. For genetic stock identification (GSI), a baseline of 61 reference collections classified into 19 reporting groups was used with the R package rubias (https://github.com/eriqande/rubias) to assign each individual to its most likely reporting group of origin (observed genetic stock, “GenStock_obs”), without a minimum threshold for assignment probabilities. The baseline, “Columbia River Basin Chinook Salmon GSI baseline version 3.1”, is archived on FishGen (http://www.fishgen.net/). It was formatted for rubias and pared down to a set of 176 SNP loci known to have high genotyping success. The baseline consisted of 7,081 fish across 61 reference collections and 19 reporting groups that had <10% missing data for this set of loci. This GSI baseline has been shown to provide an average of 85% correct assignment to the 19 GSI reporting groups based on leave-one-out analysis (Hasselman et al. 2017; Table S3).
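The idea of assigning a fish to its most likely reporting group without a probability threshold can be sketched in a few lines of Python. This is not the rubias interface. It assumes that per-collection posterior probabilities for one fish are already available (for example, exported from a GSI program), and the collection names, group names, and probabilities below are invented.

```python
from collections import defaultdict


def assign_reporting_group(collection_probs, collection_to_group):
    """Sum collection probabilities within each reporting group and return
    the group with the highest total, with no minimum threshold applied."""
    if not collection_probs:
        raise ValueError("no probabilities supplied")
    group_probs = defaultdict(float)
    for collection, prob in collection_probs.items():
        group_probs[collection_to_group[collection]] += prob
    best_group = max(group_probs, key=group_probs.get)
    return best_group, group_probs[best_group]


# Hypothetical baseline structure: three collections in two reporting groups.
collection_to_group = {
    "collection_a": "group_1",
    "collection_b": "group_1",
    "collection_c": "group_2",
}

# Hypothetical posterior probabilities for a single fish.
fish_probs = {"collection_a": 0.30, "collection_b": 0.25, "collection_c": 0.45}

group, prob = assign_reporting_group(fish_probs, collection_to_group)
print(group, round(prob, 2))
```

In this example the single most likely collection belongs to group_2, yet the fish is assigned to group_1 because probabilities are summed over collections within a group before the maximum is taken.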
Parentage-based tagging (PBT) was performed with several baselines covering different spawn-year ranges, matched to the different years of Test Fishery and Bonneville Dam mixture sample data. PBT assignments of offspring to parent pairs (trios) were performed using the program SNPPIT (Anderson 2012), and the threshold for confident assignments was set to a log of odds (LOD) ≥ 14, which has been shown to minimize false positives and false negatives and achieve high concordance with hatchery records (Hess et al. 2016). Per-locus genotyping error rates were assumed to be 0.5%, which is considered conservative given the observed average error rate of 0.2% in our lab. Different baselines were required because the SNP marker panel changed over time (92 SNPs in legacy baselines dating back to SY2008, and 254 SNPs available since SY2012 for a portion of collections and standard for all collections by SY2015).
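The LOD ≥ 14 rule can be illustrated with a short Python sketch that keeps the best-scoring candidate trio for each offspring and accepts it only if it meets the threshold. The record layout (keys "offspring", "hatchery", "lod") is hypothetical and does not reflect the SNPPIT output format. The example scores are invented.

```python
LOD_THRESHOLD = 14.0  # threshold for confident assignments, from the Methods


def classify_pbt(candidates, lod_threshold=LOD_THRESHOLD):
    """Return {offspring: hatchery} for confident assignments and
    {offspring: None} when the best trio falls below the threshold."""
    best = {}
    for cand in candidates:
        current = best.get(cand["offspring"])
        if current is None or cand["lod"] > current["lod"]:
            best[cand["offspring"]] = cand
    return {
        offspring: (cand["hatchery"] if cand["lod"] >= lod_threshold else None)
        for offspring, cand in best.items()
    }


# Hypothetical candidate trios (invented names and scores).
example_candidates = [
    {"offspring": "fish_1", "hatchery": "hatchery_x", "lod": 20.1},
    {"offspring": "fish_1", "hatchery": "hatchery_y", "lod": 5.3},
    {"offspring": "fish_2", "hatchery": "hatchery_x", "lod": 13.9},
    {"offspring": "fish_3", "hatchery": "hatchery_z", "lod": 14.0},
]

print(classify_pbt(example_candidates))
```

Fish left unassigned here would fall back to GSI in a workflow like the one described above, which is consistent with the "Method" field distinguishing PBT-assigned from GSI-assigned individuals.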
Works Cited
Anderson, E. C. (2012). Large-scale parentage inference with SNPs: an efficient algorithm for statistical confidence of parent pair allocations. Statistical Applications in Genetics and Molecular Biology, 11(5).
Campbell, N. R., Harmon, S. A., & Narum, S. R. (2015). Genotyping‐in‐Thousands by sequencing (GT‐seq): A cost effective SNP genotyping method based on custom amplicon sequencing. Molecular Ecology Resources, 15(4), 855-867.
Hasselman, D. J., Harmon, S. A., Matala, A. R., Matala, A. P., Micheletti, S. J., & Narum, S. R. (2017). Genetic assessment of Columbia River stocks, 4/1/2016–3/1/2017. Columbia River Inter-Tribal Fish Commission, Annual Report to the Bonneville Power Administration, Project 2008-907-00, Hagerman, Idaho.
Hess, J. E., Campbell, N. R., Close, D. A., Docker, M. F., & Narum, S. R. (2013). Population genomics of Pacific lamprey: adaptive variation in a highly dispersive species. Molecular Ecology, 22, 2898-2916.
Hess, J. E., Ackerman, M. W., Fryer, J. K., Hasselman, D. J., Steele, C. A., Stephenson, J. J., ... & Narum, S. R. (2016). Differential adult migration-timing and stock-specific abundance of steelhead in mixed stock assemblages. ICES Journal of Marine Science, 73(10), 2606-2615.
Janowitz‐Koch, I., Rabe, C., Kinzer, R., Nelson, D., Hess, M. A., & Narum, S. R. (2019). Long‐term evaluation of fitness and demographic effects of a Chinook Salmon supplementation program. Evolutionary Applications, 12(3), 456-469.