Reproductive contribution of lake sturgeon transferred upstream of dams on a Great Lakes tributary
Data files
Dec 31, 2024 version files (132.57 MB total)
Menominee_pedigree.zip (132.55 MB)
README.md (15.66 KB)
Abstract
Dam construction contributes to declines in the abundance and distribution of many fishes. Increasing connectivity through adult transfer can be demographically/genetically beneficial but assessing the effects resulting from transfer can be difficult if resident fish exist upstream. Genotypes of adult/larval lake sturgeon (Acipenser fulvescens) were used to quantify contributions to larval recruitment from adults transferred upstream of dams on the Menominee River, USA. We evaluated whether transfer timing, sex, and adult size were associated with the odds of reproduction. Elevator transfer operations in Fall 2019/2020, and Spring 2021 resulted in 152 male and 81 female lake sturgeon transferred upstream. In 2020 and 2021, 580 and 518 larvae were genotyped. 86% (201/233) of adults reproduced and 62.3% (684/1098) of offspring had transferred parents. 392 resident adults contributed to offspring production. Mixed matings accounted for 53% of offspring genotyped, increasing levels of offspring genetic diversity relative to offspring produced from resident-only matings. Transferring adults may be a viable restoration alternative for other iteroparous fish in river systems where connectivity to spawning areas has been impeded.
README
README for: Forsythe et al. 2024. Reproductive contribution of lake sturgeon transferred upstream of dams on a Great Lakes tributary. Canadian Journal of Fisheries and Aquatic Sciences
This repository contains the data and code used in the above study, which investigated the effects of transferring adult lake sturgeon upstream of dams on the Menominee River, USA.
Overview of the Study
This study investigated the effects of transferring adult lake sturgeon upstream of dams on the Menominee River, USA, to mitigate the negative impacts of dam construction on fish populations. The study aimed to determine if transferring adult lake sturgeon upstream of dams can increase connectivity and improve the genetic diversity of the fish population. The researchers used genotypes of adult and larval lake sturgeon to infer the contributions of transferred adults to larval recruitment. The study also examined the effects of transfer timing, sex, and adult size on reproductive success. The study found that the majority of transferred adults reproduced and contributed significantly to the next generation of lake sturgeon. The results suggest that transferring adults may be an effective restoration strategy for other iteroparous fish species in river systems where dam construction has disrupted connectivity to spawning areas.
Description of the Folders
Each subfolder (analysis) contains an "Input" folder with all the data required for the R scripts in that directory.
All R scripts include a detailed "About" section describing their purpose. All figures and analytical output were written to the subfolder's associated "Output" folder.
Software requirements for the analyses described below: R (version 4.1.2 or later) with the R packages tidyverse, readxl, and stringr.
Subfolder 1) Actual Data Analysis: This folder contains the data and scripts used to analyze the biological characteristics of the lake sturgeon population and assess reproductive success.
R scripts:
fitness evaluation both years.R: This script evaluates the fitness of the lake sturgeon population over both years of the study, considering factors associated with reproductive success.
Mate pair randomization tests and stats with NoS19.R: This script conducts mate pair randomization tests and calculates statistics related to mate pair types.
Input folder file:
ALL ADULT DATA sex and biodata.xlsx: This file contains biological data for adult lake sturgeon, including:
sample.id: Unique identifier for each individual
sex: M = male, F = female
collection: Date of specimen collection (MM/DD/YY)
release: Date of specimen release upstream (MM/DD/YY)
reintro.year: simplified cohort description to connect to offspring collected in subsequent year
rs.20: To be filled with counts of offspring assignments in 2020
rs.21: To be filled with counts of offspring assignments in 2021
ms.20: To be filled with counts of mates in 2020, based on reconstructed pedigree
ms.21: To be filled with counts of mates in 2021, based on reconstructed pedigree
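tl (mm): Total length. Values in this input file are in inches despite the "(mm)" label; tl is converted to millimeters in the output file "All adults bio data.txt".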
girth (mm): Thorax girth in inches
season: Season in which the specimen was transferred above the dam
Output folder files:
All adults bio data.txt: This file contains the same biological data as "ALL ADULT DATA sex and biodata.xlsx" but in a plain text format. The "rs" and "ms" columns are filled in with non-NA integers that represent the counts of offspring and mates assigned to each parent in each cohort (2020 and 2021). In addition, tl has been converted to millimeters. suc.20 and suc.21 were used in the logistic regression analysis to indicate whether a given adult successfully spawned, based on whether it was assigned at least one offspring (1) or not (0).
relyear and season2 are columns describing the year in which the fish was transferred and the season and year of transfer (e.g., S19 - Spring 19). The information in both relyear and season2 is represented in other columns as well.
mp randomization tests 2019.txt: This file contains the results of mate pair (mp) randomization tests conducted for the 2020 pedigree. These tests assessed whether the observed counts of mate pair types differed significantly from a random distribution.
mp randomization tests 2021 NoS19.txt: This file contains the results of mate pair randomization tests conducted for the 2021 reconstructed pedigree.
mate pair breakdown both years NoS19.tiff: This file shows a breakdown of counts of offspring for each mate pair type (resident-resident, resident-migratory, migratory-migratory) for both years of the study, along with statistical significance based on the randomization procedure.
repeat spawners.txt: This file lists the lake sturgeon that spawned multiple times during the study, along with their spawning years and other relevant information.
reproductive success table.txt: This file contains summary statistics describing the reproductive success of the lake sturgeon population.
successful spawning table.txt: This file provides details on successful spawning events, similar to the reproductive success table described above.
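The derived columns in "All adults bio data.txt" follow simple rules that can be illustrated in a few lines. The following is a minimal Python sketch, not the authors' R code. It assumes, per the column descriptions above, that tl is recorded in inches in the input file; the key tl_in, the output key tl_mm, and the sample records are hypothetical.

```python
INCHES_TO_MM = 25.4  # exact conversion factor

def add_derived_columns(adult):
    """Return a copy of one adult record with tl in mm and 0/1 success flags."""
    out = dict(adult)
    # Assumption: the input length is in inches (hypothetical key "tl_in").
    out["tl_mm"] = round(adult["tl_in"] * INCHES_TO_MM, 1)
    for year in ("20", "21"):
        rs = adult["rs." + year]  # count of offspring assigned in that cohort
        # suc.* is 1 if the adult was assigned at least one offspring, else 0.
        out["suc." + year] = 1 if rs >= 1 else 0
    return out

# Hypothetical records for illustration only (not real data).
adults = [
    {"sample.id": "A001", "tl_in": 50.0, "rs.20": 3, "rs.21": 0},
    {"sample.id": "A002", "tl_in": 62.5, "rs.20": 0, "rs.21": 0},
]
derived = [add_derived_columns(a) for a in adults]
```

In this sketch the first adult gets suc.20 = 1 and suc.21 = 0, while the second adult gets 0 in both years.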
Subfolder 2) Relatedness Randomization: This folder contains the data and scripts used to randomize the relatedness of the lake sturgeon population and evaluate the impact of adult transfers on offspring genetic diversity.
R scripts:
relatedness simulations 2020.R: This script conducts the relatedness randomization tests for the lake sturgeon population in 2020.
relatedness simulations 2021.R: This script conducts the relatedness randomization tests for the lake sturgeon population in 2021.
cleaning up relatedness tables.R: This script organizes and prepares the relatedness tables for analysis.
Input folder files:
F19Adults_2020Kids2.BestConfig.txt and NoS19V1.BestConfig.txt: These files are Best Configuration pedigree reconstructions inferred by Colony, which infer which offspring were produced by which adults transferred above the dam, as well as which were produced by resident lake sturgeon above the dam.
2020 Offspring disomic loci.xlsx: This file contains disomic loci data (microsatellite markers) for the 2020 offspring, which were used to estimate rxy for all possible offspring dyads in 2020. Each row represents an individual, and each column represents an allele for a specific locus.
2021 Offspring disomic loci.xlsx: This file contains disomic loci data (microsatellite markers) for the 2021 offspring, which were used to estimate rxy for all possible offspring dyads in 2021. Data are structured the same as the 2020 offspring file.
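To make the link between a reconstructed pedigree and the mate pair types concrete, here is a minimal Python sketch (not the authors' R code). It assumes the pedigree has already been read into (offspring, father, mother) tuples and that any parent ID not in the set of transferred adults is a resident. All IDs and the function names are hypothetical.

```python
from collections import Counter

PAIR_TYPES = ("resident-resident", "resident-migratory", "migratory-migratory")

def mate_pair_type(father, mother, transferred):
    """Classify a mate pair by how many of its parents were transferred."""
    n_transferred = (father in transferred) + (mother in transferred)
    return PAIR_TYPES[n_transferred]

def offspring_by_pair_type(pedigree, transferred):
    """Count offspring per mate pair type from (offspring, father, mother) rows."""
    counts = Counter()
    for _offspring, father, mother in pedigree:
        counts[mate_pair_type(father, mother, transferred)] += 1
    return counts

# Hypothetical IDs for illustration only (not real data).
transferred_ids = {"T01", "T02"}
pedigree_rows = [
    ("off1", "T01", "R1"),
    ("off2", "T01", "T02"),
    ("off3", "R2", "R1"),
    ("off4", "R2", "T02"),
]
pair_counts = offspring_by_pair_type(pedigree_rows, transferred_ids)
```

Here two offspring come from mixed (resident-migratory) matings and one each from the other two types.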
Output folder files:
2020 offspring pairwise relatedness values.txt: This file contains pairwise relatedness values for the 2020 offspring, which were inferred based on disomic loci in the program Coancestry using the trio-ML method.
2021 offspring pairwise relatedness values.txt: This file contains pairwise relatedness values for the 2021 offspring.
output summary 2020.txt: This file summarizes the 2020 simulation output, which tested whether the ratio of the median rxy between two groups differed from that expected by chance alone. See supplemental file for more details.
output summary 2020 ordered.txt: This file summarizes the 2020 simulation output, ordered by the pairs of parental categories being compared.
output summary 2021.txt: This file summarizes the 2021 simulation output, similar to the 2020 summary file.
output summary 2021 ordered.txt: This file summarizes the 2021 simulation output, ordered by the pairs of parental categories being compared.
Randomization tables.xlsx: This file contains the randomization tables used in the study to summarize median ratios and statistical significance for both cohorts of offspring produced.
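The general idea of a randomization test on a ratio of medians can be sketched as follows. This is an illustrative Python example under stated assumptions, not the procedure in the R scripts: it shuffles group labels across pooled rxy values and reports the one-sided proportion of shuffled ratios at or below the observed ratio. The function names, the number of permutations, and the rxy values are all hypothetical, and the authors' randomization scheme may differ (see the supplemental file).

```python
import random
from statistics import median

def median_ratio(group_a, group_b):
    """Ratio of median rxy in group_a to median rxy in group_b."""
    # Note: raises ZeroDivisionError if the median of group_b is 0.
    return median(group_a) / median(group_b)

def permutation_test(group_a, group_b, n_perm=2000, seed=1):
    """Return (observed ratio, one-sided p-value) from label shuffling."""
    rng = random.Random(seed)
    observed = median_ratio(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign values to the two groups at random
        if median_ratio(pooled[:n_a], pooled[n_a:]) <= observed:
            hits += 1
    return observed, hits / n_perm

# Hypothetical rxy values for illustration only (not real data).
rxy_mixed = [0.02, 0.05, 0.04, 0.03, 0.06]
rxy_resident = [0.20, 0.25, 0.22, 0.30, 0.28]
observed, p_value = permutation_test(rxy_mixed, rxy_resident)
```

A ratio well below 1 with a small p-value would indicate that dyads in the first group are less related than expected by chance under this shuffling scheme.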
Subfolder 3) Simulating Colony Dat Files: This folder contains the data and scripts used to simulate colony dat files, which are used for pedigree reconstructions.
R script:
colony inputPar simulations RESrandDRAW.R: This script simulates the colony input parameters for the RESrandDRAW simulation, which is used to reconstruct pedigrees using genotypic information.
Input folder files:
2020 Menominee R Af genotyping data 121721_FINAL.xlsx: This file contains genotyping data for both disomic and polysomic microsatellite markers for the Menominee River adult lake sturgeon population. Each row represents an individual, and each column represents an allele for a specific locus.
Output folder files:
Scenarios representing when 25%, 50%, or 75% of transferred adults successfully reproduced. Read more about the simulations in the "About" part of the associated R script. Importantly, the Output folder has three subfolders associated with these three different simulated scenarios: sims_adults25successful, sims_adults50successful, and sims_adults75successful. Within each, there are three additional folders: breeding matrices, colony_dat, and input_par - these three folders contain output files, of the same name, for each simulation.
Also, a .txt file contains summary statistics associated with each simulation for each scenario - again, see the R script's ABOUT section (and below) for more details. The four remaining folders are empty and were used for copy/pasting the subfolder structure multiple times.
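As a rough illustration of what the three scenarios mean, the sketch below draws which transferred adults reproduce under each success probability. It is a minimal Python example, not the authors' R script: it assumes each adult succeeds independently with the scenario's probability, uses the adult counts from the abstract (81 females, 152 males) purely as placeholder sizes, and all IDs and function names are hypothetical. The "About" section of the R script describes the actual simulation.

```python
import random

def draw_successful(adult_ids, success_prob, rng):
    """Keep each adult independently with probability success_prob."""
    return [a for a in adult_ids if rng.random() < success_prob]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical IDs; the counts are borrowed from the abstract for illustration.
mothers = [f"F{i:03d}" for i in range(81)]
fathers = [f"M{i:03d}" for i in range(152)]

scenarios = {}
for prob in (0.25, 0.50, 0.75):
    scenarios[prob] = {
        "moms": draw_successful(mothers, prob, rng),
        "dads": draw_successful(fathers, prob, rng),
    }
```

Under this assumption the realized proportion of successful adults varies around the scenario value from one replicate to the next rather than matching it exactly.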
Subfolder 4) Simulation Analysis: This folder contains the data and scripts used to analyze the simulations, assess the accuracy of parentage assignments, and evaluate the performance of different simulation scenarios.
R scripts:
Step 1 - sim.analysis.Bestconfig.summaries.R: This script performs the first step in the simulation analysis, focusing on summarizing results from the "best configuration" pedigrees inferred by Colony for each simulation.
Step 2 - sim.analysis.adding.confit.output.summaries.R: This script integrates results from other configuration output summaries with all possible assignment rates between offspring, providing a more comprehensive view of the simulation results.
Step 3 - sim.analysis.graphics.pubs.R: This script generates the graphics (plots and visualizations) used in the publication.
Step 4 - expected.false.assignments.R: This script calculates or estimates the expected number of false parentage assignments in the simulations, helping to assess the accuracy of the parentage analysis.
Input folder files AND folders:
sim.info_mom0.25_dad0.25.txt: This file contains information about a simulation with a 25% successful spawning rate for both females and males. This information was used in the simulation analysis to compare the true number of transferred and resident adults that produced the offspring used in pedigree reconstruction to the number inferred by Colony. The columns are as follows:
n.par: The number of parents in the simulation.
n.mom: The number of mothers in the simulation.
n.dad: The number of fathers in the simulation.
sex.ratio: The sex ratio of the parents (dads/moms).
noff: The number of offspring in the breeding matrix.
female.rs: The mean number of offspring produced per female.
females.mates: The mean number of mates per female.
male.rs: The mean number of offspring produced per male.
male.mates: The mean number of mates per male.
mp.count: The total number of mate pairs.
mp.rs: The mean number of offspring produced per mate pair.
type: Whether the row describes the breeding matrix "Before" any subsampling happened or "After" the 580 offspring were sampled randomly.
lost: The number of successful mate pairs for which none of their offspring were sampled in the 580 collected.
sim: The simulation replicate number.
tot.mom: The total number of potential transferred mothers (including unsampled).
tot.dad: The total number of potential transferred fathers (including unsampled).
mom.prob: The probability that a transferred mother was successful at reproduction.
dad.prob: The probability that a transferred father was successful at reproduction.
unsampled: The number of unsampled parents after randomly sampling 580 offspring.
res.mom: The number of resident mothers in the original breeding matrix.
res.dad: The number of resident fathers in the original breeding matrix.
sim.info_mom0.5_dad0.5.txt: This file contains similar information to the previous file but for a simulation with a 50% successful spawning rate for both females and males.
sim.info_mom0.75_dad0.75.txt: This file contains similar information to the previous files but for a simulation with a 75% successful spawning rate for both females and males.
BestConfigs subfolder: This folder contains the "best configuration" pedigrees inferred by COLONY software for each simulated dataset in each simulation scenario.
Confit.summaries subfolder: This folder contains text files that summarize the confusion matrix associated with each simulation's BestConfiguration file inferred by COLONY. Each file gives the number of known (real.type) full-sibling (FS), half-sibling (HS), and unrelated (UR) offspring dyads. The rows of the real.type column represent the known FS, HS, and UR dyads, and the FS, HS, and UR columns hold the relationships inferred by COLONY. The counts are summed row-wise in the total column, and pFS, pHS, and pUR are the counts divided by the total column.
Output.summaries subfolder: This folder contains summary statistics comparing the best configuration pedigree inferred by COLONY (estimated, est) with what was known from the simulation information embedded in offspring names (real): the number of mate pairs (mp), the number of full-siblings (nFS), and the number of maternal and paternal half-siblings (nMHS and nPHS). It also reports the number of parents, as well as the mean and variance in reproductive success (counts of offspring assigned to each parent in the pedigree).
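The total, pFS, pHS, and pUR columns of the Confit.summaries files can be reproduced from the raw counts. The following Python sketch shows the row-wise arithmetic; it is an illustration rather than the authors' R code, the function name is hypothetical, and the counts are invented.

```python
def add_proportions(row):
    """Add the row total and pFS/pHS/pUR to one confusion-matrix row."""
    total = row["FS"] + row["HS"] + row["UR"]
    out = dict(row, total=total)
    for col in ("FS", "HS", "UR"):
        # Proportion of known dyads of this real.type inferred as `col`.
        out["p" + col] = row[col] / total if total else float("nan")
    return out

# Invented counts for illustration only (not real data).
confusion = [
    {"real.type": "FS", "FS": 90, "HS": 8, "UR": 2},
    {"real.type": "HS", "FS": 5, "HS": 80, "UR": 15},
    {"real.type": "UR", "FS": 0, "HS": 20, "UR": 980},
]
summary = [add_proportions(r) for r in confusion]
```

Reading along the diagonal (pFS for known FS, pHS for known HS, pUR for known UR) gives the proportion of dyads of each true type that were inferred correctly.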
Output folder files:
confusion.matrix.jpeg: This file shows a confusion matrix that visualizes the accuracy of parentage assignments in the simulations.
confusion.matrix.pub.tiff: This file contains the same confusion matrix as the JPEG version but in a higher-resolution format suitable for publication.
expected_migartory_parent_distributions_bothsexes.jpeg: This file shows the expected distributions of migratory parents (both sexes) in the simulations.
expected_migartory_parent_distributions_bothsexes.pub.tiff: This file contains the same distributions as the JPEG version but in a higher-resolution format suitable for publication.
inferred2known_ratio_bothsexes.jpeg: This file visualizes the ratio of inferred to known parents for both sexes in the simulations.
inferred2known_ratio_bothsexes.pub.tiff: This file contains the same visualization as the JPEG version but in a higher-resolution format suitable for publication.
inferred2known_relationship_bothsexes.jpeg: This file visualizes the relationship between inferred and known parents for both sexes.
inferred2knownresidents_dads.jpeg: This file visualizes the relationship between inferred and known resident fathers.
inferred2knownresidents_moms.jpeg: This file visualizes the relationship between inferred and known resident mothers.
inferred2knownresidents_parents.pub.tiff: This file contains a visualization of the relationship between inferred and known resident parents in a higher-resolution format suitable for publication.
simulation.bestconfig.info.txt: This file contains information specifically from the simulations using the "best configuration" pedigree and is effectively an intermediate file created before adding in the confit and output.summary information for each simulation.
simulation.all.info.txt: This file contains comprehensive information from all the simulations, including the number of offspring, the number of parents, the proportion of migratory parents, and the accuracy of parentage assignments.
Methods
Adult lake sturgeon for upstream transfer were collected from the section of the Menominee River below Menominee Dam and were released above Park Mill Dam into USF (Fig. 1). Collection of larval sturgeon for parentage assignment occurred in the upper portion of USF, just below Grand Rapids Dam. Standard D-frame drift nets were used to collect larval lake sturgeon dispersing from spawning grounds in 2020 and 2021 (Smith and King 2005; Tucker et al. 2021). Genomic DNA was extracted from fin clips of all adult and larval tissue using the DNeasy Blood and Tissue kit (QIAGEN, Germantown, MD). DNA samples were quantified using a NanoDrop 1000 spectrophotometer (Nanodrop Technologies, Wilmington, DE) and diluted to 20 ng/µl for use in PCR reactions. Individuals were genotyped at 13 disomic microsatellite loci as described in Hunter et al. (2020) and Scribner et al. (2022).