Blueback Herring Alosa aestivalis phenology: spawning run abundance, sex ratio, and female reproductive investment
Data files
Feb 10, 2026 version files 552.80 KB
-
both_cpue_dataset_for_table_6_prediction_2.csv
1.84 KB
-
dissection_data.csv
30.71 KB
-
entries_fig_5.csv
560 B
-
entries_figure_2.csv
2.02 KB
-
entries_figure_3.csv
1.72 KB
-
entries_figure_4A.csv
1.93 KB
-
entries_figure_4B.csv
1.59 KB
-
entries_for_table_1.csv
3.25 KB
-
entries_table_2_and_table_T3.csv
1.68 KB
-
entries_table_3.csv
882 B
-
entries_Table_4_and_Table_T6.csv
1.36 KB
-
entries_table_5_and_table_T8.csv
1.24 KB
-
entries_table_T2.csv
1.55 KB
-
entries_table_T4.csv
1.38 KB
-
entries_Table_T5.csv
741 B
-
entries_Table_T7.csv
757 B
-
entries_table_T9A.csv
1.25 KB
-
entries_table_T9B.csv
987 B
-
fem_cpue_dataset_for_table_6_prediction_1.csv
1.84 KB
-
fem_cpue2_dataset_for_table_6_prediction_3.csv
1.81 KB
-
femovary_cpue3_dataset_for_table_6_prediction_4.csv
1.80 KB
-
indiv_data.csv
368.96 KB
-
read_me_sas_macro_for_permutation_test.md
3.68 KB
-
README.md
44.88 KB
-
run_data.csv
14.09 KB
-
sample_size_entries_table_2_and_table_T3.csv
400 B
-
sample_size_entries_table_3.csv
224 B
-
sample_size_entries_table_T2.csv
115 B
-
sample_size_entries_table_T4.csv
112 B
-
SAS_code_Figure_2.md
1.37 KB
-
SAS_code_Figure_3.md
1.42 KB
-
SAS_code_Figure_4.md
2.05 KB
-
SAS_code_Figure_5.md
11.85 KB
-
SAS_code_for_Table_2_and_Table_T3.md
2.74 KB
-
SAS_code_for_Table_4_and_Table_T5_and_table_T6.MD
4.02 KB
-
SAS_code_for_table_6_prediction_1.md
4.08 KB
-
SAS_code_for_table_6_prediction_2.md
6.15 KB
-
SAS_code_for_table_6_prediction_3.md
4.39 KB
-
SAS_code_for_table_6_prediction_4.md
4.48 KB
-
SAS_code_for_table_T2.md
2.95 KB
-
SAS_code_for_table_T4.md
3.27 KB
-
SAS_code_for_table_T9.md
5.61 KB
-
SAS_code_table_1.md
1.12 KB
-
SAS_code_table_3.md
3.94 KB
Abstract
This dataset supports an investigation into the seasonality of Blueback Herring spawning in the Connecticut River. It includes data taken on fish specimens caught during 117 boat electrofishing surveys at four sites over four years. Each site visit yielded an estimate of catch per unit effort, a metric of abundance. The sex of individuals was determined in the field, enabling an estimate of female catch per unit effort. The field crew also assessed the reproductive status of females, enabling an estimate of the abundance of spawner females, i.e. those that were in reproductive condition. A subset of individuals collected on each date was euthanized and dissected, yielding data on ovary mass; these data enabled estimates of ovary tissue abundance. Seasonal change in each of these abundance estimates represents a depiction of the population’s reproductive phenology, its seasonal timing of spawning. These data provide a foundation for inquiries into the processes that influence variability in the fitness of offspring produced at different times in the season.
Dataset DOI: 10.5061/dryad.jdfn2z3qv
This dataset contains abundance estimates, sex ratio, length and weight data, reproductive stage, and ovary masses for Blueback Herring (Alosa aestivalis) collected at known spawning sites in the Connecticut River during the spring months of four years (2019, 2021, 2022, and 2023). The data support analyses of reproductive activity of this migratory fish. The datasets enable reproduction of all tables and figures in Schultz, E.T., Mouchlianitis, F.A., Sprankle, K., and Ganias, K. (2025). Counting eggs before they hatch: comparing metrics of spawning seasonality in Blueback Herring. Transactions of the American Fisheries Society 00, 1–12.
We have submitted three datasets (run_data.csv, indiv_data.csv, and dissection_data.csv), from which all figures and tables were derived. Other datasets comprise entries for figures and tables and are organized in subdirectories specific to each manuscript element. Code is provided for all analyses. In CSV files, missing values are represented as ‘.’ or are blank.
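Because missing values appear either as ‘.’ or as blank cells, a reader loading the CSV files outside SAS may want to normalize both markers. The following is a minimal sketch using only the Python standard library; the function name `read_rows` and the two-row sample are hypothetical, and only the column names are taken from the variable lists in this README.

```python
import csv
import io

MISSING = {"", "."}  # the two missing-value markers used in the CSV files


def read_rows(handle):
    """Read a CSV file object and map '.' or blank cells to None."""
    rows = []
    for row in csv.DictReader(handle):
        cleaned = {}
        for key, value in row.items():
            if key is None:
                continue  # ignore stray cells beyond the header width
            text = "" if value is None else value.strip()
            cleaned[key] = None if text in MISSING else text
        rows.append(cleaned)
    return rows


# Hypothetical sample; a real file would be opened with open(path, newline="").
sample = io.StringIO("year,Intercept,seIntercept\n2019,.,\n2021,1.5,0.2\n")
rows = read_rows(sample)
```

Values are left as strings so that entries such as `<.0001` in the p-value columns survive unchanged; numeric conversion can be applied per column afterwards.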
Description of the Data and File Structure
run_data.csv
This dataset contains information on catch during sampling events. Dataset run_data includes the collection date, the site (station), the electroshocking run number, the number of seconds of the electroshocking run (t_sec), and the number of Blueback Herring collected (n_bbh).
Variables:
- Date: Collection date in M/D/YYYY format (e.g., 5/13/2019 = May 13, 2019)
- Station: Sampling location as a three-letter code (CHI: Lower Chicopee River; FAR: Lower Farmington River; WES: Lower Westfield River; WET: Wethersfield Cove)
- Run: The number of the electroshock sample taken in that station on that date, in integer values
- t_sec: The elapsed electroshock time of the run, in seconds
- n_bbh: The number in integer format of Blueback Herring collected in the run
Sample size: n = 611 electroshocking samples
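Catch per unit effort can be derived from these columns as catch divided by electroshocking time. The sketch below is an illustration only: pooling all runs of one station visit and scaling to fish per hour are assumptions (this README does not state the unit or pooling used in the paper), the function name `cpue_by_visit` is hypothetical, and the three rows are invented values.

```python
from collections import defaultdict


def cpue_by_visit(runs, per_seconds=3600.0):
    """Pool the runs of each (Date, Station) visit and return the catch
    per `per_seconds` seconds of electroshocking (assumed unit)."""
    catch = defaultdict(int)
    effort = defaultdict(float)
    for run in runs:
        key = (run["Date"], run["Station"])
        catch[key] += run["n_bbh"]
        effort[key] += run["t_sec"]
    # Visits with zero recorded effort are skipped to avoid division by zero.
    return {key: catch[key] / effort[key] * per_seconds
            for key in catch if effort[key] > 0}


# Invented example rows that follow the run_data.csv column layout.
runs = [
    {"Date": "5/13/2019", "Station": "WET", "Run": 1, "t_sec": 300, "n_bbh": 10},
    {"Date": "5/13/2019", "Station": "WET", "Run": 2, "t_sec": 300, "n_bbh": 20},
    {"Date": "5/13/2019", "Station": "FAR", "Run": 1, "t_sec": 600, "n_bbh": 5},
]
cpue = cpue_by_visit(runs)
```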
indiv_data.csv
This dataset comprises information taken in the field on individual Blueback Herring collected via electroshocking.
Variables:
- indiv_id: Code identifying each individual in format IDmmddyyxxxzzz, in which mmddyy represents the date of collection, xxx represents the station, and zzz is the number of the individual in that sample
- Date: Collection date in DD-mmm-YY format (e.g., 17-apr-19 = April 17, 2019)
- Station: Sampling station code, recorded as a three-letter uppercase abbreviation (e.g., FAR, WET).
- TL: Total length of the individual, recorded as an integer value in millimeters.
- FL: Fork length of the individual, recorded as an integer value in millimeters.
- Weight: Body weight of the individual, recorded as an integer value in grams.
- Sex: Sex of the individual, recorded as a single lowercase letter code (m = male, f = female).
- Condition: Condition or maturity status, recorded as a lowercase code of one or two letters (g: gravid; r: ripe/running; ps: partially spent; s: spent).
- Sacrificed: Indicator of whether the individual was sacrificed during sampling, recorded as a single uppercase letter (N: no; Y: yes).
Sample size: n = 7912 individuals
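The identifier format IDmmddyyxxxzzz can be split back into its parts, which is useful for checking that the date and station encoded in the identifier agree with the Date and Station columns. This is a minimal sketch: the identifier `ID041719FAR012` is a hypothetical example, the function name is an assumption, and the pattern accepts any number of digits for the individual number because the README does not state a fixed width.

```python
import re
from datetime import datetime

# "ID" + mmddyy + three-letter station + running number of the individual
ID_PATTERN = re.compile(r"^ID(\d{2})(\d{2})(\d{2})([A-Za-z]{3})(\d+)$")


def parse_indiv_id(indiv_id):
    """Split an identifier into collection date, station, and number."""
    match = ID_PATTERN.match(indiv_id.strip())
    if match is None:
        raise ValueError(f"unexpected identifier format: {indiv_id!r}")
    mm, dd, yy, station, number = match.groups()
    collected = datetime.strptime(mm + dd + yy, "%m%d%y").date()
    return {"date": collected, "station": station.upper(), "number": int(number)}


parsed = parse_indiv_id("ID041719FAR012")  # hypothetical identifier

# The Date column uses DD-mmm-YY; %b matches English month abbreviations
# case-insensitively under the default (C/English) locale.
field_date = datetime.strptime("17-apr-19", "%d-%b-%y").date()
```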
dissection_data.csv
This dataset provides information on individuals that were euthanized and dissected.
Variables:
- indiv_ID: Code identifying each individual, matching the unique identifier used across datasets (alphanumeric code indicating collection date, station, and individual number).
- wet_wt: Whole (wet) body weight of the individual, recorded as an integer value in grams.
- ovary_wt: Ovary weight of the individual, recorded in grams.
Sample size: n = 1330 individuals
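The two weight columns are the inputs to the gonadosomatic index (GSI) reported in the figure and table files. As a hedged illustration, the sketch below computes GSI as ovary mass expressed as a percentage of whole wet body mass. Both the percentage scaling and the use of whole body mass (rather than body mass minus ovary mass) are assumptions, since this README does not give the formula used in the paper; the records are invented.

```python
def gsi(ovary_wt, wet_wt):
    """Ovary mass as a percentage of whole wet body mass (assumed definition).

    Returns None when either weight is missing or the body weight is not positive.
    """
    if ovary_wt is None or wet_wt is None or wet_wt <= 0:
        return None
    return 100.0 * ovary_wt / wet_wt


# Invented records that follow the dissection_data.csv column layout.
records = [
    {"indiv_ID": "ID041719FAR012", "wet_wt": 150, "ovary_wt": 15.0},
    {"indiv_ID": "ID041719FAR013", "wet_wt": 120, "ovary_wt": None},
]
gsi_values = {r["indiv_ID"]: gsi(r["ovary_wt"], r["wet_wt"]) for r in records}
```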
entries_figure_2.csv
This dataset provides the values that are displayed in Figure 2 of the paper, which shows the proportion of Blueback Herring that are female (with standard deviation) versus calendar week, plotted by year, in Lower Region sites of the Connecticut River.
Variables:
- year: Calendar year in which sampling occurred, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- week: Week of the year in which sampling occurred, recorded as an integer value.
- prop_female: Proportion of individuals that are female in the sample, recorded as a decimal value between 0 and 1.
- sd_prop_female: Standard deviation of the proportion of females, recorded as a decimal value.
- n_indiv: Total number of individuals in the sample, recorded as an integer value.
entries_figure_3.csv
This dataset provides the values that are displayed in Figure 3 of the paper, which shows the proportion of Blueback Herring females that are in spawning condition (with standard deviation) versus calendar week, plotted by year, in Lower Region sites of the Connecticut River.
Variables:
- year: Calendar year in which sampling occurred, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- week: Week of the year in which sampling occurred, recorded as an integer value.
- prop_spawner: Proportion of individuals classified as spawners in the sample, recorded as a decimal value between 0 and 1.
- sd_prop_spawner: Standard deviation of the proportion of spawners, recorded as a decimal value.
- n_indiv: Total number of individuals in the sample, recorded as an integer value.
entries_figure_4A.csv
This dataset provides the values that are displayed in Figure 4A of the paper, which shows the mean and standard deviation of the gonadosomatic index of all Blueback Herring females versus calendar week, plotted by year, in Lower Region sites of the Connecticut River.
Variables:
- year: Calendar year in which sampling occurred, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- week: Week of the year in which sampling occurred, recorded as an integer value.
- n_females: Number of female individuals included in the sample, recorded as an integer value.
- GSI: Mean gonadosomatic index of females in the sample, recorded as a decimal value.
- SD_gsi: Standard deviation of the gonadosomatic index, recorded as a decimal value.
entries_figure_4B.csv
This dataset provides the values that are displayed in Figure 4B of the paper, which shows the mean and standard deviation of the gonadosomatic index of spawner Blueback Herring females versus calendar week, plotted by year, in Lower Region sites of the Connecticut River.
Variables:
- year: Calendar year in which sampling occurred, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- week: Week of the year in which sampling occurred, recorded as an integer value.
- n_females: Number of spawner female individuals included in the sample, recorded as an integer value.
- GSI: Mean gonadosomatic index of spawner females in the sample, recorded as a decimal value.
- SD_gsi: Standard deviation of the gonadosomatic index, recorded as a decimal value.
entries_fig_5.csv
This dataset provides the values that are displayed in Figure 5 of the paper, which represents five phenology metrics, as mean and standard error values of spawning week, plotted by year, in Lower Region sites of the Connecticut River. X axis locations represent different metrics of spawning activity, and estimates of mean week and SE are weighted differently for each: ‘all’ by CPUE of all Blueback Herring, ‘females’ by CPUE of all females, ‘spawners’ by CPUE of females in spawning condition, ‘all ovaries’ by ovary mass of all females, and ‘spawner ovaries’ by ovary mass of females in spawning condition.
Variables:
- year: Calendar year in which sampling occurred, recorded as a four-digit integer (e.g., 2019, 2021).
- n_weeksall: Number of weeks with data available for all individuals, recorded as an integer value.
- mean_week_all: Mean week weighted by abundance of all individuals, recorded as a decimal value.
- se_date_all: Standard error of the mean week weighted by abundance of all individuals, recorded as a decimal value.
- n_weeksallfem: Number of weeks with data available for all female individuals, recorded as an integer value.
- mean_week_allfem: Mean week weighted by abundance of all female individuals, recorded as a decimal value.
- se_date_allfem: Standard error of the mean week weighted by abundance of all female individuals, recorded as a decimal value.
- n_weeks_spawnerfem: Number of weeks with data available for spawning female individuals, recorded as an integer value.
- mean_week_spawnerfem: Mean week weighted by abundance of spawning female individuals, recorded as a decimal value.
- se_date_spawnerfem: Standard error of the mean week weighted by abundance of spawning female individuals, recorded as a decimal value.
- n_weeksallovary: Number of weeks with ovary data available for all females, recorded as an integer value.
- mean_week_allovary: Mean week of occurrence weighted by ovary data for all females, recorded as a decimal value.
- se_date_allovary: Standard error of the mean week of occurrence weighted by ovary data for all females, recorded as a decimal value.
- n_weeksovary: Number of weeks with ovary data available for spawning females, recorded as an integer value.
- mean_week_spawnerovary: Mean week of occurrence weighted by ovary data for spawning females, recorded as a decimal value.
- se_date_spawnerovary: Standard error of the mean week of occurrence weighted by ovary data for spawning females, recorded as a decimal value.
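Each mean week above is an abundance-weighted average: the week numbers are averaged with the chosen abundance metric as the weight, so weeks with higher abundance pull the mean toward them. A minimal sketch of that calculation follows; the weekly values are invented, and the standard-error calculation is omitted because its method is not described in this README.

```python
def weighted_mean_week(weeks, weights):
    """Return sum(week * weight) / sum(weight)."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(week * weight for week, weight in zip(weeks, weights)) / total


# Invented weekly series for one year.
weeks = [17, 18, 19, 20]
all_cpue = [10.0, 30.0, 40.0, 20.0]      # weights for the 'all' metric
spawner_cpue = [0.0, 10.0, 30.0, 60.0]   # weights for the 'spawners' metric

mean_week_all = weighted_mean_week(weeks, all_cpue)
mean_week_spawner = weighted_mean_week(weeks, spawner_cpue)
```

With the same sampling weeks, the two metrics give different mean weeks whenever abundance is distributed differently over the season, which is the comparison Figure 5 displays.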
entries_for_table_1.csv
This dataset provides the values that are displayed in Table 1 of the paper, which represents weekly abundance of adult Blueback Herring at four sites in the Connecticut River over four years.
Variables:
- year: Calendar year in which sampling occurred, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- week: Week of the year in which sampling occurred, recorded as an integer value.
- cpue: Catch per unit effort for the given station and week, recorded as a decimal value.
- sum_bbh: Total count of individuals captured for the given station and week, recorded as an integer value.
entries_table_2_and_table_T3.csv
This dataset provides the values that are displayed in Table 2 and Supplementary Table T3 of the paper, which represent temporal change in sex ratio among adult Blueback Herring in four sites of the Connecticut River over four years.
Variables:
- year: Calendar year associated with the model results, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region to which the model applies, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code for which the model was fit, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- Parameter: Model parameter name, recorded as text (values Intercept for the regression intercept, week for the slope).
- Estimate: Estimated coefficient value for the model parameter, recorded as a numeric value.
- Standard Error: Standard error of the parameter estimate, recorded as a decimal value.
- Wald Chi-Square: Wald chi-square test statistic for the parameter, recorded as a decimal value.
- Pr > ChiSq: P-value associated with the Wald chi-square test, recorded as a decimal value or threshold notation (e.g., <.0001).
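The Estimate, Standard Error, Wald Chi-Square, and Pr > ChiSq columns are related, which allows a quick consistency check on any row. The sketch below assumes the conventional single-parameter Wald test, in which the statistic is the squared ratio of the estimate to its standard error and is referred to a chi-square distribution with one degree of freedom; the numeric inputs are invented, and small differences from the tabulated values would be expected from rounding.

```python
import math


def wald_chi_square(estimate, standard_error):
    """Single-parameter Wald statistic: (estimate / SE) squared."""
    return (estimate / standard_error) ** 2


def p_value_df1(chi_square):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


# Invented example values, not taken from the data files.
stat = wald_chi_square(0.5, 0.2)
p = p_value_df1(stat)
```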
sample_size_entries_table_2_and_table_T3.csv
This dataset provides additional values that are displayed in Table 2 and Supplementary Table T3 of the paper, which represent temporal change in sex ratio among adult Blueback Herring in four sites of the Connecticut River over four years.
Variables:
- year: Calendar year in which sampling or analysis occurred, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- N(weeks): Number of weeks included in the dataset for the given station and year, recorded as an integer value.
- N(observations): Total number of observations collected for the given station and year, recorded as an integer value.
entries_table_3.csv
This dataset provides values that are displayed in Table 3 of the paper, which represents temporal change in reproductive status among female Blueback Herring in two sites of the Connecticut River over four years.
Variables:
- year: Calendar year associated with the model results, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region to which the model applies, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code for which the model was fit, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- Parameter: Model parameter name, recorded as text (values Intercept for the regression intercept, week for the slope).
- Estimate: Estimated coefficient value for the model parameter, recorded as a numeric value that may be positive or negative.
- Standard Error: Standard error of the parameter estimate, recorded as a decimal value.
- Wald Chi-Square: Wald chi-square test statistic for the parameter, recorded as a decimal value.
- Pr > ChiSq: P-value associated with the Wald chi-square test, recorded as a decimal value or threshold notation (e.g., <.0001).
sample_size_entries_table_3.csv
This dataset provides additional values that are displayed in Table 3 of the paper, which represents temporal change in reproductive status among female Blueback Herring in two sites of the Connecticut River over four years.
Variables:
- year: Calendar year in which sampling or analysis occurred, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- N(weeks): Number of weeks included in the dataset for the given station and year, recorded as an integer value.
- N(observations): Total number of observations collected for the given station and year, recorded as an integer value.
entries_Table_4_and_Table_T6.csv
This dataset provides values that are displayed in Table 4 and Supplementary Table T6 of the paper, which represent temporal change in reproductive investment among female Blueback Herring in four sites of the Connecticut River over four years.
Variables:
- year: Calendar year in which the analysis was conducted, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- Intercept: Estimated intercept value from the statistical model, recorded as a numeric value.
- seIntercept: Standard error of the intercept estimate, recorded as a decimal value.
- tIntercept: t-statistic associated with the intercept estimate, recorded as a numeric value.
- pIntercept: P-value associated with the intercept t-statistic, recorded as a decimal value.
- week: Estimated slope (effect) of week from the statistical model, recorded as a numeric value.
- seweek: Standard error of the week effect estimate, recorded as a decimal value.
- tweek: t-statistic associated with the week effect estimate, recorded as a numeric value.
- pweek: P-value associated with the week effect t-statistic, recorded as a decimal value.
- nweeks: Number of weeks included in the analysis for the given station and year, recorded as an integer value.
- nfemales: Total number of female individuals included in the analysis, recorded as an integer value.
entries_Table_T5.csv
This dataset provides values that are displayed in Supplementary Table T5 of the paper, which provides results of statistical modeling of temporal change in reproductive investment among female Blueback Herring in four sites of the Connecticut River over four years.
Variables:
- region: Sampling region for the analysis, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code where the data were collected, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- NAME: Name of the response variable used in the analysis of variance, recorded as text (e.g., GSI).
- SOURCE: Source of variation in the statistical model, recorded as text (e.g., ERROR, week, year, week*year).
- TYPE: Type of sum of squares reported for the effect, recorded as text (e.g., ERROR, SS3).
- DF: Degrees of freedom associated with the source of variation, recorded as an integer value.
- SS: Sum of squares associated with the source of variation, recorded as a decimal value.
- F: F-statistic for testing the effect, recorded as a decimal value; missing for error terms.
- PROB: P-value associated with the F-statistic, recorded as a decimal value or missing indicator.
entries_table_5_and_table_T8.csv
This dataset provides values that are displayed in Table 5 and Supplementary Table T8 of the paper, which represent temporal change in reproductive investment among spawning Blueback Herring in four sites of the Connecticut River over four years.
Variables:
- year: Calendar year in which the analysis was conducted, recorded as a four-digit integer (e.g., 2019).
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- Intercept: Estimated intercept value from the statistical model, recorded as a numeric value; missing values (.) indicate the model was not fit due to insufficient data.
- seIntercept: Standard error of the intercept estimate, recorded as a decimal value; missing values (.) indicate the model was not fit.
- tIntercept: t-statistic associated with the intercept estimate, recorded as a numeric value; missing values (.) indicate the model was not fit.
- pIntercept: P-value associated with the intercept t-statistic, recorded as a decimal value; missing values (.) indicate the model was not fit.
- week: Estimated slope (effect) of week from the statistical model, recorded as a numeric value; missing values (.) indicate the model was not fit.
- seweek: Standard error of the week effect estimate, recorded as a decimal value; missing values (.) indicate the model was not fit.
- tweek: t-statistic associated with the week effect estimate, recorded as a numeric value; missing values (.) indicate the model was not fit.
- pweek: P-value associated with the week effect t-statistic, recorded as a decimal value; missing values (.) indicate the model was not fit.
- nweeks: Number of weeks included in the analysis for the given station, region, and year, recorded as an integer value.
- nspawners: Total number of spawning individuals included in the analysis, recorded as an integer value.
entries_Table_T7.csv
This dataset provides values that are displayed in Supplementary Table T7 of the paper, which provides results of statistical modeling of temporal change in reproductive investment among spawning Blueback Herring in four sites of the Connecticut River over four years.
Variables:
- region: Sampling region for the analysis, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code where the data were collected, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- NAME: Name of the response variable used in the analysis of variance, recorded as text (e.g., GSI).
- SOURCE: Source of variation in the statistical model, recorded as text (e.g., ERROR, week, year, week*year).
- TYPE: Type of sum of squares reported for the effect, recorded as text (e.g., ERROR, SS3).
- DF: Degrees of freedom associated with the source of variation, recorded as an integer value.
- SS: Sum of squares associated with the source of variation, recorded as a decimal value.
- F: F-statistic for testing the effect, recorded as a decimal value; missing values (.) indicate error terms.
- PROB: P-value associated with the F-statistic, recorded as a decimal value or missing indicator (.).
fem_cpue_dataset_for_table_6_prediction_1.csv
This dataset provides source data that are subjected to permutation resampling designed to test prediction 1 in Table 6 of the paper, that mean spawning date based on the abundance of all individuals is earlier than mean spawning date based on the abundance of females only.
Variables:
- year: Calendar year in which sampling occurred, recorded as a four-digit integer (e.g., 2019).
- week: Week of the year in which sampling occurred, recorded as an integer value.
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- cpue: Catch per unit effort for all individuals at the given station and week, recorded as a decimal value.
- fem_cpue: Catch per unit effort for female individuals at the given station and week, recorded as a decimal value.
both_cpue_dataset_for_table_6_prediction_2.csv
This dataset provides source data that are subjected to permutation resampling designed to test prediction 2 in Table 6 of the paper, that mean spawning date based on the abundance of all females is later than mean spawning date based on the abundance of spawning females only.
Variables:
- year: Calendar year in which sampling occurred, recorded as a four-digit integer (e.g., 2019).
- week: Week of the year in which sampling occurred, recorded as an integer value.
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- fem_cpue: Catch per unit effort for female individuals at the given station and week, recorded as a decimal value.
- spawnerfem_cpue: Catch per unit effort for spawning female individuals at the given station and week, recorded as a decimal value.
fem_cpue2_dataset_for_table_6_prediction_3.csv
This dataset provides source data that are subjected to permutation resampling designed to test prediction 3 in Table 6 of the paper, that mean spawning date based on the abundance of all females is later than mean spawning date based on the abundance of their ovary mass.
Variables:
- year: Calendar year in which sampling occurred, recorded as a four-digit integer (e.g., 2019).
- week: Week of the year in which sampling occurred, recorded as an integer value.
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- fem_cpue: Catch per unit effort for female individuals at the given station and week, recorded as a decimal value.
- ovary_cpue: Catch per unit effort based on ovary data at the given station and week, recorded as a decimal value.
femovary_cpue3_dataset_for_table_6_prediction_4.csv
This dataset provides source data that are subjected to permutation resampling designed to test prediction 4 in Table 6 of the paper, that mean spawning date based on the abundance of all spawning females is later than mean spawning date based on the abundance of their ovary mass.
Variables:
- year: Calendar year in which sampling occurred, recorded as a four-digit integer (e.g., 2019).
- week: Week of the year in which sampling occurred, recorded as an integer value.
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- fem_cpue: Catch per unit effort for female individuals at the given station and week, recorded as a decimal value.
- spawnerovary_cpue: Catch per unit effort for spawning individuals based on ovary data at the given station and week, recorded as a decimal value.
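The four prediction datasets share one layout: week, station, and two abundance series whose weighted mean weeks are compared by permutation resampling. The authors' actual procedure is the SAS macro described in read_me_sas_macro_for_permutation_test.md. The sketch below is not that macro; it illustrates one possible scheme, a paired label-swapping permutation test, in which the two abundance values in each row are randomly exchanged and the difference in weighted mean week is recomputed. The function names, the number of permutations, and the rows are all assumptions made for illustration.

```python
import random


def weighted_mean_week(rows, weight_key):
    """Mean week weighted by the values stored under weight_key."""
    total = sum(row[weight_key] for row in rows)
    return sum(row["week"] * row[weight_key] for row in rows) / total


def permutation_p_value(rows, key_a, key_b, n_perm=2000, seed=1):
    """One-sided test that the mean week weighted by key_a is later than
    the mean week weighted by key_b (paired label-swapping scheme)."""
    rng = random.Random(seed)
    observed = weighted_mean_week(rows, key_a) - weighted_mean_week(rows, key_b)
    at_least_as_large = 0
    for _ in range(n_perm):
        shuffled = []
        for row in rows:
            a, b = row[key_a], row[key_b]
            if rng.random() < 0.5:  # swap the two labels within this row
                a, b = b, a
            shuffled.append({"week": row["week"], key_a: a, key_b: b})
        diff = (weighted_mean_week(shuffled, key_a)
                - weighted_mean_week(shuffled, key_b))
        if diff >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (at_least_as_large + 1) / (n_perm + 1)


# Invented rows in the layout of the prediction 2 dataset.
rows = [
    {"week": 17, "fem_cpue": 5.0, "spawnerfem_cpue": 4.0},
    {"week": 18, "fem_cpue": 10.0, "spawnerfem_cpue": 8.0},
    {"week": 19, "fem_cpue": 20.0, "spawnerfem_cpue": 10.0},
    {"week": 20, "fem_cpue": 30.0, "spawnerfem_cpue": 6.0},
    {"week": 21, "fem_cpue": 25.0, "spawnerfem_cpue": 2.0},
]
observed, p_value = permutation_p_value(rows, "fem_cpue", "spawnerfem_cpue")
```

A fixed seed makes the result repeatable; results from this sketch should not be expected to match Table 6, which depends on the authors' own resampling design.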
entries_table_T2.csv
This dataset provides values that are displayed in Supplementary Table T2 of the paper, which provides results of statistical modeling of temporal change in the relative abundance of Blueback Herring females in four sites of the Connecticut River over four years.
Variables:
- region: Sampling region to which the model applies, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code for which the model was fit, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- Parameter: Model parameter or effect name, recorded as text (e.g., Intercept, week, year, week*year).
- Estimate: Estimated coefficient value for the model parameter, recorded as a numeric value; zero or missing values may indicate reference levels.
- Standard Error: Standard error of the parameter estimate, recorded as a decimal value; missing values (.) indicate non-estimable parameters.
- Wald Chi-Square: Wald chi-square test statistic associated with the parameter, recorded as a decimal value; missing values (.) indicate non-estimable parameters.
- Pr > ChiSq: P-value associated with the Wald chi-square test, recorded as a decimal value or threshold notation (e.g., <.0001); missing values (.) indicate non-estimable parameters.
sample_size_entries_table_T2.csv
This dataset provides additional values that are displayed in Supplementary Table T2 of the paper, which provides results of statistical modeling of temporal change in the relative abundance of Blueback Herring females in four sites of the Connecticut River over four years.
Variables:
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- N(weeks): Number of weeks included in the dataset for the given station and region, recorded as an integer value.
- N(observations): Total number of observations collected for the given station and region, recorded as an integer value.
entries_table_T4.csv
This dataset provides values that are displayed in Supplementary Table T4 of the paper, which provides results of statistical modeling of temporal change in reproductive stage among female Blueback Herring in four sites of the Connecticut River over four years.
Variables:
- region: Sampling region to which the model applies, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code for which the model was fit, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- Parameter: Model parameter or effect name, recorded as text (e.g., Intercept, week, year, week*year).
- Estimate: Estimated coefficient value for the model parameter, recorded as a numeric value; zero or missing values may indicate reference levels.
- Standard Error: Standard error of the parameter estimate, recorded as a decimal value; missing values (.) indicate non-estimable parameters.
- Wald Chi-Square: Wald chi-square test statistic associated with the parameter, recorded as a decimal value; missing values (.) indicate non-estimable parameters.
- Pr > ChiSq: P-value associated with the Wald chi-square test, recorded as a decimal value or threshold notation (e.g., <.0001); missing values (.) indicate non-estimable parameters.
sample_size_entries_table_T4.csv
This dataset provides additional values that are displayed in Supplementary Table T4 of the paper, which provides results of statistical modeling of temporal change in reproductive stage among female Blueback Herring in four sites of the Connecticut River over four years.
Variables:
- region: Sampling region, recorded as an uppercase text label (values LOWER and UPPER).
- Station: Sampling station code, recorded as an uppercase text abbreviation (values CHI for Chicopee River, FAR for Farmington River, WES for Westfield River, WET for Wethersfield Cove).
- N(weeks): Number of weeks included in the dataset for the given station and region, recorded as an integer value.
- N(observations): Total number of observations collected for the given station and region, recorded as an integer value.
entries_table_T9A.csv
This dataset provides values that are displayed in Supplementary Table T9A of the paper, which provides statistics on intercorrelations among four metrics of reproductive activity at two sites of the Connecticut River over four years.
Variables:
- year: Calendar year associated with the correlation results, recorded as a four-digit integer (e.g., 2019).
- NAME: Name of the focal variable for the corresponding row in the correlation matrix, recorded as text (e.g., prop_female, prop_spawner, all_gsi).
- wet_prop_female: Correlation coefficient between the focal variable and the proportion of females at the Wethersfield Cove station, recorded as a decimal value between −1 and 1.
- wet_prop_spawner: Correlation coefficient between the focal variable and the proportion of spawners at the Wethersfield Cove station, recorded as a decimal value between −1 and 1.
- wet_all_gsi: Correlation coefficient between the focal variable and mean gonadosomatic index (GSI) for all individuals at the Wethersfield Cove station, recorded as a decimal value between −1 and 1.
- wet_spawner_gsi: Correlation coefficient between the focal variable and mean gonadosomatic index (GSI) for spawning individuals at the Wethersfield Cove station, recorded as a decimal value between −1 and 1.
- farprop_female: Correlation coefficient between the focal variable and the proportion of females at the Farmington River station, recorded as a decimal value between −1 and 1.
- farprop_spawner: Correlation coefficient between the focal variable and the proportion of spawners at the Farmington River station, recorded as a decimal value between −1 and 1.
- farall_gsi: Correlation coefficient between the focal variable and mean gonadosomatic index (GSI) for all individuals at the Farmington River station, recorded as a decimal value between −1 and 1.
- farspawner_gsi: Correlation coefficient between the focal variable and mean gonadosomatic index (GSI) for spawning individuals at the Farmington River station, recorded as a decimal value between −1 and 1.
entries_table_T9B.csv
This dataset provides values that are displayed in Supplementary Table T9B of the paper, which provides statistics on intercorrelations among four metrics of reproductive activity at two sites of the Connecticut River over four years.
Variables:
- year: Calendar year associated with the correlation results, recorded as a four-digit integer (e.g., 2019).
- NAME: Name of the focal variable for the corresponding row in the correlation matrix, recorded as text (e.g., prop_female, prop_spawner, all_gsi, spawner_gsi).
- wes_prop_female: Correlation coefficient between the focal variable and the proportion of females at the Westfield River station, recorded as a decimal value between −1 and 1; missing values (.) indicate insufficient data.
- wes_prop_spawner: Correlation coefficient between the focal variable and the proportion of spawners at the Westfield River station, recorded as a decimal value between −1 and 1; missing values (.) indicate insufficient data.
- wes_all_gsi: Correlation coefficient between the focal variable and mean gonadosomatic index (GSI) for all individuals at the Westfield River station, recorded as a decimal value between −1 and 1; missing values (.) indicate insufficient data.
- wes_spawner_gsi: Correlation coefficient between the focal variable and mean gonadosomatic index (GSI) for spawning individuals at the Westfield River station, recorded as a decimal value between −1 and 1; missing values (.) indicate insufficient data.
- chiprop_female: Correlation coefficient between the focal variable and the proportion of females at the Chicopee River station, recorded as a decimal value between −1 and 1.
- chiprop_spawner: Correlation coefficient between the focal variable and the proportion of spawners at the Chicopee River station, recorded as a decimal value between −1 and 1.
- chiall_gsi: Correlation coefficient between the focal variable and mean gonadosomatic index (GSI) for all individuals at the Chicopee River station, recorded as a decimal value between −1 and 1.
- chispawner_gsi: Correlation coefficient between the focal variable and mean gonadosomatic index (GSI) for spawning individuals at the Chicopee River station, recorded as a decimal value between −1 and 1.
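As a rough illustration of how the correlation tables might be loaded, the sketch below reads the columns described above, converts the missing-value marker (.) to `None`, and checks that every coefficient lies between -1 and 1. It assumes a comma-delimited file whose header row matches the variable names listed here; the function name `read_correlations` is hypothetical, and the two sample rows are placeholders, not values from the dataset.

```python
import csv
import io


def read_correlations(handle):
    """Read a correlation table; '.' becomes None, other cells become floats."""
    rows = []
    for record in csv.DictReader(handle):
        row = {"year": int(record["year"]), "NAME": record["NAME"]}
        for column, text in record.items():
            if column in ("year", "NAME"):
                continue
            text = text.strip()
            value = None if text in ("", ".") else float(text)
            # A correlation coefficient outside [-1, 1] signals a parsing problem.
            if value is not None and not -1.0 <= value <= 1.0:
                raise ValueError(f"{column}={value} is outside [-1, 1]")
            row[column] = value
        rows.append(row)
    return rows


# Placeholder rows for illustration only (NOT values from entries_table_T9B.csv).
sample = io.StringIO(
    "year,NAME,wes_prop_female,chiprop_female\n"
    "2019,prop_female,1,1\n"
    "2019,prop_spawner,.,0.25\n"
)
rows = read_correlations(sample)
print(rows[1])
```

To read the deposited file instead of the placeholder text, the same function could be called on `open("entries_table_T9B.csv", newline="")`.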
Code/Software
site map and inset.r
This script generated the map that is Figure 1 of the paper, showing the location of four sampled sites along the Connecticut River watershed.
RStudio version:
RStudio 2026.01.0+392 "Apple Blossom" Release (49fbea7a09a468fc4d1993ca376fd5b971cb58e3, 2026-01-04) for Windows
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) RStudio/2026.01.0+392 Chrome/140.0.7339.249 Electron/38.7.2 Safari/537.36, Quarto 1.8.25
Required R packages and their uses:
- dataRetrieval (version 2.7.17 or higher): Retrieval Functions for USGS and EPA Hydrology and Water Quality Data
- dplyr (version 1.1.4 or higher): A Grammar of Data Manipulation
- ggplot2 (version 4.0.1 or higher): Create Elegant Data Visualisations Using the Grammar of Graphics
- ggspatial (version 1.1.9 or higher): Spatial Data Framework for ggplot2
- ggstar (version 1.0.4 or higher): Multiple Geometric Shape Point Layer for 'ggplot2'
- lwgeom (version 0.2-14 or higher): Bindings to Selected 'liblwgeom' Functions for Simple Features
- maps (version 3.4.2.1 or higher): Draw Geographical Maps
- nhdplusTools (version 1.3.0 or higher): NHDPlus Tools
- rnaturalearth (version 1.0.1 or higher): World Map Data from Natural Earth
- rnaturalearthdata (version 1.0.0 or higher): World Vector Map Data from Natural Earth Used in 'rnaturalearth'
- sf (version 1.0-19 or higher): Simple Features for R
SAS_code_Figure_2.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses dataset indiv_data to generate estimates that are presented in Figure 2 of the paper, which shows the proportion (with standard deviation) of Blueback Herring that are female versus calendar week, plotted by year, in Lower Region sites of the Connecticut River.
SAS_code_Figure_3.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses dataset indiv_data to generate estimates that are presented in Figure 3 of the paper, which shows the proportion (with standard deviation) of Blueback Herring females that are spawners versus calendar week, plotted by year, in Lower Region sites of the Connecticut River.
SAS_code_Figure_4.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses datasets indiv_data and dissection_data to generate estimates that are presented in Figure 4 of the paper, which shows the mean and standard deviation of the gonadosomatic index of all Blueback Herring females and of spawner Blueback Herring females versus calendar week, plotted by year, in Lower Region sites of the Connecticut River.
SAS_code_Figure_5.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses datasets run_data, indiv_data, and dissection_data to generate estimates that are presented in Figure 5 of the paper, which represents five phenology metrics, as mean and standard error values of spawning week, plotted by year, in Lower Region sites of the Connecticut River.
SAS_code_table_1.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses dataset run_data to generate estimates that are presented in Table 1 of the paper, which represents weekly abundance of adult Blueback Herring at four sites in the Connecticut River over four years.
SAS_code_for_Table_2_and_Table_T3.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses dataset indiv_data to generate estimates that are presented in Table 2 and Supplementary Table T3 of the paper, which represent temporal change in sex ratio among adult Blueback Herring in four sites of the Connecticut River over four years.
SAS_code_table_3.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses dataset indiv_data to generate estimates that are presented in Table 3 of the paper, which represents temporal change in reproductive status among female Blueback Herring in two sites of the Connecticut River over four years.
SAS_code_for_Table_4_and_Table_T5_and_table_T6.MD
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses datasets indiv_data and dissection_data to generate estimates that are presented in Table 4 and Supplementary Tables T5 and T6 of the paper, which represent temporal change in reproductive investment among female Blueback Herring in four sites of the Connecticut River over four years and results of statistical modeling of the change.
SAS_code_for_Table_5_and_table_T7_and_table_T8.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses datasets indiv_data and dissection_data to generate estimates that are presented in Table 5 and Supplementary Tables T7 and T8 of the paper, which represent temporal change in reproductive investment among spawner female Blueback Herring in four sites of the Connecticut River over four years and results of statistical modeling of the change.
read_me_sas_macro_for_permutation_test.md
This text file provides information on how the permutation test of predictions in Table 6 of the paper was executed, using the macro utility in SAS.
SAS_code_for_generating_source_data_table_6_prediction_1.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses the datasets run_data and indiv_data to generate the dataset fem_cpue_dataset_for_table_6_prediction_1 that is permuted for a randomization test of prediction 1 in Table 6 of the paper that mean spawning date based on the abundance of all individuals is earlier than mean spawning date based on the abundance of females only.
SAS_code_for_table_6_prediction_1.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses dataset fem_cpue_dataset_for_table_6_prediction_1 to conduct a randomization test of prediction 1 in Table 6 of the paper, that mean spawning date based on the abundance of all individuals is earlier than mean spawning date based on the abundance of females only.
SAS_code_for_generating_source_data_table_6_prediction_2.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses the datasets run_data and indiv_data to generate the dataset both_cpue_dataset_for_table_6_prediction_2 that is permuted for a randomization test of prediction 2 in Table 6 of the paper that mean spawning date based on the abundance of all females is later than mean spawning date based on the abundance of spawning females only.
SAS_code_for_table_6_prediction_2.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses dataset both_cpue_dataset_for_table_6_prediction_2 to conduct a randomization test of prediction 2 in Table 6 of the paper, that mean spawning date based on the abundance of all females is later than mean spawning date based on the abundance of spawning females only.
SAS_code_for_generating_source_data_table_6_prediction_3.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses the datasets run_data, indiv_data, and dissection_data to generate the dataset fem_cpue2_dataset_for_table_6_prediction_3 that is permuted for a randomization test of prediction 3 in Table 6 of the paper that mean spawning date based on the abundance of all females is later than mean spawning date based on the abundance of their ovary mass.
SAS_code_for_table_6_prediction_3.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses dataset fem_cpue2_dataset_for_table_6_prediction_3 to test prediction 3 in Table 6 of the paper that mean spawning date based on the abundance of all females is later than mean spawning date based on the abundance of their ovary mass.
SAS_code_for_generating_source_data_table_6_prediction_4.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses the datasets run_data, indiv_data, and dissection_data to generate the dataset femovary_cpue3_dataset_for_table_6_prediction_4 that is permuted for a randomization test of prediction 4 in Table 6 of the paper that mean spawning date based on the abundance of all spawning females is later than mean spawning date based on the abundance of their ovary mass.
SAS_code_for_table_6_prediction_4.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses dataset femovary_cpue3_dataset_for_table_6_prediction_4 to test prediction 4 in Table 6 of the paper that mean spawning date based on the abundance of all spawning females is later than mean spawning date based on the abundance of their ovary mass.
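For readers unfamiliar with randomization tests, the general idea behind the Table 6 programs can be sketched in Python as follows. This is an illustration under stated assumptions, not a translation of the SAS macro: it assumes that mean spawning date is an abundance-weighted mean of calendar week, and that permutation swaps the two abundance metrics within each week at random. The function names, the swapping scheme, and the example numbers are all hypothetical; the deposited SAS programs and read_me_sas_macro_for_permutation_test.md remain the authoritative description.

```python
import random


def weighted_mean_week(weeks, abundance):
    """Abundance-weighted mean of calendar week (assumed definition)."""
    total = sum(abundance)
    if total == 0:
        raise ValueError("abundance sums to zero")
    return sum(w * a for w, a in zip(weeks, abundance)) / total


def permutation_test_earlier(weeks, abund_a, abund_b, n_perm=999, seed=1):
    """One-sided test that metric A gives an earlier mean week than metric B.

    Returns (observed difference A - B, p-value). Within each week the two
    metrics are swapped with probability 0.5 (an assumed permutation scheme).
    A permuted series that sums to zero raises ValueError.
    """
    rng = random.Random(seed)
    observed = weighted_mean_week(weeks, abund_a) - weighted_mean_week(weeks, abund_b)
    as_extreme = 0
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for a, b in zip(abund_a, abund_b):
            if rng.random() < 0.5:
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        diff = weighted_mean_week(weeks, perm_a) - weighted_mean_week(weeks, perm_b)
        if diff <= observed:  # at least as early as the observed difference
            as_extreme += 1
    # Count the observed arrangement itself so that p is never exactly zero.
    return observed, (as_extreme + 1) / (n_perm + 1)


# Hypothetical abundances by week, for illustration only.
weeks = [18, 19, 20, 21]
obs, p = permutation_test_earlier(weeks, [9, 5, 2, 1], [1, 2, 5, 9], n_perm=199)
print(obs, p)
```

A negative observed difference means metric A yields the earlier mean week; the p-value is the share of permuted arrangements with a difference at least that negative.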
SAS_code_for_table_T2.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses the dataset indiv_data to model temporal change in the relative abundance of Blueback Herring females in four sites of the Connecticut River over four years, results of which are presented in Supplementary Table T2 of the paper.
SAS_code_for_table_T4.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses the dataset indiv_data to model temporal change in the reproductive stage of Blueback Herring females in four sites of the Connecticut River over four years, results of which are presented in Supplementary Table T4 of the paper.
SAS_code_for_table_T9.md
This file is a program in SAS for Windows, SAS 9.4 TS Level 1M7. It uses the datasets indiv_data and dissection_data to generate statistics on intercorrelations among four metrics of reproductive activity at four sites of the Connecticut River over four years, which are presented in Supplementary Table T9 of the paper.
Contact Information
For questions about this dataset or code, please contact:
Eric Schultz
Email: eric.schultz@uconn.edu
We collected Blueback Herring (Alosa aestivalis) along the Connecticut River in northern Connecticut and southern Massachusetts in 2019, 2021, 2022 and 2023, following the same sampling scheme each year. Four known spawning areas served as fixed sampling sites: Wethersfield Cove (WET) and Lower Farmington River (FAR), designated as the LOWER region, and Lower Westfield River (WES) and Lower Chicopee River (CHI), designated as the UPPER region. At each site visit, a field crew conducted daytime sampling from an electrofishing boat (Smith-Root Model SR-18) using standardized methods; multiple electroshocking runs were conducted during each site visit. We collected basic data on Blueback Herring captured in all runs. We held captured individuals briefly in a live well; before moving to the next site, we measured and sexed them and scored females for reproductive stage based on external appearance and expression of gametes. Up to 80 fish per day were euthanized, put on ice and transported to the lab for further processing the next day.
