Data from: Individual weight estimates for Great Lakes benthic invertebrates

Hrycik, Allison; Burlakova, Lyubov; Karatayev, Alexander; Daniel, Susan; Dermott, Ronald; Tarbell, Morgan; Hinchey, Elizabeth

Published Aug 01, 2024 on Dryad. https://doi.org/10.5061/dryad.tx95x6b42

Data files

Aug 01, 2024 version files (35.76 KB total)

IndividualWeights_AllData.csv (17.84 KB)
IndividualWeights_MajorGroups.csv (9.49 KB)
README.md (3.98 KB)
SpeciesList.csv (4.45 KB)

Abstract

We present mean individual weights for common benthic invertebrates of the Great Lakes, derived from over 2,000 benthic samples collected over eight years (2012-2019), both as species-specific weights and as average weights of larger taxonomic groups of interest. The dataset is applicable to food web energy flow models, calculation of secondary production estimates, interpretation of trophic markers, and understanding how biomass distribution varies by benthic invertebrate species in the Great Lakes. A corresponding data paper describes comparisons of these data to benthic invertebrates in other lakes.

The files in this data set report individual weights of benthic invertebrates in the Laurentian Great Lakes.
See the companion data manuscript for calculation methods. All files are formatted as comma-separated values, and missing data are denoted with "NA".

Description of the data and file structure

Description for file "IndividualWeights_AllData.csv"
This file gives summary statistics of individual weights for taxa at the finest possible taxonomic resolution, with separate entries by lake, depth zone, and/or basin where necessary. The column headers are as follows:
Name (character string) = Name of the taxonomic unit represented. Species in most cases.
Lake (character string) = Great Lake(s) represented in grouping. "All" indicates that all available data from the Great Lakes were combined.
DepthZone_m (character string) = Depth zone represented in grouping, either >=70 m or < 70 m. "All" indicates that data from all lake depths were combined.
Basin (character string) = Lake basin represented in grouping. "All" indicates that data from all lake basins are included.
AverageIndividualWeight_g (numeric) = Mean weight of an individual benthic invertebrate of the given taxonomic unit in grams.
SE_weight (numeric) = Standard error of mean individual weight.
MedianIndividualWeight_g (numeric) = Median weight of an individual benthic invertebrate of the given taxonomic unit in grams.
MinimumIndividualWeight_g (numeric) = Minimum weight of an individual benthic invertebrate of the given taxonomic unit in grams.
MaximumIndividualWeight_g (numeric) = Maximum weight of an individual benthic invertebrate of the given taxonomic unit in grams.
SampleSize_Nsamples (integer) = Sample size for a given taxonomic unit. Represents the number of samples in which the taxonomic unit was present.
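As a rough sketch of how the columns above might be read, the following Python snippet parses rows with the standard library and converts "NA" to None. The two data rows are invented for illustration and are not taken from the file; the helper names (`parse_row`, `read_weights`) and the exact strings used in the Lake, DepthZone_m, and Basin columns are assumptions.

```python
import csv
import io

# Hypothetical excerpt for illustration only; real values are in
# IndividualWeights_AllData.csv.
SAMPLE = """Name,Lake,DepthZone_m,Basin,AverageIndividualWeight_g,SE_weight,MedianIndividualWeight_g,MinimumIndividualWeight_g,MaximumIndividualWeight_g,SampleSize_Nsamples
Example taxon A,All,All,All,0.0012,0.0001,0.0010,0.0002,0.0051,42
Example taxon B,Erie,All,All,0.0150,NA,0.0150,0.0150,0.0150,NA
"""

NUMERIC_COLUMNS = [
    "AverageIndividualWeight_g",
    "SE_weight",
    "MedianIndividualWeight_g",
    "MinimumIndividualWeight_g",
    "MaximumIndividualWeight_g",
]


def parse_row(row):
    """Convert one CSV row to typed values, mapping "NA" to None."""
    out = dict(row)
    for col in NUMERIC_COLUMNS:
        out[col] = None if row[col] == "NA" else float(row[col])
    n = row["SampleSize_Nsamples"]
    out["SampleSize_Nsamples"] = None if n == "NA" else int(n)
    return out


def read_weights(text_stream):
    """Read all rows from an open text stream of the CSV."""
    return [parse_row(r) for r in csv.DictReader(text_stream)]


rows = read_weights(io.StringIO(SAMPLE))

# With the real file, one might instead write:
# with open("IndividualWeights_AllData.csv", newline="") as f:
#     rows = read_weights(f)
```

The same reader should also work for "IndividualWeights_MajorGroups.csv", since both files share these column headers.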
-----
Description for file "IndividualWeights_MajorGroups.csv"
This file gives summary statistics for major taxonomic groups of benthic macroinvertebrates by lake and depth zone and, for Dreissena, by the three basins of Lake Erie. The column headers are as follows:
Name (character string) = Name of the taxonomic unit represented.
Lake (character string) = Great Lake(s) represented in grouping. "All" indicates that all available data from the Great Lakes were combined.
DepthZone_m (character string) = Depth zone represented in grouping, either >=70 m or < 70 m. "All" indicates that data from all lake depths were combined.
Basin (character string) = Lake basin represented in grouping. "All" indicates that data from all lake basins are included.
AverageIndividualWeight_g (numeric) = Mean weight of an individual benthic invertebrate of the given taxonomic unit in grams.
SE_weight (numeric) = Standard error of mean individual weight.
MedianIndividualWeight_g (numeric) = Median weight of an individual benthic invertebrate of the given taxonomic unit in grams.
MinimumIndividualWeight_g (numeric) = Minimum weight of an individual benthic invertebrate of the given taxonomic unit in grams.
MaximumIndividualWeight_g (numeric) = Maximum weight of an individual benthic invertebrate of the given taxonomic unit in grams.
SampleSize_Nsamples (integer) = Sample size for a given taxonomic unit. Represents the number of samples in which the taxonomic unit was present.
-----
Description for file "SpeciesList.csv"
This file shows which taxa were grouped for calculations. The column headers are as follows:
Name (character string) = Name of the taxonomic unit represented following the format “Genus species” or “Genus species subspecies” if subspecies is known. Abbreviations: sp. = unknown species; spp. = unknown species but genus likely encompasses multiple species.
Group (character string) = Higher taxonomic group used to classify the species for calculations in file "IndividualWeights_MajorGroups.csv".
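The Name-to-Group mapping described above can be turned into a lookup of taxa per major group. The sketch below uses only the Python standard library; the rows are invented placeholders rather than entries from the file, and `taxa_by_group` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows for illustration only; real entries are in SpeciesList.csv.
SAMPLE = """Name,Group
Example genus sp.,Chironomidae
Another genus spp.,Chironomidae
Example worm species,Oligochaeta
"""


def taxa_by_group(text_stream):
    """Return {Group: [Name, ...]} preserving the order of rows in the file."""
    groups = defaultdict(list)
    for row in csv.DictReader(text_stream):
        groups[row["Group"]].append(row["Name"])
    return dict(groups)


groups = taxa_by_group(io.StringIO(SAMPLE))
```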

Data Collection

Benthic invertebrates were collected from the EPA R/V Lake Guardian from 2012 to 2019 as part of the EPA Great Lakes National Program Office Great Lakes Biology Monitoring Program (GLBMP) and Cooperative Science and Monitoring Initiative (CSMI) benthic surveys. GLBMP samples are collected in all five Great Lakes annually, whereas CSMI samples are collected in one of the Great Lakes each year. GLBMP includes 57-63 stations each year: 11 in Lake Superior (plus 2-7 additional stations since 2014), 11 in Lake Huron, 16 in Lake Michigan, 10 in Lake Erie, and 10 (9 since 2015) in Lake Ontario. The number of CSMI stations varies by year. CSMI surveys took place in the following years: Erie 2014 (97 stations), Michigan 2015 (140 stations), Superior 2016 (59 stations), Huron 2017 (118 stations), and Ontario 2018 (46 stations). Additional CSMI surveys have occurred since 2019; however, we did not include these data in our analysis because sampling would be unbalanced, with some lakes sampled twice and others only once. We followed the EPA Standard Operating Procedure for Benthic Invertebrate Field Sampling, SOP LG406 (U.S. EPA, 2021). In short, triplicate samples were collected at each station using a Ponar grab (sampling area = 0.0523 m² for all surveys except Lake Michigan CSMI, for which sampling area = 0.0483 m²), then rinsed through 500 µm mesh. Samples were preserved with 5-10% neutral buffered formalin with Rose Bengal stain.

Lab Processing

Samples were processed in the lab after preservation following the EPA Standard Operating Procedure for Benthic Invertebrate Laboratory Analysis, SOP LG407 (U.S. EPA, 2015). Briefly, organisms were picked out of samples using a low-magnification dissecting microscope, and each organism was then identified to the finest taxonomic resolution possible (usually species). Individuals of the same species or size category were blotted dry on cellulose filter paper to remove external water until the wet spots left by the animal(s) on the absorbent paper disappeared. Blotting time varied with the surface area/volume ratio of the organisms but was approximately one minute for large and medium chironomids and oligochaetes and less (0.6 min) for smaller chironomids and oligochaetes. Care was taken to ensure that the procedure did not damage the specimens. Larger organisms (e.g., dreissenids) often took longer to blot dry. All organisms in a sample within a given taxonomic unit were weighed together to the nearest 0.0001 g wet weight (WW). Dreissena were weighed by 5 mm size category (size fractions: 0-4.99 mm, 5-9.99 mm, etc.) to the nearest 0.0001 g (shell and tissue WW).
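The 5 mm Dreissena size fractions can be illustrated with a small Python helper that maps a shell length to its fraction label. This is only a sketch: the function name, the label format (modeled on "0-4.99 mm" above), and the assumption that lengths are recorded in millimeters are not taken from the SOP.

```python
def size_fraction_label(shell_length_mm, bin_width_mm=5):
    """Return the size-fraction label (e.g. "5-9.99 mm") for a shell length.

    Assumes non-negative lengths in mm and fractions of equal width
    starting at 0, as in the examples given in the text.
    """
    if shell_length_mm < 0:
        raise ValueError("shell length must be non-negative")
    lower = int(shell_length_mm // bin_width_mm) * bin_width_mm
    upper = lower + bin_width_mm - 0.01
    return f"{lower}-{upper:.2f} mm"


labels = [size_fraction_label(x) for x in (3.2, 5.0, 12.7)]
```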

Data Analysis

To calculate the total weight of each species that was mounted on slides by size group for identification (e.g., Oligochaeta, Chironomidae), we multiplied the number of individuals of the species in each size category by the average weight of individuals in that category. If a species was found in more than one size category, we summed its weight across all categories within each sample. Oligochaetes often fragment in samples and were therefore counted by tallying the number of oligochaete heads (anterior ends with prostomium) present in the sample. Oligochaete fragments were also counted and weighed for inclusion in biomass calculations. We calculated individual weights only for taxonomic units found in at least ten samples (see companion data paper for details); species found in fewer than ten samples were excluded from the analysis.
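The size-category calculation and the ten-sample cutoff can be sketched in Python as follows. The category names, counts, and weights are invented for illustration, and the function names are hypothetical; only the arithmetic (count times mean category weight, summed across categories) and the cutoff of ten samples come from the description above.

```python
MIN_SAMPLES = 10  # minimum number of samples required for a taxonomic unit


def species_weight_in_sample(counts_by_category, mean_weight_by_category):
    """Total weight (g) of one species in one sample.

    For each size category, multiply the number of individuals of the
    species by the mean individual weight in that category, then sum.
    """
    return sum(
        count * mean_weight_by_category[category]
        for category, count in counts_by_category.items()
    )


def meets_cutoff(n_samples):
    """True if the taxonomic unit occurs in enough samples to be analyzed."""
    return n_samples >= MIN_SAMPLES


# Illustrative values only: 4 individuals in "small", 1 in "large".
counts = {"small": 4, "large": 1}
mean_weights_g = {"small": 0.0002, "medium": 0.0008, "large": 0.0030}
total_g = species_weight_in_sample(counts, mean_weights_g)
# 4 * 0.0002 + 1 * 0.0030 = 0.0038 g
```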

We calculated wet weights by species whenever possible. If species were closely related, had similar body size (based on our previous experience), and were found in few samples, they were grouped together to achieve our minimum sample size of ten. For some taxa (e.g., Chironomidae), individual species could not be identified, so calculations were made at the finest taxonomic resolution possible (usually genus). We hereafter refer to both kinds of grouping (closely related species grouped together, and taxa that could not be identified to species) as “taxonomic units.” For each taxonomic unit, we calculated several summary statistics on wet weight: mean, standard error of the mean, median, minimum, maximum, and sample size (number of samples in which the taxonomic unit was present).
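A minimal Python sketch of these summary statistics is given below. It assumes that the input is one mean individual weight per sample for a taxonomic unit and that the standard error is the sample standard deviation divided by the square root of the number of samples; the source does not state the exact formula, and the input values are invented.

```python
import math
import statistics


def summarize(weights_g):
    """Summary statistics for per-sample individual weights (g).

    Assumption: SE = sample standard deviation / sqrt(n).
    """
    n = len(weights_g)
    se = statistics.stdev(weights_g) / math.sqrt(n) if n > 1 else None
    return {
        "mean": statistics.fmean(weights_g),
        "se": se,
        "median": statistics.median(weights_g),
        "min": min(weights_g),
        "max": max(weights_g),
        "n_samples": n,
    }


# Illustrative values only (real analyses required at least ten samples).
stats = summarize([0.001, 0.002, 0.003])
```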

Because we expected species weight to differ by depth zone and/or lake, we performed Kruskal-Wallis tests (Kruskal & Wallis, 1952) to determine whether individuals within a species should be separated by depth zone and/or lake. Tests were run only when sample size was large enough to permit splitting (species found in ≥10 samples per group). In all five Great Lakes, benthic density and species richness are greater at stations ≤70 m than at stations deeper than 70 m (Burlakova et al., 2018; Cook & Johnson, 1974). The 70 m depth contour separation of benthos mirrors a breakpoint in spring chlorophyll concentrations observed for these stations, suggesting that lake productivity is likely the major driver of benthic abundance and diversity across lakes (Burlakova et al., 2018). Therefore, we used two depth zones: ≤70 m and >70 m. If Kruskal-Wallis tests showed that weights did not differ by lake or depth, the average weight for a species was calculated across all lakes and depths. If Kruskal-Wallis tests showed significant separation (α = 0.05) by lake or depth, means were calculated for each group and the group means were compared. Individuals in different lakes or depth zones were combined if the difference in means between most groups was less than 25%, even when Kruskal-Wallis tests were significant, because small differences were likely not biologically significant. Oligochaete fragments for finer taxonomic units were reported separately from oligochaete species because it was rarely apparent which species the fragments came from.
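The decision rule above can be illustrated for the simplest case of two groups (for example, the two depth zones). The Python sketch below computes the Kruskal-Wallis H statistic without tie correction and, because two groups give one degree of freedom, obtains the chi-square p-value from the complementary error function. It is not the authors' code: the function names are hypothetical, the percent difference is assumed to be taken relative to the larger mean, and the source's "most groups" criterion is reduced here to a single pairwise comparison.

```python
import math


def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction) for a list of groups."""
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


def should_combine(group_a, group_b, alpha=0.05, max_rel_diff=0.25):
    """Combine two groups if the test is not significant or means differ <25%."""
    h = max(kruskal_wallis_h([group_a, group_b]), 0.0)
    p_value = math.erfc(math.sqrt(h / 2))  # chi-square survival, 1 df
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    rel_diff = abs(mean_a - mean_b) / max(mean_a, mean_b)  # assumed definition
    return p_value >= alpha or rel_diff < max_rel_diff


# Illustrative weights (g): clearly separated ranks but similar means.
shallow = [1.00 + 0.01 * i for i in range(10)]
deep = [1.10 + 0.01 * i for i in range(10)]
combine = should_combine(shallow, deep)
```

In this invented example the test is significant, but the means differ by less than 25%, so the two groups would be combined under the rule described above. For more than two groups, the p-value would need a chi-square distribution with k - 1 degrees of freedom, which this sketch does not cover.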

Mean individual wet weights were calculated for a total of 187 groupings within taxonomic units (data file “IndividualWeights_AllData.csv”). For 117 taxonomic units, weights were calculated across all lakes, depths, and basins because weights were similar in all regions or because of small sample size; for seven taxonomic units, weights were calculated by lake; and for the rest, summary statistics were calculated by both lake and depth zone. In addition, five species were treated as “special cases” where some areas were similar while others were not. For example, some species had similar weights in multiple lakes, so those lakes were grouped together while others were kept separate. Dreissena rostriformis bugensis weights were calculated by lake and depth zone except for Lake Erie, where the western, central, and eastern basins were separated because previous research demonstrated that D. rostriformis bugensis size structure is drastically different in each of Lake Erie’s basins (Karatayev et al., 2021). Other special cases were: Heterotrissocladius marcidus group (Huron, Michigan, and Ontario were similar and grouped together, while mean weight in Lake Superior was different), Pisidium spp. (grouped as Ontario/Michigan, Erie, and Huron/Superior), Unidentified Chironomidae (Lake Erie was separated and all other lakes were grouped together), and Spirosperma ferox (Lake Erie was separated and all other lakes were grouped together).

To calculate mean individual weights for commonly reported larger taxonomic groups (e.g., Oligochaeta, Chironomidae), we combined the species or taxonomic units that belonged to each group (see “SpeciesList.csv” for information on groupings). Summary statistics were calculated on the mean individual weight of all individuals within a group in a given sample, i.e., total biomass for a given group was divided by total density for that group, repeated for each sample. Results are given for each major group as a mean/minimum/maximum for each lake and for each depth zone within each lake, because groups are often made up of different species with different body sizes in each lake and depth zone. Because densities of oligochaetes were based on the number of oligochaetes with heads in a sample (excluding fragments), but the fragments were weighed to calculate biomass, the mean individual weight for oligochaetes within a sample was calculated by dividing the weight of all oligochaetes (including fragments) in the sample by the number of oligochaetes (not including fragments).
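The per-sample calculation for major groups, including the oligochaete special case, can be sketched in Python as follows. The function and parameter names are hypothetical and the numbers are invented; biomass and count are assumed to be expressed on the same basis (both per sample, or both per square meter), so the ratio is a weight per individual.

```python
def mean_individual_weight(total_biomass_g, total_count):
    """Mean individual weight (g) of a major group in one sample."""
    if total_count <= 0:
        raise ValueError("group must be present in the sample")
    return total_biomass_g / total_count


def oligochaete_mean_weight(headed_weight_g, fragment_weight_g, head_count):
    """Oligochaete mean individual weight in one sample.

    Fragments contribute to the weight (numerator) but not to the
    count (denominator), which is the number of worms with heads.
    """
    return mean_individual_weight(headed_weight_g + fragment_weight_g, head_count)


# Illustrative values only: 6 worms with heads weighing 0.0100 g in total,
# plus 0.0020 g of fragments.
w = oligochaete_mean_weight(0.0100, 0.0020, 6)
# (0.0100 + 0.0020) / 6 = 0.0020 g per individual
```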

Calculations of mean individual weight by major group were performed both by lake and by lake plus depth zone. Summary statistics were reported for 14 major taxa and were broken down by depth zone when sample size was sufficient (data file “IndividualWeights_MajorGroups.csv”).

REFERENCES

Burlakova, L. E., Barbiero, R. P., Karatayev, A. Y., Daniel, S. E., Hinchey, E. K., & Warren, G. J. (2018). The benthic community of the Laurentian Great Lakes: Analysis of spatial gradients and temporal trends from 1998 to 2014. Journal of Great Lakes Research, 44(4), 600–617. https://doi.org/10.1016/j.jglr.2018.04.008

Cook, D. G., & Johnson, M. G. (1974). Benthic Macroinvertebrates of the St. Lawrence Great Lakes. Journal of the Fisheries Research Board of Canada, 31, 763–782. https://doi.org/10.1139/f74-101

Karatayev, A. Y., Burlakova, L. E., Mehler, K., Hinchey, E. K., Wick, M., Bakowska, M., & Mrozinska, N. (2021). Rapid assessment of Dreissena population in Lake Erie using underwater videography. Hydrobiologia, 848(9), 2421–2436. https://doi.org/10.1007/s10750-020-04481-x

Kruskal, W. H., & Wallis, W. A. (1952). Use of Ranks in One-Criterion Variance Analysis. Journal of the American Statistical Association, 47(260), 583–621. https://doi.org/10.1080/01621459.1952.10483441

U.S. EPA. (2015). Standard Operating Procedure for Benthic Invertebrate Laboratory Analysis. (LG 407, Version 09). https://www.epa.gov/sites/default/files/2017-01/documents/sop-for-benthic-invertebrate-lab-analysis-201504-13pp.pdf

U.S. EPA. (2021). Standard Operating Procedure for Benthic Invertebrate Field Sampling. (LG 406, Version 14). https://www.epa.gov/system/files/documents/2021-08/lg406.v14-benthos-sampling_rfa.pdf
