Absolute fish population censuses in ponds demonstrate eDNA metabarcoding provides biodiversity estimates comparable to conventional sampling methods
Data files
Sep 29, 2025 version files (6.36 GB total)
- Drained_Ponds_Raw_Data.csv (2 KB)
- Master_eDNA_sample_datasheet_.csv (1.32 MB)
- MetabarcodingMasterDataFileReadCounts.csv (55.47 KB)
- R_2023_09_26_09_52_47_user_GSS5-0202-54-400bp_850_GF_5_9_12PS1_Sept25_2023.530.tar.bz2 (3.07 GB)
- R_2023_09_26_15_33_31_user_GSS5-0202-55-400bp_850_GF_6_13_12MMito_Sept25_2023.530.tar.bz2 (3.29 GB)
- README.md (18.42 KB)
- Seine.Electrofishing_Raw_Data.csv (5.36 KB)
Abstract
Freshwater fishes are experiencing an unprecedented decline. Effective strategies for estimating the composition of fish communities are crucial for conservation efforts. While conventional physical collection methods can be effective, environmental DNA analysis has emerged as a potential alternative. Systematic comparisons of these methods are critical to evaluate their relative effectiveness. We collected water samples from 14 ponds near Hamilton, ON, which were analyzed for fish eDNA using two universal metabarcoding primers, and sampled using conventional methods (electrofishing and/or seining). A subset of ponds were then drained to obtain population census counts, facilitating a standardized comparison of false negatives and positives across conventional and eDNA sampling. We found no significant difference between survey methods in the number of species detected, although eDNA was more effective at detecting species at extremely low abundances, but was prone to false positives. We estimate that 11 eDNA samples should be sufficient to quantify fish community biodiversity in similar ecosystems. eDNA represents a viable alternative or complement to conventional methods, reinforcing its potential for enhancing aquatic ecosystem management.
https://doi.org/10.5061/dryad.w9ghx3g07
Description of the data and file structure
I. General Information
Data collection dates: 2021 - 2023
Geographic location of data collection: Hamilton, ON, CA & Windsor, ON, CA
Files and variables
II. Data overview and methodological information
File: Drained_Ponds_Raw_Data.csv
Date of creation: Feb 9, 2024
The dataset contains 1 tab page. Empty cells indicate that a species was not detected at a specific size class within a pond.
- Pond_ID: Unique identifier for each pond that was drained. These IDs are used across all datasets. Note that Pond 104 was sampled in two different years; the year, (2021) or (2022), is given next to the ID.
- Forebay/Main Cell: Indicates whether the pond had a forebay that was drained to help determine where fish were found within a pond. If a pond did not have a forebay it was not included (see 5A).
- Species: Common name of the species collected after draining.
- YOY 1 - 3: Individual collected was considered a 'young-of-the-year'. YOY classes were divided by fork length measurements where: YOY1 = <25mm, YOY2 = 25-43mm, YOY3 = >43-50mm.
- size_1 - 6: Individual fork lengths for each collected fish of that species. Size classes were divided by fork length where: size_1 = >50mm, size_2 = >50-75mm, size_3 = >75-100mm, size_4 = >100-150mm, size_5 = >150-200mm, size_6 = >200mm.
- Not specified: Individuals of that species that were collected during draining but whose fork length was not measured.
- Total N by Species: Cumulative count of that species over all size/YOY classes.
- Date: Date of collection following draining of the respective pond.
- Comments: Any relevant information regarding the draining/collection/species.
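The relationship between the class columns and Total N by Species can be checked programmatically. The following is a minimal sketch, not part of the original workflow. It assumes that each YOY/size cell holds a count of individuals, that empty cells mean zero, and that the header names in CLASS_COLUMNS match the file; those header names and the example rows are hypothetical and should be adjusted to the actual CSV.

```python
def to_count(cell):
    """Read a count cell; empty cells mean the species was not detected."""
    text = (cell or "").strip()
    return int(float(text)) if text else 0


def find_total_mismatches(rows, class_columns, total_column="Total N by Species"):
    """Return (species, summed, reported) for rows whose totals disagree."""
    mismatches = []
    for row in rows:
        summed = sum(to_count(row.get(col)) for col in class_columns)
        reported = to_count(row.get(total_column))
        if summed != reported:
            mismatches.append((row.get("Species"), summed, reported))
    return mismatches


# Hypothetical header names (assumption): adjust to the real column headers.
CLASS_COLUMNS = [
    "YOY1", "YOY2", "YOY3",
    "size_1", "size_2", "size_3", "size_4", "size_5", "size_6",
    "Not specified",
]

# Made-up rows for illustration only; these are not values from the dataset.
example_rows = [
    {"Species": "Goldfish", "YOY1": "4", "size_2": "10", "Total N by Species": "14"},
    {"Species": "Pumpkinseed", "size_3": "2", "Not specified": "1", "Total N by Species": "4"},
]

print(find_total_mismatches(example_rows, CLASS_COLUMNS))
```

To run the check on the real file, the rows can be read with csv.DictReader from the standard library and passed to find_total_mismatches in place of example_rows.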
File: Master_eDNA_sample_datasheet_.csv
Date of creation: February 1, 2023
The dataset contains 1 tab page. Cells containing NA ('Not Available') indicate that data were not available or not recorded.
- Group: Designation of research group that collected eDNA sample.
- Project Name: Name of project that utilized eDNA sample.
- Site Code: ID number of pond that eDNA sample was collected from.
- Sample Code: Sample ID of individual eDNA sample within a pond. Note - samples were taken in triplicate for each sample code, multiple entries exist to reflect this.
- Collection Date: Date of eDNA sample collection.
- Waterbody Name: Waterbody that eDNA sample was collected from, in this case corresponds to Pond ID.
- Arrival Time: Time at which sampling team arrived at pond for sampling activities (in 12-hour notation).
- Departure time: Time at which sampling team left pond following sampling activities (in 12-hour notation).
- Narrative Locality: Description of the eDNA sampling location, where clarification beyond Waterbody Name is needed.
- Field Crew Members: List of team members who collected eDNA sample.
- Number of Crew: Number of team members who collected eDNA sample.
- Collector: Lead team member responsible for overseeing collection of eDNA sample.
- Recorder: Team member who filled out field sheets during collection of eDNA sample.
- Sampling Start Time: Time at first eDNA sample collection for a given sampling session.
- Sampling Stop Time: Time at last eDNA sample collection for a given sampling session.
- Overnight: Indication that sampling took place overnight.
- Latitude: Location of pond eDNA sample was taken at - latitude in deg/min/sec.
- Longitude: Location of pond eDNA sample was taken at - longitude in deg/min/sec.
- Paired Site?: Indication that the site sampled was paired with another sampling location.
- Upstream Paired Site: If there was a paired site, what the upstream sampling location was, if one exists.
- Downstream Paired Site: If there was a paired site, what the downstream sampling location was, if one exists.
- Gear Type: Gear used to filter eDNA sample from water.
- Pre-filtered: Indicates if there was any pre-filtering of eDNA water sample prior to eDNA capture due to high amounts of sediment in water or debris that would make filtration impossible.
- Pore Size: Size of pores in filter chosen for eDNA sample (in µm).
- Amount Filtered: Amount of water filtered to collect eDNA sample (in mL).
- Replicate Number: If one of a triplicate sample, the ID number of the particular replicate.
- Filtering Time: The amount of time required to completely filter the sample.
- Filtering Pressure: Pressure used to filter eDNA sample (in kPa).
- Where Filtered: Location of water filtering to collect eDNA sample.
- Filtering Date: Date of water filtering to collect eDNA sample.
- Preservation Method: Method used to preserve eDNA sample before transfer to freezer.
- Date Frozen: Date at which sample was stored in freezer.
- Air Temperature: The air temperature measured above the pond.
- Water Temperature: At each pond, water temperature was measured using a YSI unit. Each row represents a unique measurement within the pond.
- Conductivity: At each pond, conductivity was measured using a YSI unit. Each row represents a unique measurement within the pond.
- pH: At each pond, pH was measured using a YSI unit. Each row represents a unique measurement within the pond.
- Dissolved Oxygen: At each pond, dissolved oxygen was measured using a YSI unit. Each row represents a unique measurement within the pond.
- Secchi Disc/Tube: The maximum depth observed using a Secchi disc or tube as a measure of turbidity for a pond.
- Turbidity: At each pond, turbidity was measured using a YSI unit. Each row represents a unique measurement within the pond.
- Chlorophyll a: The measure of chlorophyll a content within the pond.
- Other: A place to put any other relevant observations or measurements for a given pond.
- Other: A place to put any other relevant observations or measurements for a given pond.
- Other: A place to put any other relevant observations or measurements for a given pond.
- Dominant: Describes visually the majority composition of the substrate of the ponds.
- %: Describes visually the percent composition of the substrate that matches the Dominant column.
- Type: Describes the first minor composition of the substrate of the ponds.
- %: Describes visually the percent composition of the substrate that matches the previous Type column.
- Type: Describes the second minor composition of the substrate of the ponds.
- %: Describes visually the percent composition of the substrate that matches the previous Type column.
- Max Site Depth: The maximum depth of the pond if measured.
- Distance from Shore: The distance from shore that the sample was collected.
- Depth Sampled: Indicates the depth at which the eDNA sample water was collected.
- Stratification: An indication of whether stratification was present in the pond. An "n" indicates that no stratification was present, while a "y" indicates that it was present.
- Above Thermocline: An indication of whether the eDNA sample was collected above or below the thermocline. An "n" indicates that the sample was not collected above the thermocline, or that there was no thermocline, while a "y" indicates that the sample was collected above the thermocline.
- Thermocline Depth: If present, the depth at which the thermocline was observed. A "NA" indicates that no thermocline was observed.
- Habitat Classification: Category of environment sampled.
- Stream Width: The measure of how wide the stream was at its maximum.
- Max Site Depth: The maximum depth for the stream.
- Distance From Shore Sampled: The distance that the sample was taken from shore.
- Depth Sampled: The depth at which the sample was collected.
- Habitat type: Category of environment sampled.
- Stream Flow: A description of how fast the stream was flowing.
- Stream Flow Measurements: The speed at which the stream was flowing in km/h.
- Flow Relative to Typical levels: The flow of the stream compared to historic levels.
- Depth 1: The depth of the stream at the first measurement location.
- Depth 2: The depth of the stream at the second measurement location.
- Depth 3: The depth of the stream at the third measurement location.
- Depth 4: The depth of the stream at the fourth measurement location.
- Depth 5: The depth of the stream at the fifth measurement location.
- Velocity 1: The speed of the stream at the first measurement location.
- Velocity 2: The speed of the stream at the second measurement location.
- Velocity 3: The speed of the stream at the third measurement location.
- Dominant Aquatic type: The most common biological component of the ecosystem for a given sample.
- Aquatic Type 1: The most common type of living thing present within the pond if present.
- %: The approximate percentage of the biological component in regards to Aquatic Type 1.
- Aquatic Type 2: The most common type of living thing present within the pond if present.
- %: The approximate percentage of the biological component in regards to Aquatic Type 2.
- Aquatic Type 3: The most common type of living thing present within the pond if present.
- %: The approximate percentage of the biological component in regards to Aquatic Type 3.
- Aquatic Type 4: The most common type of living thing present within the pond if present.
- %: The approximate percentage of the biological component in regards to Aquatic Type 4.
- Riparian Type 1: The most common type of living thing present on the edges of the pond if present.
- %: The approximate percentage of the biological component in regards to Riparian Type 1.
- Riparian Type 2: The most common type of living thing present on the edges of the pond if present.
- %: The approximate percentage of the biological component in regards to Riparian Type 2.
- Riparian Type 3: The most common type of living thing present on the edges of the pond if present.
- %: The approximate percentage of the biological component in regards to Riparian Type 3.
- Riparian Type 4: The most common type of living thing present on the edges of the pond if present.
- %: The approximate percentage of the biological component in regards to Riparian Type 4.
- Aquatic Animals Type 1: The most common animal observed in the pond.
- Aquatic Animals Type 1: The percentage makeup of the most common animal observed in the pond.
- Aquatic Animals Type 2: The second most common animal observed in the pond.
- Aquatic Animals Type 2: The percentage makeup of the second most common animal observed in the pond.
- Aquatic Animals Type 3: The third most common animal observed in the pond.
- Aquatic Animals Type 3: The percentage makeup of the third most common animal observed in the pond.
- Floodplain Use: Category of use of the floodplain if relevant.
- Bank Slope: The approximate slope of the bank if present.
- Channel Cover: The approximate percentage of channel cover if present.
- Wind Speed: Estimate of speed of wind if present (in km/h).
- Wind Direction: Estimate of direction of wind if present.
- Current Weather Conditions: Describes weather during time of sample collection.
- Past 24h Weather Conditions: Describes weather conditions leading up to sample collection within 1 day if relevant.
- Site Photo Number: The number of the photo that corresponds to this sampling location.
- Habitat Photo File Number: The number of the photo that corresponds to this sampling location regarding habitat.
- Algae Photo File Number: The number of the photo that corresponds to this sampling location regarding Algae present.
- Water Bottom Photo File Number: The number of the photo that corresponds to this sampling location regarding the bottom of the pond.
- Substrate Photo File Number: The number of the photo that corresponds to this sampling location regarding the substrate of the pond.
- Filter Photo Number: The number of the photo that corresponds to this sampling filter.
- Final Notes: Any relevant information or comments for a given sample.
- Tube Label Completed: If sample is stored in Falcon Tubes, an indication that the tube has been labeled correctly.
- Field Sheet Completed (Y/N): Indicates that field sheet was completed and uploaded.
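Because Latitude and Longitude are recorded in degrees/minutes/seconds, most mapping and GIS tools will need them converted to decimal degrees first. The sketch below shows the arithmetic only. It assumes the three components and the hemisphere letter have already been separated from the cell text, since the exact text layout of the cells is not described here; the function name is hypothetical.

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; south and west are negative.
    """
    hemi = hemisphere.strip().upper()
    if hemi not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")
    if not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError("minutes and seconds must be in the range [0, 60)")
    value = abs(degrees) + minutes / 60 + seconds / 3600
    return -value if hemi in ("S", "W") else value


# Illustrative coordinates only (not taken from the datasheet).
print(dms_to_decimal(43, 30, 0, "N"))
print(dms_to_decimal(79, 45, 0, "W"))
```

Sites in Ontario lie in the northern and western hemispheres, so converted latitudes should be positive and longitudes negative.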
File: Seine.Electrofishing_Raw_Data.csv
Date of creation: Apr 15, 2024
The dataset contains 1 tab page. Empty cells indicate that a species was not detected at a specific size class within a pond.
Variables
- Pond: Unique identifier for each pond that was surveyed. These IDs are used across all datasets.
- Drained: Blank cells indicate that pond was not drained. An 'x' indicates that the pond was drained.
- Date: Date at which the pond had been surveyed using the conventional methods (electrofishing or seining).
- Cell: Indicates where the fish were collected, if the pond had multiple cells.
- Gear: Indicates the method of conventional survey used to collect the fish present in that row.
- LNG: Short for 'longnose gar'. Number of longnose gar collected from the respective pond/gear choice.
- WS: Short for 'white sucker'. Number of white sucker collected from the respective pond/gear choice.
- RHS: Short for 'redhorse sucker'. Number of redhorse sucker collected from the respective pond/gear choice.
- GF: Short for 'goldfish'. Number of goldfish collected from the respective pond/gear choice.
- CC: Short for 'common carp'. Number of common carp collected from the respective pond/gear choice.
- RCH: Short for 'river chub'. Number of river chub collected from the respective pond/gear choice.
- GSH: Short for 'golden shiner'. Number of golden shiner collected from the respective pond/gear choice.
- CSH: Short for 'common shiner'. Number of common shiner collected from the respective pond/gear choice.
- BNM: Short for 'bluntnose minnow'. Number of bluntnose minnow collected from the respective pond/gear choice.
- FHM: Short for 'fathead minnow'. Number of fathead minnow collected from the respective pond/gear choice.
- BNS: Short for 'blacknose shiner'. Number of blacknose shiner collected from the respective pond/gear choice.
- CCH: Short for 'creek chub'. Number of creek chub collected from the respective pond/gear choice.
- BB: Short for 'brown bullhead'. Number of brown bullhead collected from the respective pond/gear choice.
- CHC: Short for 'channel catfish'. Number of channel catfish collected from the respective pond/gear choice.
- BST: Short for 'brook stickleback'. Number of brook stickleback collected from the respective pond/gear choice.
- WP: Short for 'white perch'. Number of white perch collected from the respective pond/gear choice.
- GRS: Short for 'green sunfish'. Number of green sunfish collected from the respective pond/gear choice.
- PS: Short for 'pumpkinseed'. Number of pumpkinseed collected from the respective pond/gear choice.
- BG: Short for 'bluegill'. Number of bluegill collected from the respective pond/gear choice.
- LMB: Short for 'largemouth bass'. Number of largemouth bass collected from the respective pond/gear choice.
- BC: Short for 'black crappie'. Number of black crappie collected from the respective pond/gear choice.
- PSxGRSxBG: Short for a hybrid of 'pumpkinseed', 'green sunfish' and 'bluegill'. Number of these hybrids collected from the respective pond/gear choice.
- YP: Short for 'yellow perch'. Number of yellow perch collected from the respective pond/gear choice.
- Total: Total number of fish collected from the respective pond/gear choice.
- NSP: Number of species present in the pond/gear choice. An asterisk (*) indicates that the next column contains a comment relevant to the total number of species found.
- Comments: Any relevant information regarding the collection or species from the ponds.
- Common name: Lookup key giving the full common name for each acronym used in the column headers.
- Acronym: Lookup key giving the acronym used in the column headers for each common name.
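The Total and NSP columns can be recomputed from the per-species columns as a consistency check. This is a minimal sketch under stated assumptions: the species headers match the acronyms listed above exactly, empty cells count as zero, and NSP may carry a trailing asterisk. Whether the hybrid column (PSxGRSxBG) counts toward NSP is not stated, so a mismatch is a prompt to read the Comments column rather than proof of an error.

```python
SPECIES_COLUMNS = [
    "LNG", "WS", "RHS", "GF", "CC", "RCH", "GSH", "CSH", "BNM", "FHM", "BNS",
    "CCH", "BB", "CHC", "BST", "WP", "GRS", "PS", "BG", "LMB", "BC",
    "PSxGRSxBG", "YP",
]


def to_count(cell):
    """Read a count cell; empty cells mean the species was not detected."""
    text = (cell or "").replace("*", "").strip()
    return int(float(text)) if text else 0


def summarize_catch(row):
    """Return (total individuals, number of species columns with a catch)."""
    counts = [to_count(row.get(col)) for col in SPECIES_COLUMNS]
    return sum(counts), sum(1 for n in counts if n > 0)


def check_row(row):
    """Compare recomputed values with the reported Total and NSP columns."""
    total, nsp = summarize_catch(row)
    return {
        "total_matches": total == to_count(row.get("Total")),
        "nsp_matches": nsp == to_count(row.get("NSP")),
        "has_comment_flag": "*" in (row.get("NSP") or ""),
    }


# Made-up row for illustration only; not a record from the dataset.
example_row = {"Pond": "104", "Gear": "Seine", "GF": "12", "PS": "3",
               "Total": "15", "NSP": "2*"}
print(check_row(example_row))
```

Rows read with csv.DictReader can be passed to check_row directly, provided the header names agree with SPECIES_COLUMNS.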
File: R_2023_09_26_15_33_31_user_GSS5-0202-55-400bp_850_GF_6_13_12MMito_Sept25_2023.530.tar.bz2
Description: Output of the sequencing of the M-Mito primer amplification products on a Thermofisher Ion Torrent at the University of Windsor Environmental Genomics Facility. Produced September 25, 2023.
File: R_2023_09_26_09_52_47_user_GSS5-0202-54-400bp_850_GF_5_9_12PS1_Sept25_2023.530.tar.bz2
Description: Output of the sequencing of the PS1 primer amplification products on a Thermofisher Ion Torrent at the University of Windsor Environmental Genomics Facility. Produced September 25, 2023.
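Each sequencing archive is roughly 3 GB, so it can be useful to inspect its contents before unpacking. The following sketch uses Python's standard tarfile module to list the first few member names of a bzip2-compressed tar archive; the function name and the limit parameter are illustrative choices, and the internal layout of the archives is not described here.

```python
import tarfile


def list_archive_members(archive_path, limit=20):
    """Return up to `limit` member names from a bzip2-compressed tar archive."""
    names = []
    with tarfile.open(archive_path, mode="r:bz2") as archive:
        # Iterating the archive reads members one at a time, so listing
        # can stop early without extracting anything to disk.
        for member in archive:
            names.append(member.name)
            if len(names) >= limit:
                break
    return names


# Example call (path assumed to point at a downloaded archive):
# for name in list_archive_members(
#     "R_2023_09_26_09_52_47_user_GSS5-0202-54-400bp_850_GF_5_9_12PS1_Sept25_2023.530.tar.bz2"
# ):
#     print(name)
```

Stopping after a fixed number of members avoids decompressing the whole archive just to see how it is organized.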
File: MetabarcodingMasterDataFileReadCounts.csv
Date of creation: Feb 22, 2024
The dataset contains 1 tab page. Cells containing NA ('Not Available') indicate that data were originally available for a species but were removed because of updated identification methods.
Variables
- Pond: Location that species was detected. Pond ID follows this format - AA_B.CC - AA refers to the designated pond number, B refers to the month of sample collection, and CC refers to the last two digits of the year of the sample collection.
- Site: The unique sampling location within a pond that the sample was collected at.
- Sample: A unique ID for each detection per pond by species and DNA marker.
- Plate#: ID number of the plate that the sample was run on.
- Location.Plate: Number of the well that the sample was run on.
- Lab.Input: Volume (in mL) used for the DNA amplification, as determined through the gel electrophoresis method.
- Species: The species being detected.
- Meta.Hits: The number of raw reads detected after metabarcoding.
- Standard.Meta.Hit: The corrected number of reads after considering the initial lab input volume.
- Marker: The ID of the marker used for a particular read count, either Mito or PS1, as well as the data file that it was read on, either 1 or 2.
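The Pond ID format described above (AA_B.CC) can be split into its parts for filtering by pond, month, or year. This is a minimal sketch that assumes the pond number contains no underscore, the month is written as one or two digits, and two-digit years fall in the 2000s (consistent with the 2021 - 2023 collection period). The example ID is hypothetical.

```python
import re

# AA_B.CC: pond number, underscore, month, dot, two-digit year.
POND_ID_PATTERN = re.compile(r"^(?P<pond>[^_]+)_(?P<month>\d{1,2})\.(?P<year>\d{2})$")


def parse_pond_id(pond_id):
    """Split a Pond ID such as '104_7.21' into pond, month, and year."""
    match = POND_ID_PATTERN.match(pond_id.strip())
    if match is None:
        raise ValueError(f"unrecognized Pond ID format: {pond_id!r}")
    month = int(match["month"])
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in Pond ID: {pond_id!r}")
    return {
        "pond": match["pond"],
        "month": month,
        "year": 2000 + int(match["year"]),  # assumption: all years are 20CC
    }


# Hypothetical ID for illustration; not necessarily present in the file.
print(parse_pond_id("104_7.21"))
```

Raising an error on unexpected formats, rather than guessing, makes it easier to spot IDs that a spreadsheet program may have altered on import.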
Code/software
- .tar.bz2 files: QIIME 2
- .csv files: Microsoft Excel
Metabarcoding data: The GenCatch Gel Extraction Kit and the manufacturer's protocol were used to prepare samples for sequencing on the Thermofisher Ion Torrent at the University of Windsor. The data included are the raw output from this instrument, before any of the data cleaning stages described in the manuscript.
eDNA samples: Collected from stormwater drainage ponds in and near Hamilton, ON, in 2021 and 2022. Water was filtered on-site through a glass fiber filter, and the filters were desiccated with indicating silica beads. Samples were transported to Windsor and kept at -20 °C until processed. DNA was extracted in a sterile eDNA room at GLIER near the University of Windsor using DNeasy kits, following the provided protocol. DNA was amplified using the provided PCR conditions, cleaned using the Sera-Mag bead protocol, and subjected to a final round of PCR using the provided conditions. DNA was imaged on a 2% agarose gel to confirm successful amplification. The data include location, start and end times, number of samples taken and their IDs, basic environmental conditions, and the names of the people involved in collection.
Seine/Electrofishing data: Conventional surveys were performed by DFO employees, and the type and number of species detected in each survey were recorded.
Drained data: Ponds were drained by DFO employees, and all fish were collected once each pond was fully drained. The type and number of species were recorded during this cleanout.
