Data from: Large-scale eDNA sampling and hierarchical modeling elucidates the importance of stream habitat for eastern hellbender (Cryptobranchus a. alleganiensis) occupancy and eDNA detection
Data files
Sep 09, 2025 version files 259.76 KB
-
covariates_sample_80sites.csv
7.72 KB
-
covariates_sample_90sites.csv
10.68 KB
-
covariates_sample_means.csv
4.92 KB
-
covariates_sample.csv
5.38 KB
-
covariates_site_landscape_90sites.csv
6.85 KB
-
covariates_site_no_nas.csv
2.50 KB
-
eDNAocc_45sites.csv
9.04 KB
-
eDNAocc_fulldata__90sites.csv
20.44 KB
-
eDNAocc_fulldata_80sites.csv
17.89 KB
-
Habitat_SampleInfo_p.csv
10.37 KB
-
Habitat_SampleInfo_Theta.csv
10.37 KB
-
Habitat_SiteInfo_Psi.csv
8.60 KB
-
Landscape_SampleInfo_p.csv
7.26 KB
-
Landscape_SampleInfo_Theta.csv
7.27 KB
-
Landscape_SiteInfo_Psi.csv
9.83 KB
-
LOD_LOQ_Cq_Standards_Sample1.csv
16.34 KB
-
LOD_LOQ_Cq_Standards_Sample2.csv
16.04 KB
-
par_estimate_habitat.csv
1.68 KB
-
par_estimate_landscape.csv
1.62 KB
-
par_estimate_waterchem.csv
1.39 KB
-
README.md
38.38 KB
-
Waterchem_SampleInfo_p.csv
18.49 KB
-
Waterchem_SampleInfo_Theta.csv
18.38 KB
-
Waterchem_SiteInfo_Psi.csv
8.33 KB
Abstract
Accurate detection data are imperative to assess distributions and habitat associations for species of conservation need. Environmental DNA (eDNA) sampling is an effective tool to obtain detection data across large geographic scales; however, most eDNA studies do not account for environmental variation that could influence detection. Hierarchical modeling can be used to identify factors important to species occurrence while accounting for such factors. Local extirpations and significant population declines have been documented across the range of the Eastern Hellbender (Cryptobranchus alleganiensis alleganiensis) due to water quality and habitat degradation, but a paucity of information on the current distribution and status of hellbenders remains for certain regions. We conducted a state-wide eDNA survey to 1) investigate the current distribution of hellbenders in Kentucky, a state that lacks extensive hellbender occurrence information, 2) evaluate habitat associations for hellbenders in this region, and 3) identify environmental factors that influence eDNA detection. Environmental DNA was collected from 90 sites state-wide, 27 of which had historic records. We ran multiscale Bayesian occupancy models to determine occupancy and detection probabilities at each site, and to identify water chemistry, local habitat, and landscape factors associated with hellbender occupancy and eDNA detection. Hellbender eDNA was detected at 22 sites total and at 12 (44%) historic locations. We found that total organic carbon in the stream significantly hindered eDNA detection and that local habitat quality was more important for hellbender presence than water chemistry or upstream catchment land cover. Hellbender occupancy was positively associated with the percent cobble, gravel, and bedrock in the streambed and stream order, and negatively associated with the percent fine sediment in the streambed. Our results indicate that hellbender populations have significantly declined in Kentucky, and the quality of available stream substrate is critical for hellbender presence. This study demonstrates that by applying hierarchical modeling to large-scale eDNA sampling, we were able to make robust inferences into factors associated with hellbender occurrence and eDNA detection.
https://doi.org/10.5061/dryad.9w0vt4bs2
Description of the data and file structure
This dataset contains the data and R code required to replicate analyses in Tomke and Price "Large-scale eDNA sampling and hierarchical modeling elucidates the importance of stream habitat for Eastern Hellbender (Cryptobranchus a. alleganiensis) occupancy and eDNA detection", which uses state-wide eDNA data to assess the distribution of the Eastern Hellbender salamander and hierarchical Bayesian occupancy modeling to identify water chemistry, habitat, and landscape-level covariates associated with hellbender occupancy and eDNA detection. Environmental DNA detections were derived from spatial and temporal eDNA replicates collected from 90 sites across Kentucky, USA during 2019-2021. Water chemistry and hydrology information were collected during each site visit, including seasonal sampling periods. A stream habitat assessment was conducted at each site during the Breeding sampling period in September. Landscape-level covariates including landcover composition within the upstream catchment of each site, stream order, and ecoregion were obtained by using ArcGIS. eDNA samples were filtered in the lab <10 hours after being collected from the field and frozen at -20C (without buffer) until DNA extraction was performed. Qiagen DNeasy Blood & Tissue kits with the addition of a Qiagen Qiashredder spin column were used to extraction DNA from filters (Spear et al. 2015). Samples were analyzed by using 9 qPCR replicates per sample. eDNA concentration was estimated within QuantStudio Design and Analysis Software based on 10x-fold dilution standard curves. The Limit of Detection (LOD) and Limit of Quantification (LOQ) was calculated using the R package drc (Klymus et al. 2020). We estimated the probability of hellbender occupancy, eDNA detection within a field sample, eDNA detection within a qPCR relicate, and identified covariates associated with each model parameter by using single season, single species, multi-scale Bayesian occupancy models in the msocc package in R (Stratton et al. 2020). Post-hoc analyses to investigate the effect of sampling effort at each level (field collection and PCR) on the precision of each model parameter was conducted using the RShiny App available at https://christianstratton.shinyapps.io/eDNAapp/ (Stratton et al. 2020). False positive detection rates were calculated by using the RShiny App eDNA 1.0 available at https://seak.shinyapps.io/eDNA/ (Diana et al. 2021).
R code for calculating LOD and LOQ is available in the file 'Dryad R Code*_*LOD_LOQ_Calculations.R'.
R code for all occupancy models and specific directions for using the RShiny apps to conduct post-hoc precision analyses and assess false positive detection rates is available in the file 'Dryad R Code_msocc Occupancy Modeling.R'
Files and variables
File: covariates_sample.csv
Description: Sample-level covariates for the Water Chemistry occupancy model. The file includes water chemistry and hydrology data for 45 sites that were sampled three times during the year.
Variables
- site: Abbreviation for 45 sites sampled for hellbender eDNA
- sample: '1', '2', and '3' representing temporal sampling periods. 1 = Breeding season, 2 = Hatching season, 3 = Pre-breeding season
- conductivity: Conductivity of water sample (umohs/L)
- toc: Level of Total Organic Carbon in water sample (mg/L C)
- turb: Turbidity of water sample (NTU)
- ph: pH of water (H+)
- temp: Surface temperature of water (C)
- flow: Flow velocity of water (m/s) based on a standardized flow method
- date: Julian date when sample was collected
File: covariates_sample_means.csv
Description: Sample-level covariates for the Water Chemistry occupancy model. The file includes water chemistry and hydrology data for 45 sites that were sampled three times during the year; mean and median values are reported.
Variables
- site: Abbreviation for 45 sites sampled for hellbender eDNA
- cond_mean: Mean conductivity of water sample (umohs/L) across three temporal sampling periods
- cond_median: Median conductivity of water sample (umohs/L) across three temporal sampling periods
- toc_mean: Mean level of Total Organic Carbon in water sample (mg/L C) across three temporal sampling periods
- toc_median: Median level of Total Organic Carbon in water sample (mg/L C) across three temporal sampling periods
- turb_mean: Mean turbidity of water sample (NTU) across three temporal sampling periods
- turb_median: Median turbidity of water sample (NTU) across three temporal sampling periods
- ph_mean: Mean pH of water (H+) across three temporal sampling periods
- ph_median: Median pH of water (H+) across three temporal sampling periods
- temp_mean: Mean surface temperature of water (C) across three temporal sampling periods
- temp_median: Median surface temperature of water (C) across three temporal sampling periods
- flow_mean: Mean flow velocity of water (m/s) across three temporal sampling periods
- flow_median: Median flow velocity of water (m/s) across three temporal sampling periods
- date_mean: Mean julian date when sample was collected across three temporal sampling periods
- date_median: Meidan julian date when sample was collected across three temporal sampling periods
File: eDNAocc_45sites.csv
Description: eDNA detection history at 45 sites that were sampled three times throughout the year. Three spatial eDNA replicate samples were collected at 33 of these sites. All eDNA samples were analyzed with 9 qPCR replicates and detection results for the three spatial replicates were concatonated into a single row. File used in the Water Chemistry occupancy model.
Variables
- site: Abbreviation for 45 sites sampled for hellbender eDNA
- sample: '1', '2', and '3' representing temporal sampling periods. 1 = Breeding season, 2 = Hatching season, 3 = Pre-breeding season
- pcr1.1: eDNA result for qPCR replicate 1 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.2: eDNA result for qPCR replicate 2 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.3: eDNA result for qPCR replicate 3 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.4: eDNA result for qPCR replicate 4 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.5: eDNA result for qPCR replicate 5 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.6: eDNA result for qPCR replicate 6 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.7: eDNA result for qPCR replicate 7 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.8: eDNA result for qPCR replicate 8 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.9: eDNA result for qPCR replicate 9 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.1: eDNA result for qPCR replicate 1 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.2: eDNA result for qPCR replicate 2 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.3: eDNA result for qPCR replicate 3 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.4: eDNA result for qPCR replicate 4 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.5: eDNA result for qPCR replicate 5 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.6: eDNA result for qPCR replicate 6 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.7: eDNA result for qPCR replicate 7 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.8: eDNA result for qPCR replicate 8 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.9: eDNA result for qPCR replicate 9 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.1: eDNA result for qPCR replicate 1 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.2: eDNA result for qPCR replicate 2 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.3: eDNA result for qPCR replicate 3 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.4: eDNA result for qPCR replicate 4 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.5: eDNA result for qPCR replicate 5 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.6: eDNA result for qPCR replicate 6 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.7: eDNA result for qPCR replicate 7 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.8: eDNA result for qPCR replicate 8 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.9: eDNA result for qPCR replicate 9 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
File: Waterchem_SiteInfo_Psi.csv
Description: Median and mean posterior probabilities of psi, the probability of hellbender occupancy at each site, with 95% credible intervals. File used to graph posterior distributions of covariates at each site.
Variables
- site: Abbreviation for 45 sites sampled for hellbender eDNA
- median: Median posterior probability of psi
- mean: Mean posterior probability of psi
- 0.025: 2.5% quantile around the mean posterior probability of psi
- 0.975: 97.5% quantile around the mean posterior probability of psi
- cond_mean: Mean conductivity of water sample (umohs/L) across three temporal sampling periods
- cond_median: Median conductivity of water sample (umohs/L) across three temporal sampling periods
- toc_mean: Mean level of Total Organic Carbon in water sample (mg/L C) across three temporal sampling periods
- toc_median: Median level of Total Organic Carbon in water sample (mg/L C) across three temporal sampling periods
- turb_mean: Mean turbidity of water sample (NTU) across three temporal sampling periods
- turb_median: Median turbidity of water sample (NTU) across three temporal sampling periods
- ph_mean: Mean pH of water (H+) across three temporal sampling periods
- ph_median: Median pH of water (H+) across three temporal sampling periods
- temp_mean: Mean surface temperature of water (C) across three temporal sampling periods
- temp_median: Median surface temperature of water (C) across three temporal sampling periods
- flow_mean: Mean flow velocity of water (m/s) across three temporal sampling periods
- flow_median: Median flow velocity of water (m/s) across three temporal sampling periods
- date_mean: Mean julian date when sample was collected across three temporal sampling periods
- date_median: Meidan julian date when sample was collected across three temporal sampling periods
File: Waterchem_SampleInfo_p.csv
Description: Median and mean posterior probabilities of p, the probability of detecting hellbender eDNA within a qPCR replicate, with 95% credible intervals. File used to graph posterior distributions of covariates at each site.
Variables
- id: Sample ID composed of site abbreviation and temporal sampling period
- site.x: Abbreviation for 45 sites sampled for hellbender eDNA
- sample.x: '1', '2', and '3' representing temporal sampling periods. 1 = Breeding season, 2 = Hatching season, 3 = Pre-breeding season
- rep: Not informative - all 1's
- median: Median posterior probability of p
- mean: Mean posterior probability of p
- 0.025: 2.5% quantile around the mean posterior probability of p
- 0.975: 97.5% quantile around the mean posterior probability of p
- site.y: Abbreviation for 45 sites sampled for hellbender eDNA
- sample.y: '1', '2', and '3' representing temporal sampling periods. 1 = Breeding season, 2 = Hatching season, 3 = Pre-breeding season
- conductivity: Conductivity of water sample (umohs/L)
- toc: Level of Total Organic Carbon in water sample (mg/L C)
- turb: Turbidity of water sample (NTU)
- ph: pH of water (H+)
- temp: Surface temperature of water (C)
- flow: Flow velocity of water (m/s) based on a standardized flow method
- date: Julian date when sample was collected
File: Waterchem_SampleInfo_Theta.csv
Description: Median and mean posterior probabilities of theta, the probability of detecting hellbender eDNA within an environmental sample, with 95% credible intervals. File used to graph posterior distributions of covariates at each site.
Variables
- id: Sample ID composed of site abbreviation and temporal sampling period
- site.x: Abbreviation for 45 sites sampled for hellbender eDNA
- sample.x: '1', '2', and '3' representing temporal sampling periods. 1 = Breeding season, 2 = Hatching season, 3 = Pre-breeding season
- rep: Not informative - all 1's
- median: Median posterior probability of theta
- mean: Mean posterior probability of theta
- 0.025: 2.5% quantile around the mean posterior probability of theta
- 0.975: 97.5% quantile around the mean posterior probability of theta
- site.y: Abbreviation for 45 sites sampled for hellbender eDNA
- sample.y: '1', '2', and '3' representing temporal sampling periods. 1 = Breeding season, 2 = Hatching season, 3 = Pre-breeding season
- conductivity: Conductivity of water sample (umohs/L)
- toc: Level of Total Organic Carbon in water sample (mg/L C)
- turb: Turbidity of water sample (NTU)
- ph: pH of water (H+)
- temp: Surface temperature of water (C)
- flow: Flow velocity of water (m/s) based on a standardized flow method
- date: Julian date when sample was collected
File: par_estimate_waterchem.csv
Description: Mean posterior probabilities and quantiles of the posterior distribution for covariates associated with psi, the probability that a site is occupied by a hellbender; theta, the probability of detecting eDNA within an environmental sample; and p, the probability of detecting eDNA within a qPCR replicate in the Water Chemistry occupancy model. File used to graph the posterior distributions.
Variables
- covariate: Water chemistry covariate included in the model
- mean: Mean posterior probability
- q2.5: 2.5% quantile around the mean posterior probability
- q7.5: 7.5% quantile around the mean posterior probability
- q10: 10% quantile around the mean posterior probability
- q25: 25% quantile around the mean posterior probability
- q50: 50% quantile around the mean posterior probability
- q75: 75% quantile around the mean posterior probability
- q90: 90% quantile around the mean posterior probability
- q92.5: 92.5% quantile around the mean posterior probability
- q97.5: 97.5% quantile around the mean posterior probability
- parameter: Model parameter: psi, theta, or p
File: eDNAocc_fulldata_80sites.csv
Description: eDNA detection history at 80 sites. Temporal eDNA replicates were collected from 45 sites. Three spatial eDNA replicate samples were collected at 34 sites. All eDNA samples were analyzed with 9 qPCR replicates and detection results for the three spatial replicates were concatonated into a single row. File used in the Habitat occupancy model.
Variables
- site: Abbreviation for 80 sites sampled for hellbender eDNA
- sample: '1', '2', and '3' representing temporal sampling periods. 1 = Breeding season, 2 = Hatching season, 3 = Pre-breeding season
- pcr1.1: eDNA result for qPCR replicate 1 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.2: eDNA result for qPCR replicate 2 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.3: eDNA result for qPCR replicate 3 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.4: eDNA result for qPCR replicate 4 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.5: eDNA result for qPCR replicate 5 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.6: eDNA result for qPCR replicate 6 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.7: eDNA result for qPCR replicate 7 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.8: eDNA result for qPCR replicate 8 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.9: eDNA result for qPCR replicate 9 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.1: eDNA result for qPCR replicate 1 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.2: eDNA result for qPCR replicate 2 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.3: eDNA result for qPCR replicate 3 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.4: eDNA result for qPCR replicate 4 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.5: eDNA result for qPCR replicate 5 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.6: eDNA result for qPCR replicate 6 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.7: eDNA result for qPCR replicate 7 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.8: eDNA result for qPCR replicate 8 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.9: eDNA result for qPCR replicate 9 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.1: eDNA result for qPCR replicate 1 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.2: eDNA result for qPCR replicate 2 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.3: eDNA result for qPCR replicate 3 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.4: eDNA result for qPCR replicate 4 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.5: eDNA result for qPCR replicate 5 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.6: eDNA result for qPCR replicate 6 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.7: eDNA result for qPCR replicate 7 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.8: eDNA result for qPCR replicate 8 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.9: eDNA result for qPCR replicate 9 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
File: covariates_site_no_nas.csv
Description: Site-level covariates for the Habitat occupancy model. The file includes habitat data derived from a stream habitat assessment conducted at each site in September.
Variables
- site: Abbreviation for 80 sites sampled for hellbender eDNA
- perc_trees: The proportion of riparian zone with trees as cover type
- perc_bareground: The proportion of riparian zone with bareground
- perc_bedrock: The proportion of stream substrate composed of bedrock
- perc_boulder: The proportion of stream substrate composed of boulders (<250 mm)
- perc_cobble: The proportion of stream substrate composed of cobble (60-250 mm)
- perc_gravel: The proportion of stream substrate composed of gravel (2-60 mm)
- perc_sand: The proportion of stream substrate composed of sand (0.06-2 mm)
- perc_fine: The proportion of stream substrate composed of fine sediment (<0.06 mm)
- ripar_quant: Scored 0-10; extent of natural plant community from stream channel
- nutrientload: Scored 0-10; presence of algal growth
File: covariates_sample_80sites.csv
Description: Sample-level covariates for the Habitat occupancy model. The file includes habitat data derived from a stream habitat assessment conducted at each site in September.
Variables
- site: Abbreviation for 80 sites sampled for hellbender eDNA
- perc_trees: The proportion of riparian zone with trees as cover type
- perc_bareground: The proportion of riparian zone with bareground
- perc_bedrock: The proportion of stream substrate composed of bedrock
- perc_boulder: The proportion of stream substrate composed of boulders (<250 mm)
- perc_cobble: The proportion of stream substrate composed of cobble (60-250 mm)
- perc_gravel: The proportion of stream substrate composed of gravel (2-60 mm)
- perc_sand: The proportion of stream substrate composed of sand (0.06-2 mm)
- perc_fine: The proportion of stream substrate composed of fine sediment (<0.06 mm)
- ripar_quant: Scored 0-10; extent of natural plant community from stream channel
- nutrientload: Scored 0-10; presence of algal growth
File: Habitat_SampleInfo_p.csv
Description: Median and mean posterior probabilities of p, the probability of detecting hellbender eDNA within a qPCR replicate, with 95% credible intervals. File used to graph posterior distributions of covariates at each site.
Variables
- id: Sample ID composed of site abbreviation and temporal sampling period
- site.x: Abbreviation for 80 sites sampled for hellbender eDNA
- sample.x: Not informative - all 1's
- rep: Not informative - all 1's
- median: Median posterior probability of p
- mean: Mean posterior probability of p
- 0.025: 2.5% quantile around the mean posterior probability of p
- 0.975: 97.5% quantile around the mean posterior probability of p
- site.y: Abbreviation for 80 sites sampled for hellbender eDNA
- sample.y: Not informative - all 1's
- perc_trees: The proportion of riparian zone with trees as cover type
- perc_bareground: The proportion of riparian zone with bareground
- perc_bedrock: The proportion of stream substrate composed of bedrock
- perc_boulder: The proportion of stream substrate composed of boulders (<250 mm)
- perc_cobble: The proportion of stream substrate composed of cobble (60-250 mm)
- perc_gravel: The proportion of stream substrate composed of gravel (2-60 mm)
- perc_sand: The proportion of stream substrate composed of sand (0.06-2 mm)
- perc_fine: The proportion of stream substrate composed of fine sediment (<0.06 mm)
- ripar_quant: Scored 0-10; extent of natural plant community from stream channel
- nutrientload: Scored 0-10; presence of algal growth
File: Habitat_SampleInfo_Theta.csv
Description: Median and mean posterior probabilities of theta, the probability of detecting hellbender eDNA within an environmental sample, with 95% credible intervals. File used to graph posterior distributions of covariates at each site.
Variables
- id: Sample ID composed of site abbreviation and temporal sampling period
- site.x: Abbreviation for 80 sites sampled for hellbender eDNA
- sample.x: Not informative - all 1's
- rep: Not informative - all 1's
- median: Median posterior probability of p
- mean: Mean posterior probability of p
- 0.025: 2.5% quantile around the mean posterior probability of p
- 0.975: 97.5% quantile around the mean posterior probability of p
- site.y: Abbreviation for 80 sites sampled for hellbender eDNA
- sample.y: Not informative - all 1's
- perc_trees: The proportion of riparian zone with trees as cover type
- perc_bareground: The proportion of riparian zone with bareground
- perc_bedrock: The proportion of stream substrate composed of bedrock
- perc_boulder: The proportion of stream substrate composed of boulders (<250 mm)
- perc_cobble: The proportion of stream substrate composed of cobble (60-250 mm)
- perc_gravel: The proportion of stream substrate composed of gravel (2-60 mm)
- perc_sand: The proportion of stream substrate composed of sand (0.06-2 mm)
- perc_fine: The proportion of stream substrate composed of fine sediment (<0.06 mm)
- ripar_quant: Scored 0-10; extent of natural plant community from stream channel
- nutrientload: Scored 0-10; presence of algal growth
File: Habitat_SiteInfo_Psi.csv
Description: Median and mean posterior probabilities of psi, the probability of hellbender occupancy at each site, with 95% credible intervals. File used to graph posterior distributions of covariates at each site.
Variables
- site: Abbreviation for 80 sites sampled for hellbender eDNA
- median: Median posterior probability of psi
- mean: Mean posterior probability of psi
- 0.025: 2.5% quantile around the mean posterior probability of psi
- 0.975: 97.5% quantile around the mean posterior probability of psi
- perc_trees: The proportion of riparian zone with trees as cover type
- perc_bareground: The proportion of riparian zone with bareground
- perc_bedrock: The proportion of stream substrate composed of bedrock
- perc_boulder: The proportion of stream substrate composed of boulders (<250 mm)
- perc_cobble: The proportion of stream substrate composed of cobble (60-250 mm)
- perc_gravel: The proportion of stream substrate composed of gravel (2-60 mm)
- perc_sand: The proportion of stream substrate composed of sand (0.06-2 mm)
- perc_fine: The proportion of stream substrate composed of fine sediment (<0.06 mm)
- ripar_quant: Scored 0-10; extent of natural plant community from stream channel
- nutrientload: Scored 0-10; presence of algal growth
File: par_estimate_habitat.csv
Description: Mean posterior probabilities and quantiles of the posterior distribution for covariates associated with psi, the probability that a site is occupied by a hellbender; theta, the probability of detecting eDNA within an environmental sample; and p, the probability of detecting eDNA within a qPCR replicate from the Habitat occupancy model. File used to graph the posterior distributions.
Variables
- covariate: Habitat covariate included in the model
- mean: Mean posterior probability
- q2.5: 2.5% quantile around the mean posterior probability
- q7.5: 7.5% quantile around the mean posterior probability
- q10: 10% quantile around the mean posterior probability
- q25: 25% quantile around the mean posterior probability
- q50: 50% quantile around the mean posterior probability
- q75: 75% quantile around the mean posterior probability
- q90: 90% quantile around the mean posterior probability
- q92.5: 92.5% quantile around the mean posterior probability
- q97.5: 97.5% quantile around the mean posterior probability
- parameter: Model parameter: psi, theta, or p
File: covariates_sample_90sites.csv
Description: Sample-level covariates for the Landscape occupancy model.
Variables
- site: Abbreviation for 90 sites sampled for hellbender eDNA
- sample: '1', '2', and '3' representing temporal sampling periods. 1 = Breeding season, 2 = Hatching season, 3 = Pre-breeding season
- conductivity: Conductivity of water sample (umohs/L)
- toc: Level of Total Organic Carbon in water sample (mg/L C)
- turb: Turbidity of water sample (NTU)
- ph: pH of water (H+)
- temp: Surface temperature of water (C)
- flow: Flow velocity of water (m/s) based on a standardized flow method
- volume: The volume of water (L) collected at each site
- stream_order: Strahler stream order
File: covariates_site_landscape_90sites.csv
Description: Site-level covariates for the Landscape occupancy model. Ecoregion, stream order, and landcover composition was obtained using ArcGIS.
Variables
- site: Abbreviation for 90 sites sampled for hellbender eDNA
- ecoregion: Level III ecoregion
- prop_water: The proportion of the upstream catchment (200 m X 20 km) in water
- prop_devel: The proportion of the upstream catchment (200 m X 20 km) in developed land
- prop_forest: The proportion of the upstream catchment (200 m X 20 km) in forested land
- prop_herb: The proportion of the upstream catchment (200 m X 20 km) in herbaceous land
- prop_agric: The proportion of the upstream catchment (200 m X 20 km) in agriculture
- watershed_area: Total area of the upstream catchment (200 m X 20 km), which included all tributaries upstream of the eDNA sampling location
- stream_order: Strahler stream order
- volume: The volume of water (L) collected at each site
File: eDNAocc_fulldata__90sites.csv
Description: eDNA detection history at 90 sites. Temporal eDNA replicates were collected from 45 sites. Three spatial eDNA replicate samples were collected at 34 sites. All eDNA samples were analyzed with 9 qPCR replicates and detection results for the three spatial replicates were concatonated into a single row. File used in the Landscape occupancy model.
Variables
- site: Abbreviation for 80 sites sampled for hellbender eDNA
- sample: '1', '2', and '3' representing temporal sampling periods. 1 = Breeding season, 2 = Hatching season, 3 = Pre-breeding season
- pcr1.1: eDNA result for qPCR replicate 1 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.2: eDNA result for qPCR replicate 2 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.3: eDNA result for qPCR replicate 3 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.4: eDNA result for qPCR replicate 4 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.5: eDNA result for qPCR replicate 5 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.6: eDNA result for qPCR replicate 6 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.7: eDNA result for qPCR replicate 7 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.8: eDNA result for qPCR replicate 8 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr1.9: eDNA result for qPCR replicate 9 from spatial replicate 1. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.1: eDNA result for qPCR replicate 1 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.2: eDNA result for qPCR replicate 2 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.3: eDNA result for qPCR replicate 3 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.4: eDNA result for qPCR replicate 4 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.5: eDNA result for qPCR replicate 5 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.6: eDNA result for qPCR replicate 6 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.7: eDNA result for qPCR replicate 7 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.8: eDNA result for qPCR replicate 8 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr2.9: eDNA result for qPCR replicate 9 from spatial replicate 2. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.1: eDNA result for qPCR replicate 1 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.2: eDNA result for qPCR replicate 2 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.3: eDNA result for qPCR replicate 3 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.4: eDNA result for qPCR replicate 4 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.5: eDNA result for qPCR replicate 5 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.6: eDNA result for qPCR replicate 6 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.7: eDNA result for qPCR replicate 7 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.8: eDNA result for qPCR replicate 8 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
- pcr3.9: eDNA result for qPCR replicate 9 from spatial replicate 3. 0 = no eDNA detected; 1 = eDNA detected
File: Landscape_SampleInfo_p.csv
Description: Median and mean posterior probabilities of p, the probability of detecting hellbender eDNA within a qPCR replicate, with 95% credible intervals. File used to graph posterior distributions of covariates at each site.
Variables
- site: Abbreviation for 90 sites sampled for hellbender eDNA
- sample: Not informative - All 1's
- conductivity: Conductivity of water sample (umohs/L). Averaged across three temporal samples when applicable.
- toc: Level of Total Organic Carbon in water sample (mg/L C). Averaged across three temporal samples when applicable.
- turb: Turbidity of water sample (NTU). Averaged across three temporal samples when applicable.
- ph: pH of water (H+). Averaged across three temporal samples when applicable.
- temp: Surface temperature of water (C). Averaged across three temporal samples when applicable.
- flow: Flow velocity of water (m/s) based on a standardized flow method. Averaged across three temporal samples when applicable.
- volume: The volume of water (L) collected at each site
- stream_order: Strahler stream order
- median: Median posterior probability of p
- mean: Mean posterior probability of p
- q0.025: 2.5% quantile around the mean posterior probability of p
- q0.975: 97.5% quantile around the mean posterior probability of p
File: Landscape_SampleInfo_Theta.csv
Description: Median and mean posterior probabilities of theta, the probability of detecting hellbender eDNA within an environmental sample, with 95% credible intervals. File used to graph posterior distributions of covariates at each site.
Variables
- site: Abbreviation for 90 sites sampled for hellbender eDNA
- sample: Not informative - All 1's
- conductivity: Conductivity of water sample (umohs/L). Averaged across three temporal samples when applicable.
- toc: Level of Total Organic Carbon in water sample (mg/L C). Averaged across three temporal samples when applicable.
- turb: Turbidity of water sample (NTU). Averaged across three temporal samples when applicable.
- ph: pH of water (H+). Averaged across three temporal samples when applicable.
- temp: Surface temperature of water (C). Averaged across three temporal samples when applicable.
- flow: Flow velocity of water (m/s) based on a standardized flow method. Averaged across three temporal samples when applicable.
- volume: The volume of water (L) collected at each site
- stream_order: Strahler stream order
- median: Median posterior probability of theta
- mean: Mean posterior probability of theta
- q0.025: 2.5% quantile around the mean posterior probability of theta
- q0.975: 97.5% quantile around the mean posterior probability of theta
File: Landscape_SiteInfo_Psi.csv
Description: Median and mean posterior probabilities of psi, the probability of hellbender occupancy at each site, with 95% credible intervals. File used to graph posterior distributions of covariates at each site.
Variables
- site: Abbreviation for 90 sites sampled for hellbender eDNA
- ecoregion: Level III ecoregion
- prop_water: The proportion of the upstream catchment (200 m X 20 km) in water
- prop_devel: The proportion of the upstream catchment (200 m X 20 km) in developed land
- prop_forest: The proportion of the upstream catchment (200 m X 20 km) in forested land
- prop_herb: The proportion of the upstream catchment (200 m X 20 km) in herbaceous land
- prop_agric: The proportion of the upstream catchment (200 m X 20 km) in agriculture
- watershed_area: Total area of the upstream catchment (200 m X 20 km), which included all tributaries upstream of the eDNA sampling location
- stream_order: Strahler stream order
- volume: The volume of water (L) collected at each site
- median: Median posterior probability of psi
- mean: Mean posterior probability of psi
- q0.025: 2.5% quantile around the mean posterior probability of psi
- q0.975: 97.5% quantile around the mean posterior probability of psi
File: par_estimate_landscape.csv
Description: Mean posterior probabilities and quantiles of the posterior distribution for covariates associated with psi, the probability that a site is occupied by a hellbender; theta, the probability of detecting eDNA within an environmental sample; and p, the probability of detecting eDNA within a qPCR replicate from the Landscape occupancy model. File used to graph the posterior distributions.
Variables
- covariate: Landscape covariate included in the model
- mean: Mean posterior probability
- q2.5: 2.5% quantile around the mean posterior probability
- q7.5: 7.5% quantile around the mean posterior probability
- q10: 10% quantile around the mean posterior probability
- q25: 25% quantile around the mean posterior probability
- q50: 50% quantile around the mean posterior probability
- q75: 75% quantile around the mean posterior probability
- q90: 90% quantile around the mean posterior probability
- q92.5: 92.5% quantile around the mean posterior probability
- q97.5: 97.5% quantile around the mean posterior probability
- parameter: Model parameter: psi, theta, or p
File: LOD_LOQ_Cq_Standards_Sample1.csv
Description: Quantitative PCR cycle threshold and starting concentration of standard curves used to calculate the limit of detection (LOD) and limit of quantification (LOQ) of hellbender DNA derived from tissue sample #1 (94.0 ng/ul).
Variables
- Plate: qPCR Plate identification number
- Well: PCR plate well
- Sample: Standard dilution labeled STD*_*10-1, STD_10-2, STD_10-3, STD_10-4, STD_10-5, or STD_10-6
- Cq: Cycle threshold
- SQ: Starting concentration (pg/ul)
- Target: Target name (HellB)
File: LOD_LOQ_Cq_Standards_Sample2.csv
Description: Quantitative PCR cycle threshold and starting concentration of standard curves used to calculate the limit of detection (LOD) and limit of quantification (LOQ) of hellbender DNA derived from tissue sample #2 (11.4 ng/ul).
Variables
- Plate: qPCR Plate identification number
- Well: PCR plate well
- Sample: Standard dilution labeled STD*_*10-1, STD_10-2, STD_10-3, STD_10-4, STD_10-5, or STD_10-6
- Cq: Cycle threshold
- SQ: Starting concentration (pg/ul)
- Target: Target name (HellB)
Code/software
The available R code includes all code relevant to the analyses and production of figures included in the manuscript. R (v 4.4.2) is required to run these analyses. All required packages are listed in the code provided.
Site Selection and Sampling Design
We collected water samples from 90 sites across 73 rivers in Kentucky from September 2019-June 2021 to test for the presence of hellbender eDNA (Fig. 1). Amongst these sites, 27 overlapped with or were within 4 km of a historic record—defined as having at least one hellbender observation verified by a biologist via photograph or verbal description within the past 50 years. Additionally, 10 sites were located within streams that had historic records but did not have an observation at the sampling location and 53 sites were within streams that lacked historic records. One historic site was included as a positive field control; it is the only known location in Kentucky where hellbenders are regularly observed and are actively breeding.
Once on location, each site was visually surveyed for high-quality habitat 500 meters up- and downstream from the historic record or the nearest river access point (i.e., boat launch, bridge crossing). We defined high-quality habitat as swift running, semi-shallow water located downstream of riffles with large rock-slabs, boulders, or bedrock shelving with crevices, and gravel or cobble substrate (Mayasich et al. 2003). An eDNA sample was collected directly downstream of the highest quality habitat available at all 90 sites during the autumn hellbender “breeding” season when eDNA levels are presumed to be highest due to the abundance of shed genetic material caused by aggressive male interactions and external fertilization of eggs (Spear et al. 2015, Takahashi et al. 2018). Sampling during this period occurred between September 2 and October 11, 2019-2021. Amongst the 90 eDNA sites, 45 were randomly selected for additional temporal sampling during the “hatching” (October 18 – November 19, 2019-2020) and “non-breeding” seasons (June 1-25, 2020-2021) to assess potential seasonal variation in eDNA detection. Eggs hatch during late October and early November, possibly increasing eDNA concentrations at this time; during the non-breeding season hellbenders are mostly sedentary, presumably resulting in lower eDNA levels relative to their more active periods.
Amongst the 45 temporally sampled sites, we collected eDNA samples at additional spatial replicates from 33 randomly selected sites to determine how eDNA concentrations vary at a fine spatial scale. Three spatial replicates were collected 1) directly downstream of the best available habitat consistent with the protocol used at all 90 sites, 2) 50 meters downstream of Replicate 1, and 3) 50 meters downstream of Replicate 2. Spatial replicates were taken at only one site where temporal replicates were not collected due to improper sampling technique at this location; thus, this site was not revisited during the hatching and non-breeding seasons. In summary, we sampled 45 sites once during the breeding season while temporal replicates were collected from 45 sites; one spatial replicate was collected at 56 sites while three spatial replicates were collected at 34; both temporal and spatial replicates were collected at 33 sites (Fig. 2).
eDNA Sample Collection
All water samples were collected from the thalweg of the stream or as deep as possible after disturbed sediment was allowed to settle. Two autoclaved 1L wide-mouth Nalgene bottles with lids attached were placed as close to the bottom of the stream as possible without disturbing the substrate, opened until filled, then resealed while still underwater. A 250 mL water sample was then collected for water chemistry analysis using the same technique. After collection, samples were immediately placed on ice and all equipment was cleaned with 10% bleach to prevent cross-contamination among sites. Negative field controls (i.e., 1L autoclaved DI water) were stored on ice alongside eDNA samples each day to monitor for cross-contamination.
Water samples were filtered at the lab within 10 hours after collection through a 0.47 µm mixed cellulose ester membrane (Whatman, Buckinghamshire, UK) by using a sterile Nalgene filter holder with funnel, 1L Nalgene filter flask, and Rocker 300, 60 Hz vacuum pump (Southern Labware, Georgia, USA). Negative field controls were filtered first to eliminate any possibility of false-positive detections due to cross-contamination via the filtration process. Following these, water samples from each site were filtered. When possible, the two 1L samples were filtered through the same membrane. However, samples often had high amounts of suspended sediment, which required multiple membranes to completely filter the two bottles. Lastly, a third negative control consisting of 1L autoclaved DI water was filtered to test for cross-contamination during filtration. For each control and field sample, filter membranes were removed from the filter apparatus, cut in half, and placed into two sterile 2 mL microcentrifuge tubes by using sterile scissors and tweezers. Filters were immediately frozen at -20°C until eDNA was extracted. One-half of each filter was processed for analysis, while the other half remained frozen as a backup. All filtering equipment was sterilized in 10% bleach and rinsed in DI water between samples and/or spatial replicates.
Site-Level Information
Stream habitat assessments were conducted at each site during the breeding season following a standardized protocol developed by the USDA (2009). The assessment incorporated the entire visible stream reach and estimated the composition of the stream substrate and riparian zone, and assessed the quality of hydrologic features (i.e., pools, riffles), available aquatic habitat, channel condition, and water appearance. We performed water chemistry analyses of the 250 mL water samples collected at the time of each site visit within the Forest Hydrology Lab at the University of Kentucky in accordance with standard methods (Greenberg et al. 1992). We estimated levels of turbidity, pH, conductivity, and total organic carbon (TOC). Surface water temperature was measured during each site visit using a digital water-proof thermometer. Flow velocity was also measured during each site visit and at each spatial replicate when appropriate by using a standardized float method (USGS 1982). Floats were conducted three times at each location and averaged for a final flow velocity estimate; if a site had spatial replicates, flow velocity was measured at each replicate and averaged for a single site-level estimate.
Land cover composition was quantified within a buffer zone upstream of each eDNA site using the 2019 National Land Cover Database in ArcGIS Desktop 10.8.2. The buffer zone encompassed a 200 m riparian zone on either side of the stream because this distance incorporates any habitat likely to affect the water quality and functional ecology of the local system (Lind et al. 2019). Land cover was quantified within the 200 m buffer for 20 km upstream of the eDNA sample site to incorporate the distance that eDNA can potentially travel before degrading to undetectable levels (Deiner and Altermatt 2014), as well as the distance fine-sediment travels before entering storage and, therefore, could potentially alter available hellbender habitat (Pizzuto 2014). Land cover was clustered into five major groups: open water & wetlands, developed & barren (i.e., developed), forest, shrub & herbaceous (i.e., herbaceous), and pasture & crop (i.e., agriculture). We also used ArcGIS to identify the level III ecoregion and Strahler stream order of each of our eDNA sites.
Snorkeling Surveys
We conducted 279 person hours (ph) of snorkeling surveys across 12 eDNA-positive sites during July-October, 2021-2022. These included our positive field control, which we used to train our staff and establish a standardized approach among surveyors. Surveys primarily consisted of day-time visual surveys (234 ph) in which dive lights or borescope cameras were used to search for hellbenders underneath large boulders or rock slabs and within bedrock crevices. Rock-flipping was also used to search for hellbenders during July and early August before the breeding season began. Additionally, 45 ph of snorkeling took place at night when hellbenders are most active and visibility is better.
Laboratory Methods
We extracted DNA from filter membranes following methods described by Spear et al. (2015). Briefly, filters were cut into small pieces and extracted using DNeasy Blood and Tissue Kits (Qiagen) following guidelines provided by the manufacturer with the additional use of a Qiashredder spin column (Qiagen) to facilitate degradation of the filter membrane, a 10 min incubation at 70ºC after the addition of AL buffer, and final elution of AE buffer warmed to 70ºC. Additionally, if multiple filters were needed to complete the filtration of a water sample, we combined these during the binding step of DNA extraction (Takahashi et al. 2018); briefly, the lysate derived from each filter was passed through the same spin column before being washed with AW1 buffer. One blank filter was incorporated into each round of DNA extraction to monitor for cross-contamination during DNA extraction.
We amplified eDNA samples at a 104 bp sequence of the mitochondrial cytochrome b region using nine quantitative PCR (qPCR) replicates per sample with primers and probes designed by Spear et al. (2015; Table S1) and manufactured by ThermoFisher Scientific Inc (Massachusetts, USA). We ran 15µL reactions on a QuantStudio3 thermocycler (ThermoFisher Scientific Inc) with 7.5 µL TaqMan Environmental Master Mix 2.0 (ThermoFisher Scientific Inc), 0.6 µL (0.4 µM concentration) of each primer, 0.3 µL (0.2 µM concentration) of probe, 0.6 µL TaqMan Exogenous Internal Positive Control 10X Exo IPC Mix (ThermoFisher Scientific Inc), 0.3 µL TaqMan Exogenous Internal Positive Control 50X Exo IPC DNA, 0.12 µL (0.4 µg/µL concentration) BSA, 1.98 µL nuclease-free water, and 3 µL of eDNA template. TaqMan Exogenous Internal Positive Controls were added to each sample reaction to monitor for inhibition. Thermo-cycling parameters followed those recommended for use with TaqMan Environmental Master Mix 2.0 and began with 10 min at 95°C followed by 45 cycles of 95°C for 15 s and 60°C for 1 min. Each plate contained three positive template controls obtained by extracting eDNA from aquarium tank water housing adult hellbenders and three negative template controls consisting of DI water. Triplicates of a standard dilution series composed of five or six 10x-fold dilutions ranging from 9.4-9.4x10-5 ng/µL or 1.14-1.14x10-5 ng/µL were also included on each plate. These dilutions cover the range of DNA concentrations typically seen with eDNA extractions (Spear et al. 2015). Two hellbender tissue samples (94.0 and 11.4 ng/µL) were needed to develop our standard curves because all of the extracted DNA from our first sample was used in this study. DNA extraction, qPCR prep, and qPCR were conducted in separate rooms dedicated for each task to reduce the risk of cross-contamination.
eDNA Analysis
We used QuantStudio Design and Analysis Software v1.5.1 (ThermoFisher Scientific Inc) to assess amplification results and considered a sample positive when at least one of the nine qPCR replicates amplified, the curve morphology was uniform with the standard replicates, and the negative qPCR controls showed no amplification (Klymus et al. 2020). To test the efficacy of using a single positive detection threshold, we conducted two calibration experiments to estimate the rate of false-positive detections that may be introduced at the water sample and PCR level (Guillera-Arroita et al. 2017). We extracted and amplified DNA at the hellbender cytochrome b region for 20 250mL eDNA samples collected from a second order stream where hellbenders are absent and 20 250mL DI water samples; one qPCR replicate was run per sample. We also ran 40 no template control qPCR replicates using RO water to estimate the rate of spurious PCR amplifications. All 80 qPCR replicates lacked amplification, which provided confidence in our single positive detection approach.
LOD/LOQ
We calculated the limit of detection (LOD) and limit of quantification (LOQ) of our standard curves to assess the ability of our assay to detect and quantify the target sequence at low levels. We define the LOD as the lowest concentration of target analyte that can be detected with a 95% confidence level and the LOQ as the lowest amount of analyte in a sample that can be quantitatively determined with a coefficient of variation value below 35%. Calculations were performed separately for standard curves derived from the two different tissue samples. For tissue sample 1 (94.0 ng/µL), calculations were based on 25 replicate standard curves with six 10x-fold dilutions ranging from 9.4 ng/µL–9.4x10-5 ng/µL. For tissue sample 2 (11.4 ng/µL), calculations were based on five replicate curves with six 10x-fold dilutions ranging from 11.4–1.14x10-4 ng/µL, 20 replicate curves with five 10x-fold dilutions ranging from 1.14–1.14x10-4 ng/µL, and two replicate curves with six 10x-fold dilutions ranging from 1.14–1.14x10-5 ng/µL. The LOD and LOQ for each sample were determined by using a curve-fitting approach with a Weibull Type II logarithmic function and a 4th order polynomial, respectively. Calculations were performed in the R package drc after outliers were removed using code developed by Klymus et al. (2020).
Occupancy Model Development
We applied a single species, single season, multi-scale SODM framework to estimate hellbender occupancy and eDNA detection probabilities, and investigate factors associated with each level. We used the Bayesian multi-scale model described by Dorazio and Erickson (2018) and implemented in the R package msocc (Stratton et al. 2020). This model allows for the estimation of species occurrence with three levels of hierarchical sampling, while accounting for imperfect detection. The models were used to estimate 1) ψ, the probability that a site is occupied by C. a. alleganiensis, 2) θ, the probability of detecting eDNA in an environmental sample given that the site is occupied, and 3) p, the probability of detecting eDNA in a PCR replicate given that eDNA is present in the environmental sample and the site is occupied. Predictor variables were included at each level to determine their effect on the parameter of interest.
We developed three global models to evaluate the effects of water chemistry, habitat, and landscape attributes on estimates of hellbender occupancy and eDNA detection probabilities (Table 1). Three separate models were developed because our study design resulted in unequal sampling effort across our sites (see Site Selection and Sampling Design). Additionally, inherent differences in stream characteristics (i.e., stream depth, stream visibility) prevented some covariate data from being collected at all sites. Consequently, we partitioned our predictor variables into three categories based on sample size, which led to the three occupancy models.
Our “water chemistry” model included the 45 sites where temporal eDNA replicates were taken. Predictor variables for this model largely consisted of water chemistry variables that we hypothesized would influence hellbender presence at a site (i.e., conductivity, turbidity, flow velocity, temperature), eDNA transportation and degradation in a lotic system (i.e., TOC, pH, flow velocity, turbidity), and PCR inhibition (i.e., TOC, turbidity; Table 1). Conductivity has been found to be negatively associated with hellbender presence due to its impact on sperm mobility in amphibians (Pitt et al. 2017), and pristine stream characteristics such as clear water, fast flow, and cool temperatures are classically associated with high quality hellbender habitat (Nickerson and Mays 1973, Mayasich et al. 2003). Turbidity and TOC were included as covariates for both θ and p since suspended sediments are known to bind to eDNA fragments, potentially making them unavailable for field capture or PCR (Jerde et al. 2016, Stoeckle et al. 2017, Kumar et al. 2022), and various factors comprising TOC and associated with the biotic environment (i.e., microbes, biofilms, humic and fulvic acids) increase eDNA degradation rates and inhibit PCR enzymes (Jane et al. 2015, Shogren et al. 2018, Uchii et al. 2019, Zulkefli et al. 2019). Julian date was also included as a covariate for θ to account for potential seasonal variations in eDNA abundance caused by changes in breeding condition or larval hatching.
The “habitat” occupancy model included covariates descriptive of the local habitat of the stream reach where eDNA samples were collected (Table 1). These covariates were derived from the stream habitat assessment conducted during the initial visit to each site in September and early October. The model included 80 sites where stream habitat assessments were fully completed; ten sites were located in rivers that were too deep or turbid to accurately assess the substrate composition and consequently could not be included in this particular model. We included habitat covariates commonly associated with hellbender presence (% boulder, % cobble, % gravel, etc.; Nickerson and Mays 1973) and absence (% fine sediment, % sand; Pugh et al. 2016). Nutrient load was also included in the model as an indicator of algae and microbial activity, which can increase eDNA degradation rates and, therefore, affect detection probabilities (Goldberg et al. 2016). We also included the percent fine sediment and sand within the stream substrate as covariates for θ because eDNA can bind to fine particles making it unavailable for capture (Turner et al. 2015, Stoeckle et al. 2017). Further, the percent fine sediment was included as a covariate for p because excess sedimentation can cause PCR inhibition (Stoeckle et al. 2017).
The “landscape” occupancy model included large-scale covariates such as landcover composition within the stream buffer, ecoregion, and stream order, and included all 90 sites (Table 1). Watersheds with intact vegetation and/or forest cover generally have better stream conditions and water quality, and are often associated with hellbender presence and population persistence (Pugh et al. 2016, Bodinof Jachowski and Hopkins 2018, Unger et al. 2021). In contrast, deforestation, mining, and urban development of land significantly impact hellbender populations and can cause local extirpations (Nickerson et al. 2017, Wineland et al. 2019b, Da Silva Neto 2020). Ecoregion was found to be an important predictor for hellbender occupancy in Tennessee due to differences in geology, stream hydrology, and historic anthropogenic practices among ecoregions (Da Silva Neto 2020). Given the similarity in geological gradients and land-use patterns between Tennessee and Kentucky, we believed ecoregion may be an informative factor for hellbender occupancy in this state as well and included it in the model. We also included stream order as covariates for ψ and θ since hellbenders typically occupy Strahler stream order 2 or higher (Da Silva Neto 2020) and the hydrological conditions of a stream may influence the ability to capture eDNA (Deiner and Altermatt 2014). Lastly, we included the total volume of water collected at each site as a covariate for θ as we would expect that greater sampling effort would increase the probability of detecting eDNA when it is available (Furlan et al. 2016, Hunter et al. 2019).
To adhere our four-level sampling design (i.e., site, temporal replicates, spatial replicates, PCR replicates) to a three-level hierarchical framework, we condensed the spatial replicate detection histories into a single row such that the temporal replicates served as our repeated samples. Consequently, if a site was visited only once during the breeding season and no spatial replicates were collected, the detection matrix for that site was 1x9 (n=9; one temporal replicate with 9 qPCR replicates; Fig 2). If both temporal and spatial replicates were collected at a site, the detection matrix was 3x27 (n=81; three temporal replicates with three spatial replicates each and 9 qPCR replicates per spatial replicate; Fig 2). We tested for collinearity among predictor variables in each model using the corSelect function in the R package fuzzySim (Barbosa 2015) and a default correlation coefficient of 0.8. The proportion of agricultural land use was highly correlated with forest cover (r=0.85, p>0.001) and was therefore not included in the landscape occupancy model. All remaining continuous predictor variables were standardized to z-scores.
All models were run with a single MCMC chain and default standard normal (Gaussian) prior distributions for each parameter. We ran a null model using detection data from all 90 sites that did not contain any covariates to estimate occupancy and eDNA detection probabilities at the water sample and qPCR levels. We ran the null model with 20,000 iterations and a 3,000 burn-in. The water chemistry model was run with 300,000 iterations and 50,000 burn-in. The habitat model was run with 550,000 iterations and 50,000 burn-in. Lastly, the landscape model was run with 500,000 iterations and 50,000 burn-in. Trace plots were assessed for each model parameter to ensure sufficient mixing within the parameter space. Additionally, we checked for model convergence using Geweke’s and Gelman and Rubin’s diagnostic criteria (Geweke 1991, Gelman and Rubin 1992). Lastly, we determined the effect of a predictor variable on hellbender occupancy and eDNA detection by assessing the mean posterior distribution with 85% credible intervals (CRI). If the 85% CRI did not overlap with zero, we determined that the predictor was positively or negatively associated with the respective parameter (i.e., ψ, θ, or p); positive if the CRI fell to the right of zero and negative if the CRI fell to the left of zero.
We ran post-hoc power analyses to determine how varying sampling effort at the field collection and qPCR levels influence the precision of the parameter estimates (msocc package, Stratton et al. 2020). We used estimates of ψ, θ, and p derived from our null model to simulate a detection dataset with 90 sites sampled three times, with nine qPCR replicates per sample. The simulation varies the number of samples taken per site or the number of qPCR replicates analyzed per sample and measures the width of the 95% CRI around each model parameter. We ran 100 simulations varying the number of samples and qPCR replicates between one and 10, and plotted the CRI widths for each sample size and each level of the model. We determined the sample size at which occupancy and detection precision is maximized by comparing the average CRI widths; the sample size at which the CRI widths stabilize provides insight into the most cost-effective sample size for field collection and qPCR (Sepulveda et al. 2021).
False Positive Detection Rate
Studies have noted that false positive detections (i.e., the detection of eDNA when the organism is not present) are non-negligible when analyzing eDNA data as false positives can derive from cross contamination within field collection and lab processing stages, poor primer design, or PCR errors (Ficetola et al. 2015, Lahoz-Monfort et al. 2016, Guillera-Arroita et al. 2017). We estimated the probability of obtaining a stage 1 false positive (i.e., θ10; eDNA presence in the environmental sample given species absence) and obtaining a stage 2 false positive (i.e., p10; detecting eDNA in a qPCR replicate given eDNA absence) using the R Shiny App eDNA 1.0 (Diana et al. 2021). The app implements the Bayesian hierarchical SODM developed by Griffin et al. (2020), which estimates site-specific probabilities of species presence while accounting for false positive and false negative detection error at both the sample collection and laboratory analysis stages. The model can also integrate detection data obtained from field surveys as an informative prior distribution when such data is available. We ran two models with and without detection data from our snorkeling surveys to estimate false positive rates under both circumstances. Because the model does not allow for missing data in the detection matrix, we only included the 45 sites that were sampled temporally and detections from the first spatial replicate if multiple samples were taken at the site. Both models were run with one MCMC chain for 20,000 iterations and 2,000 burn-in. Convergence was confirmed by assessing trace plots and Geweke’s diagnostic criteria (Geweke 1991).
