A hierarchical model for eDNA fate and transport dynamics accommodating low concentration samples
Cite this dataset
Augustine, Ben (2024). A hierarchical model for eDNA fate and transport dynamics accommodating low concentration samples [Dataset]. Dryad. https://doi.org/10.5061/dryad.8gtht76wc
Abstract
Environmental DNA (eDNA) sampling is an increasingly important tool for answering ecological questions and informing aquatic species management. Challenges of using eDNA include determining species source location(s) and accurately and precisely measuring low concentration eDNA samples, especially considering inhibitory compounds and multiple sources of ecological and measurement variability. These challenges must be overcome to optimize our use of modeling frameworks like the eDNA Integrating Transport and Hydrology (eDITH) model. To better understand eDNA fate and transport dynamics, our ability to estimate parameters within the eDITH framework, and our ability to reliably quantify low concentration samples, we developed a hierarchical model and used it to evaluate a fate and transport experiment. Our model addresses several low concentration challenges by modeling the number of copies in each PCR replicate as latent variables with a count distribution and conditioning detection and quantification on replicate copy number. We provide evidence that the eDNA removal rate was not constant through time, estimating that over 80% of eDNA was removed over the first 10 m, traversed in 41 seconds. After this initial period of rapid decay, eDNA decayed slowly with consistent detection through our furthest site 1 km from the release location, traversed in 250 seconds. We show that the eDITH model parameters can be difficult to estimate in this scenario. Our model further allowed us to detect extra-Poisson variation in the allocation of copies to replicates. Despite not observing evidence for inhibition as typically quantified using internal positive controls in conjunction with a binary decision rule (e.g., $\Delta$Cq>3), we hypothesized this overdispersion could be due to inhibitors. We extended our hierarchical model to accommodate a continuous effect of inhibitors, and used our model to provide evidence for the inhibitor hypothesis and explore the implications, if true.
We show that inhibitors can cause substantial underestimation of eDNA site concentration, bias eDITH model parameter estimates, and attribute measurement variability erroneously to ecological variability. While our model is not a panacea for all challenges faced when quantifying low eDNA concentrations, it provides a framework for a more complete accounting of uncertainty that can be further tested and refined.
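Two pieces of the abstract's reasoning can be checked with quick arithmetic. The sketch below is illustrative only and is not taken from the archived R/nimble code. First, it asks what a constant exponential removal rate would imply if 80% of eDNA were lost over the first 10 m (the abstract reports "over 80%", so 0.80 is a conservative stand-in). Second, it shows why low concentration replicates often hold no template at all, assuming copies are allocated to replicates as a plain Poisson count. The function names are hypothetical.

```python
import math

def constant_rate_from_removal(removed_fraction, distance_m):
    """Per-metre exponential rate implied by losing `removed_fraction` over `distance_m`."""
    return -math.log(1.0 - removed_fraction) / distance_m

def remaining_fraction(rate_per_m, distance_m):
    """Fraction of eDNA left after `distance_m` under a constant exponential rate."""
    return math.exp(-rate_per_m * distance_m)

def prob_no_copies(expected_copies):
    """Poisson probability that a PCR replicate receives zero template copies."""
    return math.exp(-expected_copies)

# Assumption: exactly 80% removed over the first 10 m.
k = constant_rate_from_removal(0.80, 10.0)
print(f"implied constant rate: {k:.3f} per m")
print(f"fraction left at 1 km under that rate: {remaining_fraction(k, 1000.0):.2e}")

# Assumption: plain Poisson allocation with no overdispersion or inhibition.
for expected in (0.5, 1.0, 3.0):
    print(f"expected copies {expected}: P(zero copies) = {prob_no_copies(expected):.3f}")
```

Under a constant rate fitted to the first 10 m, essentially nothing would remain at 1 km, which is at odds with the consistent detection reported at the furthest site and is one way to see why a non-constant removal rate is plausible. The Poisson calculation ignores the extra-Poisson variation and inhibitor effects that the full model is designed to capture.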
README: A Hierarchical Model for eDNA Fate and Transport Dynamics Accommodating Low Concentration Samples
https://doi.org/10.5061/dryad.8gtht76wc
Description of the data and file structure
All files used for data analysis and resulting files (posteriors, etc.) are in the "Data Analysis" folder on Zenodo.
All files used for simulation analyses are in the "Simulation" folder on Zenodo, as are all simulation results (posteriors, etc.).
Data Description
The field data are located in greenhollow_techrep 12_4_2.csv on Dryad.
Metadata for greenhollow_techrep 12_4_2.csv is in green hollow metadata.csv on Dryad.
Data Analysis
1. The data are in greenhollow_techrep 12_4_23.csv. See the model-fitting scripts for the data processing steps.
2. The nimble model files are "Release NimModel X.R", where X is one of the four models.
3. Custom MCMC functions (inhibitor models only) are in "State Samplers.R".
4. Test scripts to run 1 chain for each model are in "Run X.R".
5. Scripts to run multiple chains in parallel for each model are in "Run X Parallel.R".
6. The multi-chain posteriors from 5 above are stored in "S1.R", "S1b.R", "S2.R", and "S2b.R", which
correspond to Inhibitor PL, Inhibitor Exponential, Null PL, and Null Exponential. These posteriors
are plotted in .pdf files with the same names.
7. The script to process the multi-chain posteriors is "Process Models.R".
8. Scripts to run 1 chain for each model to compute conditional WAIC are in "Run X WAIC conditional.R".
9. Output files from the WAIC scripts in 8 are "output_WAIC_conditional_X.R".
10. The script to process the conditional WAIC results is "Process WAIC.R".
11. The script to do posterior predictive checks is "PP Checks.R".
12. The file to plot the raw data is "Plot Cq shift.R".
13. .jpg files are produced by the processing scripts, mostly from step 7 above.
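After downloading the "Data Analysis" folder, a quick check that the named files are present can save time before running anything. The following is a minimal sketch in Python, used only for illustration since the archived code is R. It lists only file names spelled out in full above; entries containing the placeholder X are skipped because the values of X are not given here. The local folder path and the function name are assumptions.

```python
from pathlib import Path

# File names copied from the Data Analysis list; patterns with "X" are omitted
# because the README does not spell out the model labels used for X.
EXPECTED_FILES = [
    "State Samplers.R",
    "S1.R", "S1b.R", "S2.R", "S2b.R",
    "Process Models.R",
    "Process WAIC.R",
    "PP Checks.R",
    "Plot Cq shift.R",
]

# Posterior file label -> model, as stated in item 6.
POSTERIOR_MODELS = {
    "S1": "Inhibitor PL",
    "S1b": "Inhibitor Exponential",
    "S2": "Null PL",
    "S2b": "Null Exponential",
}

def missing_files(folder, expected=EXPECTED_FILES):
    """Return the expected file names that are not found in `folder`."""
    folder = Path(folder)
    return [name for name in expected if not (folder / name).is_file()]

# Assumption: the Zenodo folder was unpacked to ./Data Analysis
absent = missing_files("Data Analysis")
print("all listed files present" if not absent else f"missing: {absent}")
```

If the folder was unpacked elsewhere, pass that path to `missing_files` instead.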
Simulation
1. Test scripts to simulate data and fit models are "testscript X.R", where X is one of the four models.
2. Data simulators for the null and inhibitor models are "sim.data.R" and "sim.data.inhibitor.R".
3. The nimble model files are "Release NimModel X.R".
4. Custom MCMC functions (inhibitor models only) are in "State Samplers.R".
5. The code used to simulate the data sets for Simulation Study 1 is "Simulate Datasets X", where X is
S1-S4. S1=inhibitor PL, S2=inhibitor exponential, S3=null PL, S4=null exponential.
6. The simulated data sets are "SX_datasets.RData", where X=1-4.
7. Files to fit the model for each simulated data set in parallel are "Run Parallel Inhibitor SX.R", for X in 1,2,
and "Run Parallel SX.R" for X in 3,4.
8. The files in 7 store the posteriors in the folders "Sims_SX" (100 data sets x 3 chains x 4 models).
9. The files to process the simulation posteriors are "Process Sims SX.R".
10. Files in 9 produce plots and tables for each model, "SX.summary.RData".
11. After 10, the file to combine results across these models is "Combine Scenario Results.R".
This is just used to make a table.
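The "SX" naming patterns in the list above can be expanded per scenario. The sketch below is in Python, used only for illustration since the archived code is R. It builds the names directly from the patterns quoted in the list; the dictionary keys and function name are hypothetical. The chain count reads "100 data sets x 3 chains x 4 models" as 100 data sets per scenario, each fit with 3 chains, which is an interpretation rather than something the README states outright.

```python
# Scenario number -> model, as given in item 5.
SCENARIOS = {
    1: "inhibitor PL",
    2: "inhibitor exponential",
    3: "null PL",
    4: "null exponential",
}

def scenario_files(x):
    """Expand the README's SX patterns for scenario number x (1-4)."""
    # Inhibitor scenarios (1, 2) use a differently named run script (item 7).
    if x in (1, 2):
        run_script = f"Run Parallel Inhibitor S{x}.R"
    else:
        run_script = f"Run Parallel S{x}.R"
    return {
        "model": SCENARIOS[x],
        # The README gives this name without a file extension.
        "simulate_code": f"Simulate Datasets S{x}",
        "datasets": f"S{x}_datasets.RData",
        "run_script": run_script,
        "posterior_folder": f"Sims_S{x}",
        "process_script": f"Process Sims S{x}.R",
        "summary": f"S{x}.summary.RData",
    }

N_DATASETS, N_CHAINS = 100, 3
total_chains = N_DATASETS * N_CHAINS * len(SCENARIOS)

for x in SCENARIOS:
    print(scenario_files(x))
print("total chains across scenarios:", total_chains)
```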
Funding
United States Geological Survey