Data from: Multispecies site occupancy modeling and study design for spatially replicated environmental DNA metabarcoding

Citation

Fukaya, Keiichi; Kondo, Natsuko; Matsuzaki, Shin-ichiro; Kadoya, Taku (2021), Data from: Multispecies site occupancy modeling and study design for spatially replicated environmental DNA metabarcoding, Dryad, Dataset, https://doi.org/10.5061/dryad.3bk3j9kkm

Abstract

Although environmental DNA (eDNA) metabarcoding has become widely applied to gauge ecosystems in a noninvasive and cost-efficient manner, false negatives can occur due to various factors in its inherent multistage workflow. It is therefore essential to account for such species detection errors in eDNA metabarcoding to achieve an accurate assessment of species distribution and diversity. To address this issue, we proposed a variant of the multispecies site occupancy model for eDNA metabarcoding studies and applied it to an eDNA metabarcoding dataset of freshwater fish communities collected in the Kasumigaura watershed in Japan.

Usage Notes

This archive contains the dataset used for the analysis and the script files needed to reproduce the results. It consists of the following five files:

  • 1_model_fitting.R: R script for fitting the multispecies site occupancy model proposed in the paper to an eDNA metabarcoding dataset of fish communities. The second half of the file also contains scripts for assessing and plotting the model-fit results.
  • 2_decision_analysis.R: R script for estimating the effectiveness of species detection under various study designs based on the model-fit results. The second half of the file contains scripts for drawing profiles of the estimated effectiveness of the species detection.
  • data.Rdata: This file contains sequence read count data for 50 fish species groups obtained via environmental DNA metabarcoding at 50 sites in the Kasumigaura watershed, Japan. The file is in binary format written out using the save() function in R. In addition to a three-dimensional array of sequence reads with species, site, and replicate dimensions, there are two vectors of covariates (riverbank and mismatch) used to explain the variation in the sequence read counts.
  • functions.R: This file defines auxiliary functions for the analyses performed using the two R scripts (1_model_fitting.R and 2_decision_analysis.R).
  • model.jags: JAGS model file that defines the multispecies site occupancy model to be fitted in the R script 1_model_fitting.R.
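
The three-dimensional layout of the read count array in data.Rdata can be illustrated with a small toy example. This is only a sketch: the axis order (species, site, replicate), the toy sizes, the read counts, and the function names detected and richness_per_site are all assumptions made for illustration and are not taken from the archive. It is written in Python rather than R purely to show the indexing.

```python
# Toy read-count array indexed as reads[species][site][replicate].
# Assumption: axis order and all values are hypothetical; the real data
# hold 50 species groups at 50 sites and are stored in data.Rdata.
reads = [
    [[12, 0], [0, 0], [3, 7]],     # species 0
    [[0, 0], [0, 0], [150, 98]],   # species 1
    [[0, 4], [0, 0], [0, 0]],      # species 2
]


def detected(reads):
    """Return a species-by-site table that is True where at least one
    replicate at that site has a nonzero read count for that species."""
    return [
        [any(count > 0 for count in replicates) for replicates in species]
        for species in reads
    ]


def richness_per_site(reads):
    """Count, for each site, the species detected in at least one replicate."""
    table = detected(reads)
    n_sites = len(table[0])
    return [sum(row[site] for row in table) for site in range(n_sites)]


if __name__ == "__main__":
    print(detected(reads))
    print(richness_per_site(reads))
```

A naive summary like this treats every all-zero site as an absence, so it cannot separate a true absence from a false negative. That gap is what the site occupancy model fitted in 1_model_fitting.R is meant to address.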

Funding

Environmental Restoration and Conservation Agency, Award: ERTDF, Nos. 4–1705 and 2–2001

Japan Society for the Promotion of Science, Award: KAKENHI, Nos. 20H03010 and 20K06102