Enhancing neotropical fish monitoring using dietary DNA of detritivorous natural samplers
Data files
Mar 18, 2026 version files (1.69 GB total)
- Analysis.zip (59.55 KB)
- bash_code.zip (1.24 KB)
- Metabar.zip (416.52 KB)
- Raw_data.zip (1.69 GB)
- README.md (7.04 KB)
Abstract
Neotropical freshwater fish face alarming biodiversity loss, yet current capture-based monitoring methods are costly, invasive, and heavily reliant on taxonomic expertise. There is an urgent need for more efficient and accurate biomonitoring tools to assess the impacts of increasing anthropogenic pressures.
Natural samplers, living organisms that aggregate the DNA of species in their immediate environment through feeding, are emerging as a promising biomonitoring tool. Here, we investigated whether abundant and widely distributed freshwater shrimp could provide reliable snapshots of local fish assemblages in large neotropical rivers using multi-marker metabarcoding analysis of their dietary DNA (dDNA).
Shrimp dDNA analysis revealed almost as many species (68 species) as an intensive 10-day inventory of the study area (70 species), and nearly three times more species than gillnet-based methods commonly used in surveillance programs. The generalist and opportunistic feeding behaviour of these detritivorous organisms enabled detection of a broad spectrum of species, including small fish typically overlooked by traditional surveys. The vast majority of fish taxa were identified at the species level thanks to nearly exhaustive barcoding reference databases, demonstrating the high taxonomic resolution of this approach.
Synthesis and applications: The relative ease of sampling and processing makes shrimp dDNA analysis particularly suitable for rapid biodiversity assessments, complementing observational approaches that provide data on fish abundance, biomass, or condition. As the cost of molecular analyses continues to decrease, this method offers an efficient tool for implementing large-scale monitoring programs in neotropical rivers and detecting localized ecosystem impacts of anthropogenic disturbances.
Dataset DOI: 10.5061/dryad.tht76hfdg
Date of last update: 2026-03-18
Description of the data and file structure
README: Enhancing neotropical fish monitoring using dietary DNA of detritivorous natural samplers
This dataset supports the manuscript "Enhancing neotropical fish monitoring using dietary DNA of detritivorous natural samplers". The study investigates the use of freshwater shrimp dietary DNA (dDNA) as a non-invasive tool for assessing fish biodiversity in Guianese rivers.
1. Dataset Overview
This repository contains all data and scripts necessary to reproduce the analyses presented in the manuscript.
This Dryad package includes:
Raw sequencing data
Bioinformatic processing scripts
Intermediate abundance tables (ASVs)
Taxonomic assignment files
Processed datasets
R scripts for quality control and statistical analyses
Metadata necessary for full reproducibility
2. Directory Structure
Raw_data.zip/
Contains raw Illumina sequencing reads:
METASHRIMP2023.fastq.gz: Compressed FASTQ file containing raw paired-end reads generated on an Illumina MiSeq platform.
bash_code.zip/
Contains the bioinformatic pipeline used for initial sequence processing:
01_sequence_filtering_and_denoising.sh: Bash script implementing the workflow for adapter trimming (Cutadapt), read assembly, quality filtering, dereplication, and denoising (OBITools).
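To illustrate the dereplication step only, the following Python sketch collapses identical reads into unique sequences with their counts. It is not the dataset's pipeline, which runs Cutadapt and OBITools from the bash script. The function name `dereplicate`, the `min_length` parameter, and the toy reads are hypothetical.

```python
from collections import Counter


def dereplicate(reads, min_length=0):
    """Collapse identical sequences and count how often each one occurs.

    reads: iterable of DNA sequences (strings).
    min_length: reads shorter than this are dropped (hypothetical filter).
    Returns (sequence, count) pairs, most abundant first.
    """
    counts = Counter(seq.upper() for seq in reads if len(seq) >= min_length)
    return counts.most_common()


# Toy reads, not taken from the dataset.
unique_reads = dereplicate(["ACGT", "acgt", "TTGA", "ACGT"])
print(unique_reads)
```

Sequences are upper-cased before counting so that case differences do not split otherwise identical reads.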
Metabar.zip/
Contains intermediate abundance tables and taxonomic assignments for each mitochondrial marker:
Subfolders:
- MG2/ (COI marker)
- MiFish/ (12S rRNA marker – MiFish primers)
- teleo/ (12S rRNA marker – Teleo primers)
Each subfolder includes:
- Taxonomic assignment files (seq_<marker>_assigned.tab, where <marker> is teleo, MiFish or COImg2)
- Scripts for downstream filtering and validation
Each "metabar" subfolder (MG2/, MiFish/, teleo/) contains the following files necessary to build a metabaR object:
reads.txt: ASV abundance table with ASVs in rows and PCR replicates in columns (raw read counts).
motus.txt: Metadata for each ASV (MOTU). Includes:
- id: ASV identifier.
- count: Total number of reads.
- sequence: The DNA sequence of the ASV.
- length: Sequence length in base pairs.
pcrs.txt: Metadata for each PCR library. Includes:
- sample_id: Original sample ID.
- type: Sample type (e.g., "sample", "control").
- control_type: Type of negative control (e.g., "extraction", "pcr" or "sequencing").
- tag_fwd / tag_rev: Combinatorial tags used for indexing.
- primer_fwd / primer_rev: Marker-specific primers.
samples.txt: Mapping table between sample IDs and biological information. Maps each ESE unique identifier (e.g., ESE20586) to the shrimp species (e.g., Macr. brasiliense, Eurhyrinchus).
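Because the four tables must agree on ASV, PCR and sample identifiers before they can be assembled, a quick consistency check can be useful. The Python sketch below shows one possible check on toy data. The tab-delimited layout, the `pcr_id` column name, and all values are assumptions for illustration, not taken from the files. The dataset's own workflow builds the object in R with metabaR.

```python
import csv
import io

# Toy stand-ins for the four tables (hypothetical values; the tab-delimited
# layout and the "pcr_id" column name are assumptions).
READS = "id\tpcr_A\tpcr_B\tpcr_neg\nasv1\t120\t80\t0\nasv2\t0\t15\t3\n"
MOTUS = "id\tcount\tsequence\tlength\nasv1\t200\tACGTAC\t6\nasv2\t18\tACGTTT\t6\n"
PCRS = (
    "pcr_id\tsample_id\ttype\tcontrol_type\n"
    "pcr_A\tESE20586\tsample\tNA\n"
    "pcr_B\tESE20586\tsample\tNA\n"
    "pcr_neg\tNA\tcontrol\tpcr\n"
)
SAMPLES = "sample_id\tspecies\nESE20586\tMacr. brasiliense\n"


def read_table(text):
    """Parse tab-delimited text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def check_tables(reads, motus, pcrs, samples):
    """Return a list of human-readable inconsistencies (empty if none)."""
    problems = []

    # ASVs in the abundance table must match the ASV metadata.
    read_asvs = {row["id"] for row in reads}
    motu_asvs = {row["id"] for row in motus}
    if read_asvs != motu_asvs:
        problems.append(f"ASV ids differ: {sorted(read_asvs ^ motu_asvs)}")

    # Columns of the abundance table must match the PCR metadata.
    read_pcrs = set(reads[0]) - {"id"} if reads else set()
    pcr_ids = {row["pcr_id"] for row in pcrs}
    if read_pcrs != pcr_ids:
        problems.append(f"PCR ids differ: {sorted(read_pcrs ^ pcr_ids)}")

    # Every non-control PCR must point to a known sample.
    known_samples = {row["sample_id"] for row in samples}
    for row in pcrs:
        if row["type"] == "sample" and row["sample_id"] not in known_samples:
            problems.append(
                f"PCR {row['pcr_id']} refers to unknown sample {row['sample_id']}"
            )
    return problems


problems = check_tables(
    read_table(READS), read_table(MOTUS), read_table(PCRS), read_table(SAMPLES)
)
print(problems or "tables are consistent")
```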
R Scripts (within Metabar subfolders)
Scripts used for data curation and validation using the metabaR package, including:
- Contaminant removal
- Sequencing depth filtering
- Tag-jump cleaning
- Validation of occurrences across technical replicates (aliquot sorting)
These scripts produce cleaned and curated ASV tables used in the final analyses.
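The idea behind validating occurrences across technical replicates can be sketched as follows: a detection is kept for a sample only if the ASV appears in enough of that sample's PCR replicates. This Python sketch is illustrative and does not reproduce the metabaR-based R scripts. The threshold `min_replicates=2`, the function name, and the toy data are assumptions.

```python
from collections import defaultdict


def validate_occurrences(reads, pcr_to_sample, min_replicates=2):
    """Keep an ASV for a sample only if enough PCR replicates contain it.

    reads: {asv_id: {pcr_id: read_count}}
    pcr_to_sample: {pcr_id: sample_id}
    min_replicates: assumed threshold, not taken from the dataset scripts.
    Returns {sample_id: set of validated asv_ids}.
    """
    hits = defaultdict(lambda: defaultdict(int))
    for asv, per_pcr in reads.items():
        for pcr, count in per_pcr.items():
            if count > 0:
                hits[pcr_to_sample[pcr]][asv] += 1
    return {
        sample: {asv for asv, n in per_asv.items() if n >= min_replicates}
        for sample, per_asv in hits.items()
    }


# Toy data: three PCR replicates of one sample (hypothetical values).
toy_reads = {
    "asv1": {"rep1": 10, "rep2": 5, "rep3": 0},
    "asv2": {"rep1": 0, "rep2": 2, "rep3": 0},
}
toy_map = {"rep1": "ESE20586", "rep2": "ESE20586", "rep3": "ESE20586"}
validated = validate_occurrences(toy_reads, toy_map)
print(validated)
```

In this toy case asv2 is dropped because it occurs in only one replicate. Samples with no detection in any replicate are simply absent from the result.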
Analysis.zip/
Main R script used to generate all figures and statistical analyses presented in the manuscript, including:
- Non-metric multidimensional scaling (NMDS)
- Species accumulation curves
- Venn diagrams
- Principal component analysis (PCA)
- Fish size distribution boxplots
Running this script reproduces all main and supplementary figures.
Processed datasets in .rds or .xlsx format:
Fully merged and curated dataset used for downstream analyses
- taxon_code: Unique 4-letter identifier for fish species.
- expert_correction: Binary (0/1). If 1, expert taxonomic advice was used to refine the sequence assignment or update the taxonomy to current standards.
- similarity_CO1 / _teleo / _MiFish: Percent identity match with the reference database. "not detected" indicates that the species was not detected by this specific marker.
- size_mm: Standard length (mm). Note: Values are provided for all species based on FishBase.
- weight_g: Mean body mass (g). "NA" indicates that the weight was not recorded (e.g., species not captured during the 10-day survey).
- FOO_...: Frequency of Occurrence (%).
- RRA_...: Relative Read Abundance (%).
- metabarcoding_presence: Binary (0/1) detection via shrimp dDNA.
- gillnet_presence: Binary (0/1). Indicates if the species was caught specifically using gillnets during the survey.
- survey_presence: Binary (0/1) detection during the 10-day intensive inventory.
- num_WFD_captures: Total number of individuals of this species captured across all Water Framework Directive (WFD/DCE) monitoring campaigns in the study area.
- WFD_2008 to WFD_2022: Binary (0/1) presence/absence data for each specific annual WFD monitoring campaign.
- num_SC_total: Total number of shrimp stomach contents (SC) where the fish species was detected (sum across all markers and individuals).
Binary presence/absence matrix combining the three mitochondrial markers
Metadata table including sampling information
This file contains the individual capture records for the fish specimens (tax_name column) sampled during the study in the Maroni River, including the day (day column) on which they were caught. Each row represents a single individual.
These three RDS files contain binary presence-absence matrices for the fish species detected using three different genetic markers (CO1, MiFish, and Teleo, respectively). Each file represents the metabarcoding results for its specific marker.
- Rows (rownames): Unique sample identifiers (e.g., ESE20586).
- Columns (colnames): Scientific names of the detected fish species.
- Values: Binary data where 1 indicates that the species was detected in the sample and 0 indicates its absence.
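As a sketch of how the per-marker matrices relate to the combined presence/absence matrix and to a frequency of occurrence, the Python example below merges three toy matrices with a logical OR and computes the percentage of samples in which each species is detected. It assumes a species counts as present when any marker detects it, and that FOO is the share of samples with a detection. The computation in the dataset's R scripts may differ. Sample IDs, species names, and function names are hypothetical.

```python
def combine_markers(*matrices):
    """Merge presence/absence matrices: present if any marker detects it.

    Each matrix is {sample_id: {species: 0 or 1}}. Markers may cover
    different samples or species; missing entries count as 0.
    """
    samples = sorted({s for m in matrices for s in m})
    species = sorted({sp for m in matrices for row in m.values() for sp in row})
    return {
        s: {sp: int(any(m.get(s, {}).get(sp, 0) for m in matrices)) for sp in species}
        for s in samples
    }


def frequency_of_occurrence(matrix):
    """Percentage of samples in which each species is detected."""
    n_samples = len(matrix)
    species = sorted({sp for row in matrix.values() for sp in row})
    return {
        sp: 100.0 * sum(row.get(sp, 0) for row in matrix.values()) / n_samples
        for sp in species
    }


# Toy matrices for the three markers (hypothetical values).
co1 = {"S1": {"sp_a": 1, "sp_b": 0}, "S2": {"sp_a": 1, "sp_b": 0}}
mifish = {"S1": {"sp_a": 0, "sp_b": 1}, "S2": {"sp_a": 0, "sp_b": 0}}
teleo = {"S1": {"sp_a": 1, "sp_c": 0}, "S2": {"sp_a": 0, "sp_c": 0}}

combined = combine_markers(co1, mifish, teleo)
foo = frequency_of_occurrence(combined)
print(combined)
print(foo)
```

Taking the union of samples and species first keeps the merge robust when a marker lacks a column that another marker has.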
3. Reproducibility
To fully reproduce the analyses:
Process raw reads using 01_sequence_filtering_and_denoising.sh
Run marker-specific R scripts in the Metabar/ subfolders
Load processed .rds files from Analysis/data/
Execute analysis.R
All scripts are annotated and designed to allow full reproducibility of the published results.
4. Software Requirements
- Cutadapt
- OBITools
- R (version ≥ 4.0 recommended)
- R packages: metabaR, vegan, tidyverse, ade4 and additional dependencies specified within scripts
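Before running the pipeline, it may help to confirm that the command-line tools are on the PATH. The Python sketch below is one way to do this. The executable names are assumptions: OBITools command names vary between versions, so `obigrep` is only a guess, and R packages are not checked here.

```python
import shutil

# Assumed executable names (not confirmed by the README).
REQUIRED = ["cutadapt", "obigrep", "Rscript"]


def find_missing(executables):
    """Return the names that cannot be found on the PATH."""
    return [name for name in executables if shutil.which(name) is None]


missing = find_missing(REQUIRED)
print("missing tools:", missing)
```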
