Data from: Refining sampling efforts for fish diversity assessment in subtropical urban estuarine and oceanic waters using environmental DNA with multiple primers
Data files
Nov 27, 2024 version files: 14.76 GB
1-12S.tar.gz (4.35 GB)
2-MiFish-U.tar.gz (5.46 GB)
3-MiFish-E.tar.gz (4.95 GB)
README.md (1.18 KB)
Abstract
The environmental DNA (eDNA) approach is an emerging tool for monitoring marine biodiversity. However, the sampling effort needs optimization according to the site characteristics and target taxonomic groups. In this study, we optimized the eDNA sampling effort in terms of sample volume and number of replicates to monitor the diversity of marine vertebrates (mainly fish) in Hong Kong's subtropical waters that show a gradient of estuarine to oceanic waters. To maximize detection, we used three pairs of metabarcoding primers (12S‐v5, MiFish‐U, and MiFish‐E). We compared vertebrate diversity in 78 water samples, ranging from 1 to 10 L, collected from oceanic and estuarine sites. Metabarcoding yielded a total of 140 vertebrate species, of which 18 were unique to the estuarine site, 66 unique to the oceanic site, and 56 shared between both sites. The detected species were predominantly ray‐finned fish (136 species), and the three primer pairs exhibited differential sensitivity toward different taxa, especially cartilaginous fish and cetaceans. Increasing sampling volume per replicate generally increased the total detected species, average species per replicate, and species coverage, and sampling 3 or 4 × 4 L represented the most efficient sampling effort for the estuarine and oceanic sites, respectively. The diversity analysis revealed that sampling >2 L per replicate reduced variability and improved diversity analysis. The results also showed that a larger sampling volume per replicate increased the probability of detecting endangered, indicator, invasive, and elusive species, with 4 L representing the most efficient volume. This study recommended sampling 4 L per replicate and 3 replicates for estuarine and 4 for oceanic sites, respectively for effectively monitoring marine fish in subtropical waters using the eDNA approach.
https://doi.org/10.5061/dryad.08kprr58c
These are the raw sequences produced from an eDNA study in estuarine and oceanic sites in Hong Kong.
Description of the data and file structure
This dataset includes raw sequencing data for three primer pairs: 12S-V5, MiFish-U, and MiFish-E. The paired-end raw sequences were demultiplexed into 80 × 2 fastq files for each primer, with XXX_1.fq holding the forward reads and XXX_2.fq the reverse reads. However, because PCR-free library preparation was used, the reads are in mixed orientation. They can be re-oriented with cutadapt by adding the --revcomp option during the primer removal step.
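To make "mixed orientation" concrete, the following is a minimal conceptual sketch of how a read pair could be classified and, if needed, swapped so that the mate carrying the forward primer comes first. It is not cutadapt's algorithm (cutadapt tolerates mismatches and handles degenerate bases), and it rests on several assumptions: the primers shown are toy sequences invented for illustration, matching is exact, and each read is assumed to begin directly with its primer.

```python
from typing import Optional, Tuple

# Toy primers invented for this illustration only;
# these are NOT the 12S-V5 or MiFish primer sequences.
FWD_PRIMER = "ACGTAC"
REV_PRIMER = "TTGCAA"


def orient_pair(
    read1: str,
    read2: str,
    fwd: str = FWD_PRIMER,
    rev: str = REV_PRIMER,
) -> Optional[Tuple[str, str, bool]]:
    """Return (forward_read, reverse_read, was_swapped), or None if no primer pair is found.

    Assumes exact primer matches at the very start of each read.
    """
    # Expected orientation: mate 1 starts with the forward primer.
    if read1.startswith(fwd) and read2.startswith(rev):
        return read1, read2, False
    # Flipped orientation: mate 1 starts with the reverse primer, so swap the mates.
    if read1.startswith(rev) and read2.startswith(fwd):
        return read2, read1, True
    # Neither orientation matches; a real pipeline would typically discard the pair.
    return None
```

For example, `orient_pair("TTGCAACCC", "ACGTACGGG")` returns the pair in swapped order with `was_swapped` set to `True`, whereas a pair with no recognizable primers returns `None`.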
We studied two sites: “estuarine” and “oceanic”. Filenames starting with “west” represent the estuarine samples, whereas those starting with “east” represent the oceanic samples. At each site, the samples are further divided by the volume of seawater from which eDNA was extracted: 1 L (#01-20), 2 L (#21-30), 4 L (#31-35), and 10 L (#36-39); #40 is the negative control for each site. The naming is the same for all three datasets:
1-12S (12S-V5)
2-MiFish-U
3-MiFish-E
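The naming scheme can be turned into sample metadata with a small parser. The sketch below is hedged: the exact filename layout is not spelled out here, so the pattern assumes hypothetical names such as `west01_1.fq` (site prefix, two-digit sample number, then `_1` or `_2`). The names `SampleInfo`, `volume_for_sample`, and `parse_filename` are illustrative only; adjust the regular expression after inspecting the extracted archives.

```python
import re
from typing import NamedTuple, Optional

SITE_BY_PREFIX = {"west": "estuarine", "east": "oceanic"}

# ASSUMPTION: filenames look like "west01_1.fq"; the real layout may differ.
FILENAME_PATTERN = re.compile(r"^(west|east)\D*(\d{2})_([12])\.fq$")


class SampleInfo(NamedTuple):
    site: str
    sample_number: int
    volume_l: Optional[int]  # None for the negative control
    is_negative_control: bool
    read_direction: str


def volume_for_sample(number: int) -> Optional[int]:
    """Map a sample number to its filtered volume in litres (None for #40)."""
    if 1 <= number <= 20:
        return 1
    if 21 <= number <= 30:
        return 2
    if 31 <= number <= 35:
        return 4
    if 36 <= number <= 39:
        return 10
    if number == 40:
        return None
    raise ValueError(f"sample number out of range: {number}")


def parse_filename(name: str) -> SampleInfo:
    """Extract site, sample number, volume, and read direction from a filename."""
    match = FILENAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized filename: {name}")
    prefix, number_text, mate = match.groups()
    number = int(number_text)
    return SampleInfo(
        site=SITE_BY_PREFIX[prefix],
        sample_number=number,
        volume_l=volume_for_sample(number),
        is_negative_control=(number == 40),
        read_direction="forward" if mate == "1" else "reverse",
    )
```

Under the assumed pattern, `parse_filename("east36_2.fq")` describes the reverse reads of a 10 L oceanic sample, and any `#40` file is flagged as a negative control with no volume.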
Marine water samples were collected 1 m beneath the water surface using an electric water pump and randomly distributed into sterile sampling bags of different volumes. The samples were collected from an estuarine site (22.39423, 113.90088) on 24 March 2023 and an oceanic site (22.351714, 114.350139) on 15 July 2023. The water samples were filtered through glass fiber membranes with a 0.45 μm pore size, and the eDNA was extracted using the CTAB-chloroform method. The target genes were amplified using the 12S-V5 (Riaz), MiFish-U, and MiFish-E primers. PCR reaction solutions containing KOD polymerase premix (Toyobo, Osaka, Japan), 200 nM primers, and 1.0 ng/μL DNA template were subjected to a 5 min initial denaturation at 98 °C, followed by 35 cycles of 5 s at 98 °C, 5 s at 60 °C, and 10 s at 68 °C, and a 5 min final extension at 68 °C. The field control was also included as a negative control. Successful reactions were purified from the gel using a gel extraction kit (Qiagen, Hilden, Germany), adjusted to the same concentration, and pooled (10 samples per run) before being sent to a commercial sequencing service provider (Novogene Co. Ltd., Beijing, China) for paired-end 150 bp library preparation and sequencing on a NovaSeq 6000 platform (Illumina, San Diego, CA, USA). The sequences were quality-filtered, demultiplexed (sabre), trimmed of primer sequences (cutadapt), and denoised and merged (QIIME 2 with DADA2).
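As a quick check on the thermocycling program described above, the steps can be written down as data and the total programmed hold time computed. This is only a sketch: the step labels inside the cycle (denaturation, annealing, extension) are conventional names assumed here rather than taken from the description, and the total excludes the ramp time between temperatures, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str  # label assumed for readability
    temperature_c: int
    seconds: int


INITIAL_DENATURATION = Step("initial denaturation", 98, 5 * 60)
CYCLE = (
    Step("denaturation", 98, 5),
    Step("annealing", 60, 5),
    Step("extension", 68, 10),
)
N_CYCLES = 35
FINAL_EXTENSION = Step("final extension", 68, 5 * 60)


def total_hold_seconds() -> int:
    """Sum of programmed hold times, excluding temperature ramping."""
    cycle_seconds = sum(step.seconds for step in CYCLE)
    return (
        INITIAL_DENATURATION.seconds
        + N_CYCLES * cycle_seconds
        + FINAL_EXTENSION.seconds
    )
```

Each cycle holds for 20 s, so the program amounts to 300 + 35 × 20 + 300 = 1300 s of hold time, a little under 22 minutes before ramping is taken into account.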