Effects of sampling seasons and locations on fish environmental DNA metabarcoding in dam reservoirs
Minamoto, Toshifumi et al. (2020), Effects of sampling seasons and locations on fish environmental DNA metabarcoding in dam reservoirs, Dryad, Dataset, https://doi.org/10.5061/dryad.zkh189374
Environmental DNA (eDNA) analysis has developed rapidly over the last decade as a novel biodiversity monitoring method. Previous studies have evaluated optimal strategies at several experimental steps of eDNA metabarcoding for the simultaneous detection of fish species. However, optimal sampling strategies, especially the season and the location of water sampling, have not been evaluated thoroughly. To identify optimal sampling seasons and locations, we performed sampling monthly or at two-monthly intervals throughout the year in three dam reservoirs. Water samples were collected from 15 and 9 locations in the Miharu and Okawa dam reservoirs in Fukushima Prefecture, respectively, and 5 locations in the Sugo dam reservoir in Hyogo Prefecture, Japan. One liter of water was filtered with glass-fiber filters and eDNA was extracted. By performing MiFish metabarcoding, we successfully detected a total of 21, 24, and 22 fish species in Miharu, Okawa, and Sugo reservoirs, respectively. These results show that eDNA metabarcoding performed at a level comparable to conventional long-term survey data. Furthermore, it was found to be effective in evaluating entire fish communities. The number of species detected by eDNA survey peaked in May in Miharu and Okawa reservoirs, and in March and June in Sugo reservoir, which corresponds with the breeding seasons of many of the fish species inhabiting the reservoirs. In addition, the number of detected species was significantly higher in shore samples than in offshore samples in the Miharu reservoir, and a similar tendency was found in the other two reservoirs. Based on these results, we conclude that the efficiency of species detection by eDNA metabarcoding could be maximized by collecting water from shore locations during the breeding seasons of the inhabiting fish. These results will contribute to determining sampling seasons and locations for future fish fauna surveys via eDNA metabarcoding.
Environmental DNA samples were collected from three reservoirs. The first PCR was carried out with a 12 μl reaction volume containing 6.0 μl of 2× KAPA HiFi HotStart ReadyMix (KAPA Biosystems, Wilmington, MA, USA), 0.36 μl of each primer (10 μM), 4.28 μl of sterilized distilled H2O and 1.0 μl of the template. The final concentration of each primer in this reaction mixture was 0.3 μM. The thermal cycle profile after an initial 3 min denaturation at 95°C was as follows (40 cycles): denaturation at 98°C for 20 s, annealing at 65°C for 15 s and extension at 72°C for 15 s, with a final extension at 72°C for 5 min. The purified first PCR products were used as templates for the second PCR.
In the second PCR, the Illumina sequencing adapters and the 8 bp identifier indices were added using forward and reverse fusion primers (forward: 5’-AATGATACGGCGACCACCGAGATCTACAXXXXXXXXACACTCTTTCCCTACACGACGCTCTTCCCATCT-3’, reverse: 5’-CAAGCAGAAGACGGCATACGAGATXXXXXXXXGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT-3’). The second PCR was carried out with a 12 μl reaction volume containing 6.0 μl of 2× KAPA HiFi HotStart ReadyMix, 2 μl of each primer (1.8 μM), 1.0 μl of sterilized distilled H2O and 1.0 μl of template. The final concentration of each primer in this reaction mixture was 0.3 μM. The thermal cycle profile after an initial 3 min denaturation at 95°C was as follows (12 cycles): denaturation at 98°C for 20 s and combined annealing and extension at 72°C (shuttle PCR) for 20 s, with a final extension at 72°C for 5 min. The indexed second PCR products were pooled (i.e. one pooled second PCR product that included all samples).
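The stated final primer concentrations follow from a simple dilution calculation (stock concentration × primer volume / reaction volume). The short Python sketch below checks that arithmetic and the total reaction volumes for both PCR mixes; the function and variable names are illustrative and are not part of the dataset.

```python
import math

def final_concentration_um(stock_um, primer_volume_ul, reaction_volume_ul):
    """Final concentration after dilution: C_final = C_stock * V_primer / V_total."""
    return stock_um * primer_volume_ul / reaction_volume_ul

# Component volumes (μl) as listed in the methods text.
first_pcr_volumes_ul = {
    "ReadyMix": 6.0,
    "forward primer": 0.36,
    "reverse primer": 0.36,
    "water": 4.28,
    "template": 1.0,
}
second_pcr_volumes_ul = {
    "ReadyMix": 6.0,
    "forward primer": 2.0,
    "reverse primer": 2.0,
    "water": 1.0,
    "template": 1.0,
}

# Primer stocks: 10 μM in the first PCR, 1.8 μM in the second PCR.
first_pcr_primer_um = final_concentration_um(10.0, 0.36, 12.0)
second_pcr_primer_um = final_concentration_um(1.8, 2.0, 12.0)

print(f"First PCR: {first_pcr_primer_um:.2f} μM per primer")
print(f"Second PCR: {second_pcr_primer_um:.2f} μM per primer")
```

Both mixes add up to 12 μl and give 0.3 μM per primer, in agreement with the text.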
The pooled libraries were loaded on a 2% E-Gel SizeSelect (Thermo Fisher Scientific, Waltham, MA, USA) and the target size (approximately 370 bp) was collected. The DNA size distribution of the library was estimated using an Agilent 2100 BioAnalyzer (Agilent, Santa Clara, CA, USA), and the library concentration was quantified using a Qubit dsDNA HS assay kit and a Qubit 3.0 (Thermo Fisher Scientific, Waltham, MA, USA). The amplicon libraries were sequenced on the MiSeq platform at Ryukoku University using a MiSeq v2 Reagent Kit for 2×150 bp paired-end sequencing (Illumina, CA, USA) according to the manufacturer’s protocol. The raw data for this MiSeq run are deposited as a zip file.
All data preprocessing and analyses of MiSeq raw reads were performed using USEARCH v10.0.240 (Edgar 2010) as follows:
(1) Paired-end reads (forward and reverse reads) were merged using the “fastq_mergepairs” command with the default settings. During this process, reads that were too short (< 100 bp) after tail trimming and paired reads with too many differences (> 5 positions) in the aligned region (ca. 65 bp) were discarded.
(2) Primer sequences were removed from the merged reads using the “fastx_truncate” command.
(3) The reads with primer sequences removed were then filtered using the “fastq_filter” command to remove low-quality reads with an expected error rate (Edgar and Flyvbjerg 2015) of > 1% and short reads of < 100 bp.
(4) The preprocessed reads were dereplicated using the “fastx_uniques” command, and all singletons, doubletons, and tripletons were removed from the subsequent analysis following the recommendation by Edgar (2010).
(5) The dereplicated reads were denoised using the “unoise3” command to generate amplicon sequence variants (ASVs) and to remove all putatively chimeric and erroneous sequences and partial ASVs with fewer than 10 reads.
(6) Finally, ASVs were subjected to taxonomic assignment to species names using the “usearch_global” command with a sequence identity of > 98.5% (two nucleotide differences allowed) to the reference sequences and a query coverage of ≥ 90%.
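The quality filter in step (3) can be illustrated with a minimal Python sketch. It assumes that "expected error rate" means the expected number of errors per base, computed from Phred scores as the sum of 10^(-Q/10) divided by read length (the definition in Edgar and Flyvbjerg 2015), and that quality strings use the Phred+33 encoding. The function names are hypothetical, and this is not the USEARCH implementation.

```python
def expected_errors(quality_string, phred_offset=33):
    """Sum of per-base error probabilities, p = 10^(-Q/10).

    The Phred+33 offset is an assumption about the FASTQ encoding.
    """
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string)

def passes_quality_filter(sequence, quality_string,
                          max_error_rate=0.01, min_length=100):
    """Keep a read only if it is long enough and its expected error rate is <= 1%."""
    if len(sequence) < min_length:
        return False
    return expected_errors(quality_string) / len(sequence) <= max_error_rate

# Made-up reads for illustration: 'I' encodes Q40, '+' encodes Q10.
high_quality_read = ("A" * 150, "I" * 150)
low_quality_read = ("A" * 150, "+" * 150)
short_read = ("A" * 50, "I" * 50)

for name, (seq, qual) in [("high quality", high_quality_read),
                          ("low quality", low_quality_read),
                          ("short", short_read)]:
    print(name, passes_quality_filter(seq, qual))
```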
After the taxonomic assignments, some modifications had to be made because (1) some closely related species could not be distinguished using the amplified region of the 12S rRNA gene, and (2) the program returned species that were unlikely to inhabit the study areas (e.g. marine species) or taxonomic groups other than fish (e.g. mammals). In these two scenarios, the taxonomic assignments were modified as follows. In the first scenario, the closely related species were merged and assigned to the genus, but if only one species of that taxon occurred around the study sites, the assignment was changed to that single species. The distribution range of detected species was determined based on the results of conventional field surveys (the National Census on River and Dam Environments) and existing literature (Miyadi et al. 1976). In the second scenario, those species were removed from the data. In addition, according to Miya et al. (2015), MiFish-U primers amplify teleost fish DNA. Therefore, cartilaginous fish were excluded from the data and only teleost fish data were used in this study.
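The reassignment rule for indistinguishable species can be sketched as follows. This is only an illustration of the logic described above, not the authors' code: the function name, the "spp." label for genus-level assignments, and the species names are all made up, and the candidate species are assumed to belong to a single genus.

```python
def resolve_assignment(candidate_species, local_species):
    """Resolve a group of species that the marker cannot tell apart.

    candidate_species: species names matched equally well (assumed same genus).
    local_species: species known from the study area (e.g. from field surveys).
    Returns a species name, a genus-level label, or None if the hit is removed.
    """
    local_candidates = sorted(set(candidate_species) & set(local_species))
    if not local_candidates:
        # None of the candidates is expected in the study area: drop the hit.
        return None
    if len(local_candidates) == 1:
        # Only one candidate occurs locally: assign to that species.
        return local_candidates[0]
    # Several candidates occur locally: merge to genus level.
    genus = local_candidates[0].split()[0]
    return f"{genus} spp."

# Hypothetical species list for a study area.
local = {"Exampleus alpha", "Exampleus beta", "Otherus gamma"}

print(resolve_assignment(["Exampleus alpha", "Exampleus delta"], local))
print(resolve_assignment(["Exampleus alpha", "Exampleus beta"], local))
print(resolve_assignment(["Marinus epsilon"], local))
```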
To remove possible contaminants that were detected in the negative control samples, we subtracted contaminant reads from field samples as follows (Nguyen et al. 2015; Port et al. 2016): read counts obtained from field blanks and extraction blanks were subtracted from each field sample processed on the same day, and read counts obtained from PCR blanks were subtracted from each field sample included in the same PCR run. These cleaned data are shown in Table S1.
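A minimal Python sketch of this blank subtraction is given below. It rests on two assumptions that the text does not state: the read counts of all applicable blanks are summed per taxon, and negative results are set to zero. The function name and the read counts are made up for illustration.

```python
def subtract_blank_reads(sample_counts, blank_counts_list):
    """Subtract per-taxon blank read counts from one field sample.

    sample_counts: {taxon: reads} for one field sample.
    blank_counts_list: list of {taxon: reads} dicts, one per applicable blank
        (field and extraction blanks from the same day, PCR blanks from the same run).
    Assumptions: blanks are summed per taxon and results are floored at zero.
    """
    cleaned = {}
    for taxon, reads in sample_counts.items():
        contaminant_reads = sum(blank.get(taxon, 0) for blank in blank_counts_list)
        cleaned[taxon] = max(reads - contaminant_reads, 0)
    return cleaned

# Made-up read counts for illustration.
field_sample = {"Taxon A": 1200, "Taxon B": 8}
field_blank = {"Taxon B": 10}
pcr_blank = {"Taxon A": 15}

cleaned_sample = subtract_blank_reads(field_sample, [field_blank, pcr_blank])
print(cleaned_sample)
```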
This dataset includes a supplementary table (Table S1) and the raw next-generation sequencing data for the paper entitled "Effects of sampling seasons and locations on fish environmental DNA metabarcoding in dam reservoirs".
FastqFiles.zip contains all fastq files for this study.
fastq_file_descriptions.xlsx shows the description of each fastq file.
HAYAMI_Reservoirs_eDNA_MB_TableS1.docx contains the supplementary table (Table S1).
Water Resources Environment Center, Japan