The use of environmental DNA (eDNA) as a new method of ecological monitoring is widely applied. Although eDNA can provide important information on the distribution and biomass of particular taxa, an organism’s DNA sequences remain unaltered throughout its existence, which complicates identifying crucial events, including reproduction, with high accuracy. We thus examined DNA methylation as a novel source of information from eDNA, considering that methylation patterns of eggs and sperm released during reproduction differ from those of somatic tissues.
Despite its potential applications, little is known about eDNA methylation, including its stability and methods for detection and quantification. Therefore, we conducted tank experiments and performed methylation analysis targeting 18S rDNA through bisulfite amplicon sequencing.
Methylation of eDNA was not affected by degradation and was equivalent to the rate of genomic DNA from somatic tissues. Unmethylated DNA, which is abundant in the ovary, was detected in eDNA during reproductive activity of fish.
These results indicate that eDNA methylation is a stable signal reflecting genomic methylation and demonstrate that germ cell-specific methylation patterns can be used as markers for detecting reproductive activity.
Tank Experiment 1: Monitoring the methylation levels of eDNA undergoing degradation over time
In Tank Experiment 1, to examine the methylation levels of eDNA undergoing degradation over time, water samples were collected immediately after the removal of fish from the breeding tanks for up to 72 h. Three breeding tanks (Tank 1-1, Tank 1-2, and Tank 1-3) containing 9 L of water each and one without fish (a blank tank) were prepared (Figure S1). A water sampling tube was installed in the tanks, and the water inlet was covered with a net. A transparent perforated partition board was mounted in the middle of each breeding tank to prevent fluctuations in the eDNA methylation rate caused by sperm and egg release. Six male and six female fish were placed in separate compartments. All tanks were covered with lids, and boards were placed between the tanks to avoid contamination. Fish were fed once a day, ensuring no food was left over, and water temperature was maintained at 27.5 ± 0.5 °C. The lights were turned on at 10:00 JST and off at 22:00 JST. After 5 days of acclimation and eDNA saturation in the tanks, all fish were quickly removed from each breeding tank. After the removal of the fish, the tank conditions were maintained, except that feeding was stopped. At 0, 1, 3, 12, 24, 48, and 72 h after removing the fish, 500 ml water was sampled from each of the four tanks. Water samples were filtered immediately through GF/F filters (pore size: 0.7 µm) (GE Healthcare Life Science, Little Chalfont, UK) and stored at −25 °C until DNA extraction. A total of 28 samples were collected, with seven samples collected from each of the four tanks. All equipment used for water collection and filtration was bleached in 0.1% sodium hypochlorite before use.
Tank Experiment 2: Monitoring eDNA methylation during fish spawning
Mature zebrafish can spawn continuously up to 2–3 times per week, and an interval of a few days can be used to optimise mating induction (Nasiadka & Clark, 2012). In Tank Experiment 2, spawning was artificially induced by separating the sexes and bringing them together at the beginning of the light period, and progress was closely monitored using eDNA analysis.
Three breeding tanks (Tank 2-1, Tank 2-2, and Tank 2-3) filled with 9 L of water and a blank tank were used for Experiment 2. The breeding and equipment conditions were the same as those used in Experiment 1. Five males and five females were placed in separate compartments. After 4 days of acclimation, eDNA sampling was performed. The first day of sampling was considered day 1, and water sampling continued until day 4. At 10:00 on day 2, the partitions were removed and fish of both sexes were allowed to mix. The water collection regimes are listed in Table 1. Water was added after each sampling to compensate for the decrease, except for periods of less than 1 hour after light exposure, which may interfere with mating behaviour. The collected 500 ml of tank water was immediately filtered through a GF/F filter, and the filter was stored at -25 °C. Sixteen samples were collected per tank for a total of 64 samples, including the blank tank.
eDNA extraction
The extraction of eDNA from the filters was performed using a DNeasy Blood & Tissue Kit (Qiagen , Hilden, Germany) following the protocol published by the eDNA Society (Minamoto et al., 2021) with slight modifications, i.e. using buffer ATL instead of buffer AL (Wu & Minamoto, 2023). The concentration of the extracted DNA was determined using a Qubit dsDNA Quantification Assay Kit (Thermo Fisher Scientific, USA) with a Qubit 3.0 Fluorometer (Thermo Fisher Scientific). To prevent cross-contamination, eDNA was extracted in a different laboratory room physically separated from gDNA extraction or PCR.
gDNA extraction
A DNeasy Blood & Tissue Kit (Qiagen, Germany) was used following the manufacturer’s tissue extraction protocol to obtain gDNA from skin, brain, testis, and ovary samples (n = 6) collected from 12 zebrafish (six males and six females). The DNA concentrations were quantified as described above.
Primer development for qPCR
The nuclear ribosomal internal transcribed spacer (ITS), a nuclear DNA marker with high detection sensitivity due to tandem repeats (Minamoto et al., 2017), was used to quantify the concentration of eDNA in the collected water samples. We downloaded the sequence of the zebrafish rDNA ITS2 region from the National Center for Biotechnology Information (NCBI) database and searched for a primer set with no intraspecific variation in the sequence covered by the primers, melting temperatures of around 60 °C, Tm difference between primers within 2.0 °C, and an amplificon length of 150 bp or less. Primers with an amplification length of 132 bp were identified, and a probe was designed to hybridise to the respective internal regions (Table 2). The specificity of the primer set was confirmed using Primer Blast (NCBI).
qPCR
eDNA concentrations were quantified by qPCR using a StepOnePlus Real-Time PCR system (Thermo Fisher Scientific). All runs were performed using triplicates with 20 µL per sample. The reaction mix contained 10 µL 2xTaqMan Environmental Master Mix 2.0 (Thermo Fisher Scientific), 0.1 µL AmpErase Uracil N-glycosylase (Thermo Fisher Scientific), 900 nM of each primer (Dre-ITS2-F1/Dre-ITS2-R1: Table 2), 125 nM probe (Dre-ITS2-P: Table 2), and 2 µL template DNA. The thermocycling conditions were 50 °C for 2 min, 96 °C for 10 sec followed by 55 cycles of 96 °C for 15 sec and 60 °C for 1 min. Standard curves were generated using linear DNA prepared by treating a circular plasmid (pEX-A2J2) containing the target sequence. Standards were prepared through dilution to 3 × 10^1 to 3 × 10^4 copies per reaction. Each PCR run included a negative control using ultrapure water instead of template DNA.
Primer development for bisulfite amplicon sequencing
We developed a region-specific methylation assay using bisulfite amplicon sequencing (BSAS) (Masser et al., 2015) to measure the methylation rates of the internal regions of 18S rDNA. The 18S is arranged in tandem repeats with the ITS region. It exhibits high detection sensitivity, similar to that of the ITS region, and sequence information has been recorded for numerous species, rendering it highly useful as an eDNA marker (Othman et al., 2021). We downloaded the zebrafish 18S rDNA sequence from the NCBI database and used MethPrimer (https://www.urogene.org/methprimer/) to generate two sets of primers that matched the target region after bisulfite base substitution. Two additional sets of primers were designed manually. The following conditions were considered to design the sequencing primers: 20–40 bp per primer an amplicon length of approximately 250 bp (allows iSeq100 paired-end sequencing), Tm in the 50–60 °C range, >10 CpG sites in the amplicon, and no CpG sites in the primers (bisulfite can cause TG or CG, biasing the amplification efficiency). Four primer sets were thus generated and were tested in vitro. The amplification efficiency and specificity of the primer sets were examined using agarose gel electrophoresis and Sanger sequencing, and one set of primers was selected. Primers with added adapter sequences (Dre_18S_BF2A and Dre_18S_BR2A) were used for BSAS (Table 2). The amplicon of the developed primers was 265 bp long and contained 19 CpG sites.
We also conducted Sanger sequencing of amplicons derived from skin gDNA of males and females and from ovarian gDNA with sequencing primers designed separately to ensure that the primer sequences did not cover sites of intraspecific mutations that would suppress PCR efficiency and to determine reference sequences for use in BSAS.
Bisulfite amplicon sequencing
Bisulfite treatment
DNA was bisulfited and cleaned up using an EpiTect Fast DNA Bisulfite Kit (Qiagen), according to the manufacturer’s instructions. gDNA was diluted and prepared in TE buffer, and 20 µL (400 ng) was processed. For eDNA, 20 µL was processed after ensuring that the amount of eDNA was within the range recommended by the manufacturer (1 ng to 2 µg). Finally, 15 µL bisulfite-treated DNA was collected and TE buffer (pH 8.0) was added up to 30 µL. DNA was then stored at -25 °C.
Library preparation
The BSAS protocol using next-generation sequencing (NGS) was performed as described by Masser et al. (2015). The first PCR specific to 18S rDNA was performed using bisulfite-treated eDNA or gDNA as the template. TaKaRa EpiTaq HS (Takara Bio, Shiga, Japan) was used as the PCR reagent. The reaction mixture contained 2.5 µL of 10×EpiTaq PCR Buffer, 2.5 µL of 25 mM MgCl₂, 3 µL of dNTPs (2.5 mM each), 0.125 µL of EpiTaq HS (5 U/µL), 400 nM each of primers (ER3/ER1: Table 2), 2 µL bisulfite-treated DNA, and ultrapure water up to 25 µL. The thermocycling conditions were 98 °C for 10 min followed by 45 cycles of 98 °C for 30 s, 52 °C for 30 s, and 72 °C for 30 s. Each PCR run included a negative control with ultrapure water instead of template DNA. As a few eDNA samples had significantly reduced DNA content due to degradation, all eDNA samples from Tank Experiment 1 (collected as described in Section 2.3) and gDNA samples (collected as described in Section 2.4) were subjected to the four reactions of the first PCR, and the products were pooled. The eDNA samples from Tank Experiment 2 were used to produce two replicates of the first PCR, and the products were pooled. PCR products were purified using the SPRI-select Reagent Kit (Beckman Coulter Inc., USA) according to the manufacturer’s instructions, diluted to 0.1 ng/µL, and used as a template for the second PCR.
The second PCR was performed by adding Illumina P5/P7 adaptor sequences (Illumina, USA) and 8-bp index sequences at both ends of the amplicon. The reaction mixture for the second PCR contained 6 µL 2×KAPA HiFi HotStart ReadyMix (Kapa Biosystems, South Africa), 300 nM each of F/R primer, 1 µL template, and ultrapure water to a total volume of 12 µL. The thermocycling conditions were 95 °C for 3 min, followed by 12 cycles of 98 °C for 20 s, 72 °C for 20 s, and finally 72 °C for 5 min. Products of the second PCR were pooled and size-selected using E-Gel SizeSelect II agarose gels (2%; Thermo Fisher Scientific). The desired sequence length was verified on an Agilent 2100 BioAnalyzer (Agilent Technologies, USA), and the library was sequenced using an iSeq 100 Sequencing System (Illumina).
Data processing
The NGS data were processed using USEARCH v10 (Edgar, 2010). First, paired-end reads were generated using the ‘fastq_mergepairs’ command, and reads with more than five alignment mismatches, those with less than 90% alignment match, and those with alignments less than 16 bases were discarded. Next, the primer sequences were truncated from the paired-end reads using the fastx_truncate command. Quality filtering was performed using the “fastq_filter” command to remove reads with a predicted error rate exceeding 1% and using the “fastx_uniques” command to identify the set of unique sequences.
Identity filtering
Identity filtering was performed on the PCR-amplified sequences to remove sequences with low identity to the zebrafish reference sequence. To roughly remove reads with many mismatches while allowing for mismatches in CpG sites substituted for TGs via bisulfite treatment and to simplify the following process, pairwise alignment to the reference was performed to filter for 80% identity using the ‘usearch_global’ command in USEARCH v11. When the sequence corresponding to the reference CpG site was TG, it was converted to CG, allowing for variable sequences due to methylation modifications. Subsequently, the sequences were filtered for 95% identity using the ‘usearch_global command’ again to restrict the sequences to target species. Finally, the data were produced to contain the sequence and number of reads for each unique sequence after identity filtering.
Quantification of methylation rates
The methylation rate in the target region of each sample was calculated as follows: (sum of CGs)/(sum of CpG sites). The quantification limit was set to a read depth of 10. In general, the ‘methylation rate’ refers to the bulk-level metric of the sample. For example, a sample containing a 1:1 ratio of 100% to 0% methylated cells will exhibit a methylation rate of 50%. To investigate methylation at the single molecule level, we used single-read methylation (read methylation rate) in addition to the ‘methylation rate’ (Kerr et al., 2023). The read methylation rate for each DNA sequence was calculated as follows: number of CGs in the sequence/ number of CpG sites in the sequence = 19. Then, as an indicator of DNA with low methylation, the percentage of unmethylated DNA sequences with a read methylation rate of 0% (CG=0) in each sample was calculated.
Statistical analyses
To investigate the impact of time on eDNA concentrations, we used a linear mixed model (LMM) included in the lmerTest package of R software version 4.1.2. The DNA concentration (log-transformed +1) was the response variable, elapsed time (log-transformed +1) was the explanatory variable, and the tank number was included as a random effect. We also used a generalised LMM (GLMM) with beta distribution using the glmmTMB package to investigate the impact of time on the methylation rate in the target region of eDNA. The methylation rate was the response variable, elapsed time was the explanatory variable, and tank number was included as a random effect.
In addition, the differences in methylation rates in the target region were analysed by fitting a GLMM with beta distribution among the five groups with regard to gDNA, extracted from four different organ tissues (skin, brain, testis, and ovary), and eDNA, collected immediately after fish removal (after 0 h) in Experiment 1. The methylation rate was the response variable, the sample group was the explanatory variable, and the number of tanks or individuals was included as a random effect. Multiple comparison tests were performed using Tukey–Kramer post-hoc analysis.
Ethics approval for conducting animal experiments
Animal experiments were conducted following all relevant laws and guidelines of the countries where the studies were conducted.