Data from: Rapid detection of subterranean fauna from passive sampling of groundwater eDNA
Data files
Aug 02, 2024 version files 208.96 MB
- Passive_18S_taxonomy_zotutable.xlsx
- Passive_18S_zotus.fasta
- Passive_18S.fasta.zip
- Passive_COI_otus.fasta
- Passive_COI_OtuTable_taxonomy.xlsx
- Passive_COI.fasta.zip
- Passive_sample_metadata_dryad.xlsx
- README.md
Abstract
Groundwater is an essential source of freshwater that supports surface ecosystems as well as organisms adapted to living underground. The impacts of anthropogenic climate change, extraction, and pollution pose major threats to groundwater ecosystem health, prompting a need for efficient and reliable means to detect and monitor subterranean faunal communities. Conventional survey of subterranean fauna relies on the collection of organisms for morphological identification, which can be biased, labour intensive, and often indeterminate at lower taxonomic levels. Environmental DNA (eDNA)-based methods have been shown to dramatically improve on stygofaunal surveys, but currently rely on time-consuming active water filtration that limits the number of samples that can be processed. Passive eDNA sampling, which involves submersion of material (e.g. filter membrane, sponge, etc.) into the sampled environment for a fixed period, has previously shown promise as a viable alternative to active filtration in aquatic ecosystems and may be applicable to groundwater systems. Here, we compared groundwater eDNA collected from active pump filtered water samples to membranes submerged in water for 10 min and 24 h, and haul-net samples morphologically identified, from bores at two geographically distinct locations in Western Australia. Our results show that while the relative abundance of eDNA in groundwater (measured through qPCR) is 100-800 times lower in passive samples, the diversity of species detected is comparable between passive and active eDNA collection. Additionally, standard metabarcoding assays (18S and COI) of passive eDNA samples detected most subterranean orders identified morphologically (12/17), and this proportion may be improved with increased sampling and application of DNA extraction methods that increase DNA yield. 
Our findings demonstrate that passive eDNA sample collection is a non-invasive survey method with the potential to improve the efficiency and level of replication of stygofaunal surveys but will benefit from further development.
README: Data from: Rapid detection of subterranean fauna from passive sampling of groundwater eDNA
- Author contact: mieke.vanderheyde@curtin.edu.au
- Date of data collection (single date, range, approximate date): 2021
- Geographic location of data collection: Pilbara, Western Australia, Australia
SHARING/ACCESS INFORMATION
- Licenses/restrictions placed on the data: CC0
- Recommended citation for this dataset: van der Heyde, Mieke et al. (2024). Data from: Rapid detection of subterranean fauna from passive sampling of groundwater eDNA [Dataset]. Dryad. https://doi.org/10.5061/dryad.cfxpnvxb2
DATA & FILE OVERVIEW
File List:
A) Passive_COI.fasta
B) Passive_18S.fasta
C) Passive_COI_otus.fasta
D) Passive_18S_zotus.fasta
E) Passive_COI_taxonomy_OtuTable.xlsx
F) Passive_18S_taxonomy_zotuTable.xlsx
G) Passive_sample_metadata_dryad.xlsx
H) README.md
Description of the data and file structure
The two gene regions sequenced are 18S and COI. The files "Passive_COI.fasta" and "Passive_18S.fasta" contain the demultiplexed sequences processed using eDNAFlow (https://github.com/mahsa-mousavi/eDNAFlow).
The file "Passive_18S_zotus.fasta" contains the zero-radius operational taxonomic unit (ZOTU) sequences that result from denoising the 18S metabarcoding reads. Similarly, "Passive_COI_otus.fasta" contains the OTU sequences for the COI assay; these were clustered at 97% sequence similarity rather than denoised because the COI region is more variable than 18S.
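To inspect the ZOTU and OTU FASTA files without extra dependencies, a minimal reader can be sketched as follows. It assumes the standard FASTA layout (a `>` header line followed by one or more sequence lines); the record names and sequences in the example are made up and do not come from the dataset.

```python
from typing import Dict, Iterable


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {record_id: sequence}."""
    records: Dict[str, str] = {}
    current_id = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record ID.
            tokens = line[1:].split()
            current_id = tokens[0] if tokens else ""
            records[current_id] = ""
        elif current_id is not None:
            # Sequence lines may wrap, so append to the current record.
            records[current_id] += line.upper()
    return records


# Made-up example records, not taken from the dataset.
example = """>Zotu1
GCCAGTAGTC
ATATGC
>Zotu2;size=12
ttgtct
""".splitlines()

zotus = read_fasta(example)
lengths = {name: len(seq) for name, seq in zotus.items()}
```

With a real file, the same function can be called on an open file handle, for example `read_fasta(open("Passive_18S_zotus.fasta"))`.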
The "Passive_COI_taxonomy_OtuTable.xlsx" and the "Passive_18S_taxonomy_zotuTable.xlsx" contain the OTU tables and assigned taxonomy for the respective metabarcoding assays. The first nine columns contain information about the OTUs including the OTU ID and taxonomic information separated by Kingdom, Phylum, Order, Family, and Taxon. Taxon indicates the lowest taxonomic identification assigned, which could be species, genus, family, or higher order level.
- Niche was assigned for the metazoan taxa, distinguishing SY (stygofauna), PoSY (potential stygofauna), TF (troglofauna), PoTF (potential troglofauna), and Not (not subterranean).
- The "stygofauna" column identifies whether the taxon is stygofauna ("stygo"), troglofauna ("trog"), or not subterranean at all ("Not").
- The "subfauna" column separates subterranean and potential subterranean fauna ("subfauna") from non-subterranean taxa ("Not").
- The remaining columns each correspond to one sample (named in the column header) and give the number of sequence reads of each OTU detected in that sample.
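As an illustration of how this layout can be used, the sketch below sums reads per sample for each niche code. It works on plain dictionaries (one per OTU row) so that it runs without a spreadsheet library. The header names in `INFO_COLUMNS`, the sample names, and the read counts are all assumptions made for the example and should be checked against the actual files.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical header names for the nine descriptive columns;
# check them against the real spreadsheet before use.
INFO_COLUMNS = ("OTU_ID", "Kingdom", "Phylum", "Order", "Family", "Taxon",
                "Niche", "stygofauna", "subfauna")


def reads_by_niche(rows: List[dict]) -> Dict[str, Dict[str, int]]:
    """Sum reads per sample column for each value of the Niche column."""
    totals = defaultdict(lambda: defaultdict(int))
    for row in rows:
        for column, value in row.items():
            if column in INFO_COLUMNS:
                continue  # skip the OTU description columns
            totals[row["Niche"]][column] += int(value)
    return {niche: dict(samples) for niche, samples in totals.items()}


def make_row(otu_id: str, niche: str, **reads: int) -> dict:
    """Build a made-up table row; unspecified description fields are 'NA'."""
    row = dict.fromkeys(INFO_COLUMNS, "NA")
    row.update(OTU_ID=otu_id, Niche=niche)
    row.update(reads)
    return row


# Made-up rows: sample names and read counts are illustrative only.
rows = [
    make_row("OTU_1", "SY", bore1_PS10=10, bore1_PS24=0),
    make_row("OTU_2", "SY", bore1_PS10=5, bore1_PS24=3),
    make_row("OTU_3", "Not", bore1_PS10=0, bore1_PS24=7),
]
niche_totals = reads_by_niche(rows)
```

Keeping the description columns in a single tuple means only one line needs to change if the real headers differ.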
The "Passive_sample_metadata_dryad.xlsx" file contains the metadata for the samples, including the date they were collected (Date_collected), the date they were filtered or processed (Date_filtered), the location (Barrow Island or Bungaroo Creek), the site (bore identification), the volume of sample filtered (mL), the longitude and latitude in decimal degrees, and the sample type.
The Sample type was coded as follows:
- GW-filtered groundwater sample
- PS10-passive sample submerged for 10 minutes
- PS24-passive sample submerged for 24 hours
- CONTROL-field controls collected during sampling
- LAB CONTROL-extraction controls from the lab
- GW DEEP-deeper water samples that were also filtered (not included in further analysis)
- Missing data are coded as NA
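The coding scheme above can be written down as a small lookup, sketched here with plain dictionaries. The column name `Sample_type`, the other field names, and the example records are hypothetical. The sketch converts "NA" to `None` and drops the "GW DEEP" samples, which the list above notes were not included in further analysis.

```python
from typing import Dict, List, Optional

SAMPLE_TYPES: Dict[str, str] = {
    "GW": "filtered groundwater sample",
    "PS10": "passive sample submerged for 10 minutes",
    "PS24": "passive sample submerged for 24 hours",
    "CONTROL": "field control collected during sampling",
    "LAB CONTROL": "extraction control from the lab",
    "GW DEEP": "deeper filtered water sample (not included in further analysis)",
}


def prepare(records: List[dict], type_column: str = "Sample_type") -> List[dict]:
    """Recode 'NA' as None, drop GW DEEP rows, and attach a description."""
    prepared = []
    for record in records:
        cleaned: Dict[str, Optional[object]] = {
            key: (None if value == "NA" else value)
            for key, value in record.items()
        }
        code = cleaned[type_column]
        if code == "GW DEEP":
            continue  # excluded from further analysis
        cleaned["type_description"] = SAMPLE_TYPES.get(code, "unknown code")
        prepared.append(cleaned)
    return prepared


# Made-up records; field names and values are illustrative only.
records = [
    {"Sample": "s1", "Sample_type": "PS10", "Volume_mL": "NA"},
    {"Sample": "s2", "Sample_type": "GW DEEP", "Volume_mL": 1000},
    {"Sample": "s3", "Sample_type": "GW", "Volume_mL": 1000},
]
usable = prepare(records)
```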
Methods
Study sites
The Pilbara is an ancient, geomorphologically stable region in northwest Western Australia that dates back 3.5 billion years (Buick et al., 1995). The climate is arid subtropical with hot, humid summers, and mild, dry winters. Rainfall is low and highly variable, most prevalent between November and April (summer wet season), and dependent on the passage of summer monsoon systems including cyclones.
Our study was conducted at two sites. The first, Barrow Island, is a limestone island located approximately 60 km off the mainland Pilbara coast of Western Australia (Moro and Lagdon, 2013). The primarily karstic geology of the island provides a habitat for a well-documented and diverse subterranean fauna assemblage (Humphreys et al., 2013). The groundwater on Barrow Island comprises an anchialine aquifer (Humphreys, 2001b); that is, physico-chemically stratified waters where a freshwater lens that originates from seasonal rainfall overlies seawater with a transitional zone in between. The brackish transitional zone, from marine to freshwater, is of comparable thickness to the freshwater layer, having been expanded due to tidal forces (Humphreys, 2001b; Pohlman, 2011). Access to the groundwater and in situ stygofauna sampling is typically conducted via bores that have been installed for specific operational or monitoring purposes. The bores are typically constructed of 50 or 100 mm diameter polyvinyl chloride (PVC) casing that is slotted at discrete intervals depending on their purpose. Slots vary from 0.5 to 5 mm in width. Bores with larger slot sizes can have mesh across them to prevent debris from accumulating inside the bore. The depth of bores varies depending on their location on the island (bores at higher elevations are deeper in order to reach the water table) and the initial purpose of the bore.
The second study site, Bungaroo Creek, is located approximately 162 km southeast of Barrow Island on mainland Australia and is an ephemeral tributary of the Robe River in the Pilbara. The groundwater associated with Bungaroo Creek, and the wider Robe River, is fresh with highly transmissive alluvial aquifer geology (Clark et al., 2021). The interstices of this geology form a habitable space for a significant subterranean faunal community in the area. Access to the groundwater, and stygofaunal communities, typically occurs through boreholes drilled for the purpose of water abstraction, groundwater level monitoring, physicochemical monitoring, and mineral exploration. Monitoring or water abstraction bores are typically cased using PVC with 1 mm slots to allow water flow, while mineral exploration bores are generally uncased below ground level.
Sample Collection and Processing
Stygofaunal specimens and eDNA samples were collected from four bores in Bungaroo Creek (September 2021) and four bores on Barrow Island (November 2021). Selected bores were known to be high in stygofauna diversity (taxon richness and abundance) (Humphreys et al., 2013). Whole specimens used to assess the baseline for morphological species diversity were collected using haul nets (similar to plankton nets – see Saccò et al., 2022b), which were 45 mm or 95 mm in diameter, depending on the internal diameter of the PVC casing. The haul-net sampling method incorporated six net hauls per bore, consistent with the Western Australian Environmental Protection Authority technical guidance (EPA, 2021). Haul net samples were sorted and identified using a dissecting microscope, and specimens of each morphotype were stored in glass vials in 100% ethanol and kept frozen at -20°C. Stygofaunal experts (authors NS and MTG) provided the morphological identifications.
eDNA Sample Collection
Water samples for active filtration were collected for eDNA analysis after hauling whole animal specimen samples used for morphological identification and reference DNA barcoding. Four 1-L water samples were collected from the upper 2 m of the water column using a sterile, 1 L plastic disposable bailer (groundwater samples-AFGW). Passive samples consisted of frames made of garden netting holding five 0.45 µm Supor polyethersulfone membranes 47 mm in diameter (Pall Corporation, Port Washington, USA). These were left in the bore, within the top 2 m, for either 10 min (PS10) or 24 h (PS24). All equipment, including haul nets, was sterilized between bores in 10% bleach for 10 min and rinsed in reverse osmosis (RO) water, and disposable gloves were used at each borehole to prevent contamination. Fresh bleach solution and RO water were used at each borehole for equipment that was not decontaminated in the lab ahead of time. Samples of the RO water used for rinsing equipment (i.e. field controls) were collected to assess sources of contamination.
eDNA and haul net samples were kept on ice until they could be transferred to the refrigerator for storage at 4°C at the end of the collecting day. Active filtered groundwater (AFGW) samples were filtered within 24 h of collection across 0.45 µm Supor polyethersulfone membranes (47 mm) using a peristaltic Pall Sentino Microbiology pump (Pall Corporation, Port Washington, USA). For samples with high turbidity, up to two membranes were used to increase the volume of water filtered. Filter membranes from the passive samples were removed from the sampling frames into individual sample bags. The samples were immediately frozen and stored at -20°C prior to, and on-ice during their transportation to, the Trace & Environmental DNA (TrEnD) Laboratory at Curtin University in Perth, Western Australia. All further processing was performed at this facility.
eDNA Laboratory Processing
DNA was extracted from filter membranes within one month of collection and filtering using a DNeasy Blood and Tissue Kit (Qiagen) with the following modifications: 540 µL of ATL lysis buffer and 60 µL of Proteinase K were used during the cell lysis phase, and samples were digested overnight at 56°C. A total of seven DNA extraction controls, containing the solutions and plastics supplied in the extraction kit, were processed alongside all eDNA samples in order to detect any laboratory or cross-contamination of samples. All eDNA extractions (post-lysis stage) were performed using an automated DNA extraction platform (QIAcube; Qiagen) with a customised eDNA protocol that elutes the DNA from the silica membrane in 100 µL of elution buffer (10 mM Tris-Cl, 0.5 mM EDTA; pH 9.0).
DNA extracts were screened for quality and quantity of DNA using quantitative PCR (qPCR) with a neat and 1/10 dilution (18S) or a neat and 1/5 dilution (COI) (Table 1) to determine the presence of inhibitors and the quantity of target template molecules present in each DNA extract (Murray et al., 2015). Each qPCR reaction for the 18S universal assay and the COI invertebrate assay was carried out with the same master mix with a total volume of 13 µL containing: 1X AmpliTaq Gold® PCR buffer (Life Technologies, Massachusetts, USA), 2 mmol/L MgCl2, 0.25 mM dNTPs, 0.4 µmol/L each of forward and reverse primers (Integrated DNA Technologies, Australia), 0.4 mg/mL BSA (Fisher Biotec, Australia), 0.3 µL of 5X SYBR® Green (Life Technologies), 1 U AmpliTaq Gold® DNA Polymerase (Life Technologies), and 1 µL (18S) or 3 µL (COI) of genomic DNA template. The cycling conditions were initial denaturation at 95°C for 5 min, followed by 50 cycles of 95°C for 30 s, annealing at 51-52°C (depending on the assay) for 30 s, and 72°C for 45 s; a melt curve stage of 95°C for 15 s, 60°C for 1 min, and 95°C for 15 s; and a final extension at 72°C for 10 min. All qPCR plates included a negative control and a positive control.
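The logic of the neat-versus-diluted screen can be illustrated with a short calculation. With perfect amplification efficiency the template doubles each cycle, so a 1/10 dilution should delay the cycle threshold by log2(10) ≈ 3.32 cycles and a 1/5 dilution by log2(5) ≈ 2.32 cycles; a markedly smaller shift suggests that the neat extract is inhibited. The sketch below is a hedged illustration only: the function names, the `tolerance` of one cycle, and the example Ct values are assumptions, not the criteria used in this study.

```python
import math


def expected_ct_shift(dilution_factor: float, efficiency: float = 1.0) -> float:
    """Cycles of delay expected after diluting the template.

    efficiency = 1.0 corresponds to perfect doubling in every cycle.
    """
    return math.log(dilution_factor) / math.log(1.0 + efficiency)


def looks_inhibited(ct_neat: float, ct_diluted: float,
                    dilution_factor: float, tolerance: float = 1.0) -> bool:
    """Flag a sample whose Ct shift is well below the expected shift.

    The tolerance (in cycles) is an arbitrary illustrative threshold.
    """
    observed = ct_diluted - ct_neat
    return observed < expected_ct_shift(dilution_factor) - tolerance


shift_18s = expected_ct_shift(10)  # 1/10 dilution used for 18S
shift_coi = expected_ct_shift(5)   # 1/5 dilution used for COI

# Made-up Ct values for illustration.
flag_a = looks_inhibited(ct_neat=30.0, ct_diluted=30.5, dilution_factor=10)
flag_b = looks_inhibited(ct_neat=28.0, ct_diluted=31.3, dilution_factor=10)
```

Here `flag_a` is `True` because the diluted reaction is delayed by only 0.5 cycles, while `flag_b` is `False` because the 3.3-cycle delay matches the expectation for an uninhibited extract.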
For metabarcoding, each sample, including all controls, was assigned a unique combination of multiplex identifier (MID) tags for both assays. These MID tags were incorporated into fusion-tagged primers, and none of the primer-MID tag combinations had been used previously in the laboratory, to prevent cross-contamination. Fusion PCRs were done in duplicate (18S) or quadruplicate (COI) to minimize PCR stochasticity. Additional PCR replication was employed for the COI assay because of the higher cycle threshold (CT; the cycle number at which amplification is detectable) values, which indicate a low initial copy number of eDNA for that assay. The PCR mixes were prepared in a dedicated ultra-clean room before DNA was added. The PCRs were performed with the same conditions as the standard qPCRs described above, except the total volume of PCR was approximately doubled to 25 µL and the DNA template was increased to 2-10 µL depending on sample performance in qPCR. Samples were then pooled into approximately equimolar concentrations to produce a PCR amplicon library that was size-selected to remove any primer-dimer that may have accumulated during fusion PCR. Size selection was performed (250-600 bp for 18S, 200-500 bp for COI) using a PippinPrep 2% ethidium bromide cassette (Sage Science, Beverly, MA, U.S.A). Libraries were cleaned using a QIAquick PCR Purification Kit (Qiagen, Germany) and quantified using Qubit Fluorometric Quantitation (Thermo Fisher Scientific). Sequencing was performed on the Illumina MiSeq platform as per manufacturer's instructions using 500-cycle paired-end V2 for the 18S assay, and 300-cycle single-end V2 for the COI assay.
Table 1: PCR Primers used for amplification and sequencing of groundwater eDNA samples
| PCR assay | Target taxa | Primer name | Oligonucleotide sequence (5'-3') | Target length (bp) | Annealing temperature (°C) | Reference |
|---|---|---|---|---|---|---|
| 18S | Eukaryotes | 18S_1F | GCCAGTAGTCATATGCTTGTCT | ~340–420 | 52 | (Pochon et al., 2013) |
| | | 18S_400R | GCCTGCTGCCTTCCTT | | | |
| COI | Invertebrates | fwhF2 | GGDACWGGWTGAACWGTWTAYCCHCC | ~205 | 51 | (Vamos et al., 2017) |
| | | fwh2Rn | GTRATWGCHCCDGCTARWACWGG | | | |
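The COI primers in Table 1 contain IUPAC ambiguity codes (for example D = A/G/T and W = A/T), so each primer stands for a pool of distinct oligonucleotides. The sketch below uses the standard IUPAC nucleotide codes to count how many sequences each primer represents and to build a regular expression that matches them. The helper names are hypothetical, and the sketch matches the primer only as written (it does not reverse-complement).

```python
import re

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences copied from Table 1.
PRIMERS = {
    "18S_1F": "GCCAGTAGTCATATGCTTGTCT",
    "18S_400R": "GCCTGCTGCCTTCCTT",
    "fwhF2": "GGDACWGGWTGAACWGTWTAYCCHCC",
    "fwh2Rn": "GTRATWGCHCCDGCTARWACWGG",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= len(IUPAC[base])
    return count


def primer_regex(primer: str) -> str:
    """Translate a degenerate primer into a regular expression."""
    parts = []
    for base in primer:
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return "".join(parts)


degeneracies = {name: degeneracy(seq) for name, seq in PRIMERS.items()}
fwhf2_pattern = re.compile(primer_regex(PRIMERS["fwhF2"]))
```

Both 18S primers are fully specified (degeneracy 1), whereas each COI primer expands to 288 sequences.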
Bioinformatics
DNA sequences were processed using eDNAFlow, an automated bioinformatics workflow designed for the analysis of eDNA metabarcoding data (Mousavi-Derazmahalleh et al., 2021). This pipeline uses AdapterRemoval (Schubert et al., 2016) and FastQC (Andrews, 2010) for quality filtering and ‘obitools’ (Boyer et al., 2016) for demultiplexing. The minimum Phred quality score was set at 20, with a minimum alignment of 12 for the paired-end reads (18S only), a minimum length of 100 bp, and a maximum of 2 primer mismatches allowed, but no mismatches allowed in the MID-tag sequences. The 18S demultiplexed sequences were then denoised using USEARCH unoise3 (Edgar, 2016) to create zero-radius operational taxonomic units (ZOTUs) with a minimum abundance of 8 reads, which is the recommended default for the unoise3 command. Because the COI region is more variable than 18S, COI sequences were instead clustered using USEARCH cluster_otus into OTUs with 97% sequence similarity. The OTU sequences were then queried against a custom barcode reference library for stygofauna (Guzik et al. in prep), specimens barcoded for this project (see Supplementary Information), and GenBank (NCBI) using blastn with the following parameters (-outfmt "5" -perc_identity 95 -qcov_hsp_perc 95 -max_target_seqs 10). For the initial screening of the COI assay, the search parameters were relaxed (-perc_identity 85) to maximise the number of taxonomic identifications for OTUs in this highly variable region. The final taxonomic assignment for both assays was performed using a simple lowest common ancestor algorithm in MEGAN (Huson et al., 2007) with a minimum score of 450 for 18S and 275 for COI, based on the lengths of the amplicons. Niches (i.e. stygofaunal, troglofaunal, non-subterranean taxon groups) were assigned to all metazoan taxa identified from OTUs after van der Heyde et al. (2023). This allowed us to parse stygofaunal and troglofaunal taxa from non-subterranean taxa.
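The idea behind a lowest common ancestor (LCA) assignment can be sketched in a few lines: discard hits below the minimum score, then keep only the part of the lineage that all remaining hits share. This is a simplified illustration, not MEGAN's implementation, which has further parameters not described here. The function names, the hit format, and the example lineages (including the placeholder family and species names) are assumptions; only the score thresholds of 450 and 275 come from the text above.

```python
from typing import List, Sequence, Tuple

Hit = Tuple[float, Sequence[str]]  # (score, lineage from root to tip)


def lowest_common_ancestor(lineages: List[Sequence[str]]) -> List[str]:
    """Return the longest lineage prefix shared by all lineages."""
    if not lineages:
        return []
    shared = []
    for names in zip(*lineages):
        if len(set(names)) != 1:
            break  # lineages diverge at this rank
        shared.append(names[0])
    return shared


def assign_taxonomy(hits: List[Hit], min_score: float) -> List[str]:
    """Apply the LCA to the hits that reach the minimum score."""
    kept = [lineage for score, lineage in hits if score >= min_score]
    return lowest_common_ancestor(kept)


# Made-up hits; family and species labels are placeholders.
hits = [
    (500.0, ["Metazoa", "Arthropoda", "Amphipoda", "family_A", "sp_1"]),
    (480.0, ["Metazoa", "Arthropoda", "Amphipoda", "family_B", "sp_2"]),
    (300.0, ["Metazoa", "Annelida", "order_X", "family_C", "sp_3"]),
]

strict = assign_taxonomy(hits, min_score=450)   # 18S threshold from the text
relaxed = assign_taxonomy(hits, min_score=275)  # COI threshold from the text
```

With the stricter threshold the low-scoring hit is ignored and the OTU resolves to order level; with the relaxed threshold the conflicting hit pulls the assignment back to kingdom level.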