Data from: eDNA metabarcoding of log hollow sediments and soils highlights the importance of substrate type, frequency of sampling and animal size, for vertebrate species detection

Cite this dataset

Ryan, Ethan et al. (2022). Data from: eDNA metabarcoding of log hollow sediments and soils highlights the importance of substrate type, frequency of sampling and animal size, for vertebrate species detection [Dataset]. Dryad.


Fauna monitoring often relies on visual monitoring techniques such as camera trapping, which have biases leading to underestimates of vertebrate species diversity. Environmental DNA (eDNA) has emerged as a new source of biodiversity data that may improve biomonitoring; however, eDNA-based assessments of species richness remain relatively untested in terrestrial environments. We investigated the suitability of fallen log hollow sediment as a source of vertebrate eDNA, across two sites in south-western Australia - one with a Mediterranean climate and the other semi-arid. We compared two different approaches (camera trapping and eDNA metabarcoding) for monitoring of vertebrate species, and investigated the effect of other factors (frequency of species, timing of visits, frequency of sampling, body size) on vertebrate species detectability. Metabarcoding of hollow sediment detected higher species richness (29 taxa: six birds, three reptiles and 20 mammals) than metabarcoding of soil at the entrance of the hollow (13 taxa: three birds, two reptiles and eight mammals). We detected 31 taxa in total with eDNA metabarcoding and 47 with camera traps, with 14 taxa detected by both (12 mammals and two birds). By comparing camera trap data with eDNA read abundance, we were able to detect vertebrates through eDNA metabarcoding that had visited the area up to two months prior to sample collection. Larger animals were more likely to be detected, and so were vertebrates that were identified multiple times in the camera traps. These findings demonstrate the importance of substrate selection, frequency of sampling, and animal size, on eDNA-based monitoring. Future eDNA experimental design should consider all these factors as they affect detection of target taxa.



2.1 Study sites and camera trapping

Our study was conducted at two sites in south-western Australia (Figure 1), between October 2019 (end of Spring) and February 2020 (end of Summer). The Dryandra Woodland (DRY) is a large remnant of open, temperate eucalypt woodlands containing a variety of threatened fauna species including woylie (Bettongia penicillata) (Garkaklis, 2001) and numbat (Myrmecobius fasciatus) (Friend, 2005). This area has a Mediterranean climate; temperatures have a mean low of 8.5°C, mean high of 23.8°C and annual mean rainfall of 508.1 mm. Our sampling location within DRY was focused on the largest unfragmented area of the Lol Gray forest (-32.76737, 116.95231). The area is characterised by open woodlands of Eucalyptus wandoo, E. accedens and E. calophylla (Burrows et al., 1987) with minimal understory and high concentrations of leaf litter. The remnant blocks within DRY are a major hub of diverse fauna due to their transitional location between the hydric coast and more mesic west and south-west (Dryandra Woodland Management Plan No. 70, 2011). The second site was 600 km further east in the Great Western Woodland (GWW), a eucalypt woodland with a semi-arid climate and notable for being the world’s largest remaining temperate woodland. Our sites were characterised by a eucalypt overstorey including E. salubris, E. celastroides and E. calycogona over mixed shrubs, herbs and grasses (NOVA Nickel Project - EPA Referral Supporting Document, 2014). Temperatures have a mean low of 10°C, a mean high of 25.2°C and an annual mean rainfall of 293.7 mm. The GWW study site was located within undisturbed habitat (-31.84886, 123.17553). The nearest site of human activity was a mine site 4 km from the edge of the study area. Species lists were generated from Atlas of Living Australia database records for DRY, and a report of a previously conducted survey from the GWW mine.
Birds had the highest species richness of vertebrates across both sites (77 in GWW, 165 in DRY) followed by reptiles (GWW 40, DRY 28) and mammals (GWW 19, DRY 21). However, many of these species do not enter logs (McElhinny et al., 2006) and are not present all year round at the study sites (Wilson & Recher, 2001).

2.2 Sample collection

At each site, 20 logs were selected based on suitability criteria for fauna use and camera placement, including one major entrance/exit point, a mostly shaded cavity and a diameter large enough to reliably collect samples (minimum size of 15 cm). Logs were a minimum of 4 m apart, and at each study site, all logs were sampled within an area of 500 x 500 m. To prevent contamination of samples, sterile gloves were worn during sampling and discarded after each sample was collected. For each log, a 50 ml pre-sterilized collection tube was filled with hollow sediment, from the entrance and within the log hollow. Hollow sediment contained some soil transported by wind into the log, but mostly consisted of old, decayed, heartwood material, broken down by fungal and bacterial action. Hollow formation typically begins when the trees are still standing, with heartwood draining into the trunk of the tree after external damage (Gibbons et al. 2000), and continues after tree collapse. As such, the sediment mostly consists of decayed cellulose, hemicellulose and lignin (Stokland et al. 2012). Hollow sediment samples were obtained by subsampling scrapings at five points along the entrance and floor of the hollow and then combining them, with depth ranging from several millimetres to 2 cm depending on the amount of sediment in the hollow. Soil samples were obtained by randomly subsampling from five points (depth 2 cm) within a 50 cm² area directly outside the hollow’s entrance. These soil subsamples were then combined to make one single soil sample. Samples were collected from each log at two time points: December 2019 and January 2020 for GWW, and January 2020 and March 2020 for DRY. In total there were 160 samples (soil n=80, hollow sediment n=80) from GWW (n=80) and DRY (n=80).

2.3 Sample processing and DNA extraction

Samples were pre-mixed in their 50 ml collection tubes using a Qiagen TissueLyser II (Qiagen, Germany) for 1 minute. These samples were then subsampled to 300 mg of either soil or hollow sediment, avoiding large rocks and woody detritus, for extraction. Samples were extracted using a DNeasy PowerLyzer PowerSoil Kit (Qiagen, Germany) and 100 µl elution on an automated QIAcube (Qiagen, Germany). An extraction control was added once per QIAcube run (every twelfth sample) using extraction reagents only (n = 15).

2.4 DNA amplification and sequencing

Two primer sets targeting short amplicons were selected due to the degraded nature of eDNA (Ficetola et al., 2010), designed for mammals and all vertebrates, respectively. The mammal-specific primers 16Smam1/2 targeted the mitochondrial 16S ribosomal gene (F: 5’-CGGTTGGGGTGACCTCGGA-3’; R: 5’-GCTGTTATCCCTAGGGTAACT-3’; ~130bp, Taylor, 1996), while the vertebrate primers 12Sv5-F/R targeted the mitochondrial 12S gene (F: 5’-TAGAACAGGCTCCTCTAG-3’; R: 5’-TTAGATACCCCACTATGC-3’; ~98bp, Riaz et al., 2011). Quantitative PCR amplification was carried out with neat extracts and dilutions of 1/10 and 1/100 to test for PCR inhibition (Murray et al. 2015) using a StepOne Plus (Applied BioSystems). No inhibition was detected in any sample and so neat samples were used for fusion tagging. Positive (Bettongia penicillata) and negative controls (both PCR and extraction) were included on each PCR plate.

The PCR mix for amplification was made up to 25 μl and contained 2.5 mM MgCl2 (Applied Biosystems), 1× PCR Gold buffer (Applied Biosystems), 0.25 mM dNTPs (Astral Scientific, Australia), 0.4 mg/ml bovine serum albumin (Fisher Biotec, Australia), 0.4 μmol/L forward and reverse primer, 1 U AmpliTaq Gold DNA polymerase (Applied Biosystems) and 0.6 μl of a 1:10,000 solution of SYBR Green dye (Life Technologies, USA). Quantitative PCRs were run on a StepOne Plus (Applied BioSystems) real-time qPCR instrument. Cycling conditions for 16S were 10 minutes at 95°C, and 55 cycles of 95°C for 12 seconds, annealing at 59°C for 30 seconds and 70°C for 25 seconds, ending with a 10 minute elongation at 72°C. Cycling conditions for 12Sv5 were 10 minutes at 95°C, and 55 cycles of 94°C for 30 seconds, annealing at 51°C for 30 seconds and 51°C for 1 minute, ending with a 10 minute elongation at 72°C. All PCR mixes were prepared in a dedicated clean room to minimise contamination, with samples added in a separate laboratory in specialised UV cabinets. Samples were assigned a unique combination of fusion tag primers that contained a unique multiplex identifier (MID) tag of 6-9bp, the gene-specific primer and Illumina’s sequencing adaptors, producing an average fragment length of 218bp. Fusion tagged reactions, using the same cycling conditions as the qPCR, were carried out in triplicate for each sample and included extraction controls; a PCR negative control was also included on each plate. We did not fusion tag positive controls, as we used DNA from a species we expected to find at our sites and did not want to introduce the possibility of cross-contamination. Triplicate reactions were used to reduce the effects of PCR stochasticity (Murray et al. 2015), and the same fusion tag combination was used between each replicate per sample. A single step fusion protocol was used with no reuse of index combinations to minimize contamination with any previous runs.
Samples were then pooled into approximately equimolar concentrations to create a PCR amplicon library that was size-selected to 160-300bp, using a PippinPrep 2% ethidium bromide cassette (Sage Science, Beverly, MA, U.S.A). The library was then quantified using Qubit Fluorometric Quantitation (Thermo Fisher Scientific) and sequenced as per Illumina MiSeq platform sequencing protocols for single-end sequencing with a 300 cycle V2 reagent kit with a standard V2 flow cell.

2.5 Sequence filtering and taxonomic assignment

Raw sequence reads were quality filtered and demultiplexed (filtered to minimum length 50bp with erroneous barcodes removed, and primers removed) using OBITools (Boyer et al., 2016). Sequences were concatenated, denoised and assigned to a zero-radius Operational Taxonomic Unit (ZOTU) table using Usearch V5.0 (Edgar et al., 2011). ZOTUs were then curated using LULU (on default settings), which identifies erroneous sequences by taking sequence similarity and co-occurrence patterns into account (Frøslev et al., 2017). Taxonomy was assigned by matching ZOTU sequences to a reference database using Basic Local Alignment Search Tool (BLASTn) on a high-performance cluster computer (Pawsey Supercomputing Centre Perth, WA, Australia) against the online reference database GenBank, with a minimum of 95% query coverage and 90% identity. Final taxonomic identifications were assigned using a lowest common ancestor (LCA) algorithm with a minimum of 95% identity (Mousavi-Derazmahalleh et al., 2021). When the absolute value for the difference between % identity of ZOTUs was < 0.5, species level taxonomy was not returned and the ZOTU was dropped to the closest common ancestor.
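
The LCA step described above can be illustrated with a minimal Python sketch. This is not the LCA script used in the study (that is the one cited and supplied with the dataset); it assumes each BLAST hit has already been reduced to a percent identity and a root-to-tip lineage, and the function names, the hit format and the example hits are hypothetical.

```python
from typing import List, Optional, Tuple

# A hit is (percent identity, lineage ordered from broadest to most specific rank).
Hit = Tuple[float, List[str]]


def lowest_common_ancestor(lineages: List[List[str]]) -> List[str]:
    """Return the leading ranks shared by every lineage."""
    shared: List[str] = []
    for ranks in zip(*lineages):
        if all(rank == ranks[0] for rank in ranks):
            shared.append(ranks[0])
        else:
            break
    return shared


def assign_taxonomy(hits: List[Hit],
                    min_identity: float = 95.0,
                    tie_window: float = 0.5) -> Optional[str]:
    """Assign the deepest taxon shared by all near-best hits.

    Assumption: hits within `tie_window` percentage points of the best hit
    are treated as indistinguishable, so the assignment drops to their
    closest common ancestor.
    """
    kept = [hit for hit in hits if hit[0] >= min_identity]
    if not kept:
        return None
    best = max(identity for identity, _ in kept)
    near_best = [lineage for identity, lineage in kept
                 if best - identity < tie_window]
    shared = lowest_common_ancestor(near_best)
    return shared[-1] if shared else None


# Illustrative hits only, not data from the study.
example_hits: List[Hit] = [
    (99.2, ["Mammalia", "Diprotodontia", "Potoroidae", "Bettongia", "Bettongia penicillata"]),
    (99.0, ["Mammalia", "Diprotodontia", "Potoroidae", "Bettongia", "Bettongia lesueur"]),
    (96.0, ["Mammalia", "Diprotodontia", "Phalangeridae", "Trichosurus", "Trichosurus vulpecula"]),
]
print(assign_taxonomy(example_hits))  # prints "Bettongia"
```

In this example the two best hits differ by less than 0.5 percentage points, so the sketch returns the genus rather than either species, while the third hit is too far from the best hit to influence the result.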

The results of the LCA script analysis were compared against existing species diversity data for the sites to ensure detected species were accurate. Several species were detected in eDNA that were not recorded as present in that area; however, if they had a single sister species that was recorded as present, the ZOTU was reattributed. ZOTUs refined to genus-level that only had a single species of said genus present at the site were similarly marked as that species. Taxa adjusted in this way included Dasyurus, Bettongia, Trichosurus, Antechinus, Isoodon, and Phascogale. When there were multiple potential reassignments, the ZOTU was left at the closest taxonomic level and labelled “sp.” One ZOTU was reassigned beyond species level (Zaglossus bruijni to Tachyglossus aculeatus), as it was the only monotreme native to this region.

2.6 Camera traps

Each log was monitored with a wireless Reconyx HyperFire or HyperFire 2 IR Camera (Reconyx, USA) set to high sensitivity, 5-image rapid-fire starting in November 2019 in Dryandra and October 2019 in GWW (roughly two months before the first associated eDNA collection date). Cameras were mounted on nearby logs and tree trunks or on portable stakes and focused on the likely point of entry into the associated hollow, and the area in front of it sampled for soil. The variable landscape meant that the distance between hollow opening and camera varied from <1 m to ~10 m, altering the amount of the surrounding woodland in view. On later visits, cameras were sometimes adjusted if there was an abundance of non-fauna photos due to unwanted movement in the field of view. Camera trap data was processed by logging all species detected with the associated date. The longest time span between the last camera trap detection and the sample collection date was 65 days but the average number of days was 27.2 ± 0.92 SE (n=434). Because vertebrates were detected in a wide variety of circumstances, a numerical scale was used to describe the detected behaviour in a way that likely corresponded to maximum eDNA dispersal. A 0 indicated the animal was seen on camera but not near the hollow entrance; a 1 indicated the animal moved to an area where sheddings may be deposited outside the hollow, but with no physical contact with the soil (e.g. perched on the top of the hollow entrance); a 2 indicated the animal moved across or onto the soil directly in front of the hollow but did not enter/exit the hollow; and a 3 indicated that the animal entered or exited the hollow. Camera trap data was collected at the same time as the eDNA samples, providing two time frames containing roughly two months of data. SD cards and batteries were replaced during the first collection date.

2.7 Statistical Analysis

All statistics were performed using R 4.0.2 (R Core Team, 2020). Samples with low sequencing depth (<642 reads) were removed as this was the minimum sampling depth to reach an approximate asymptote of ZOTU richness (see Appendix S1). False detections (eDNA detections of species that do not occur in the study regions) occurred at low read abundances, and by removing taxa from samples where they made up less than 0.06% of the reads of that sample, these false detections were removed. Sequences present in the extraction controls, taxa not known to inhabit terrestrial Australia and taxa not recorded on the site that are common contaminants during PCR (such as ungulates and human sequences) were removed from the dataset using the ‘phyloseq’ package 1.32.0 (McMurdie & Holmes, 2013). We calculated ZOTU richness for each substrate (soil and hollow sediment) and tested the differences between substrates using a generalized linear model with Poisson distribution for each of our assays (16S and 12Sv5) and sites (GWW and Dryandra). We calculated accumulation curves of the number of ZOTUs as a function of the number of soil/hollow sediment samples, comparing sites and sample type using the ‘accumcomp’ function in the ‘BiodiversityR’ package 2.11-3 (Kindt & Coe, 2005). Plots were illustrated using ‘ggplot2’ 3.3.2 (Wickham, 2016). Correlation between eDNA and camera trap distance matrices of species presence/absence per site was tested by running a Mantel test (999 permutations) with a dataset of all vertebrates. An identical test was conducted on a dataset with all vertebrates but with mammals removed.
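
The two read-abundance filters described above (minimum sequencing depth of 642 reads, then removal of taxa below 0.06% of a sample's reads) can be sketched in a few lines of Python. The study itself applied these steps in R with the scripts supplied in the dataset; the function name, the nested-dictionary table layout, the order of the two steps and the example counts below are assumptions made for illustration.

```python
from typing import Dict

# Read table layout assumed here: {sample name: {taxon name: read count}}.
ReadTable = Dict[str, Dict[str, int]]


def filter_reads(table: ReadTable,
                 min_depth: int = 642,
                 min_fraction: float = 0.0006) -> ReadTable:
    """Drop shallow samples, then drop low-abundance taxa within each sample.

    min_depth    -- samples with fewer total reads than this are removed
    min_fraction -- taxa below this share of a sample's reads are removed
                    (0.0006 corresponds to 0.06%)
    """
    filtered: ReadTable = {}
    for sample, counts in table.items():
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # sample did not reach the minimum sequencing depth
        filtered[sample] = {
            taxon: reads
            for taxon, reads in counts.items()
            if reads / depth >= min_fraction
        }
    return filtered


# Hypothetical counts, not data from the study.
example: ReadTable = {
    "log01_hollow": {"taxon_A": 9985, "taxon_B": 10, "taxon_C": 5},
    "log02_soil": {"taxon_A": 300},
}
result = filter_reads(example)
print(result)  # {'log01_hollow': {'taxon_A': 9985, 'taxon_B': 10}}
```

Here `taxon_C` makes up 0.05% of the 10,000 reads in the first sample and is removed, and the second sample is discarded entirely because it has only 300 reads.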

We used a generalized linear model with a binomial distribution to test whether body mass (g), the number of interactions, or the length of time since the last interaction affected the eDNA detection of the animal. This model included only observations with camera trap detections and the binary eDNA detection outcome (1 = detection, 0 = no detection). Body mass was determined based on the mean mass of the species, rather than attempting to estimate masses of vertebrates from camera footage. The variables body mass (g) and the number of interactions were log transformed to normalize the distribution. Genus-level association between camera trap and eDNA data was used to allow the inclusion of Climacteris: C. rufus was seen hundreds of times across the majority of hollows but, because one other Climacteris species has historically been recorded at the GWW site, our Climacteris ZOTUs could not be conclusively linked to this species.
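
One way to write the detection model described above is given below. The text does not state the link function or whether the predictors were fitted jointly, so this form assumes the default logit link, natural logarithms and purely additive terms; the symbols are introduced here for illustration, with $y_i$ the eDNA detection outcome for observation $i$, $m_i$ the mean body mass of the species in grams, $n_i$ the number of camera trap interactions and $t_i$ the time since the last interaction.

```latex
y_i \sim \mathrm{Bernoulli}(p_i), \qquad
\operatorname{logit}(p_i) = \log\frac{p_i}{1 - p_i}
  = \beta_0 + \beta_1 \log(m_i) + \beta_2 \log(n_i) + \beta_3 t_i
```

Under this form, a positive $\beta_1$ or $\beta_2$ would correspond to the reported pattern that larger and more frequently recorded animals were more likely to be detected.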

Usage notes

All necessary data and script files are contained in the dataset, including fastq sequence reads, CSV files containing all collected data, and scripts to run the bioinformatics pipeline and the filtering of ZOTU read abundance. An in-depth list can be found in README.txt with explanations on use.