Accumulation of airborne, eukaryotic environmental DNA contamination: Raw sequencing data and demultiplexing info

Klepke, Martin1; Sigsgaard, Eva1; Jensen, Mads1; Olsen, Kent2; Thomsen, Philip1

Research facility: Aarhus University

Published Aug 26, 2022 on Dryad. https://doi.org/10.5061/dryad.bg79cnpcf

Abstract

Environmental DNA (eDNA) metabarcoding is increasingly being implemented as a non-invasive and efficient approach for biodiversity research and monitoring across ecosystems. However, accurate detection of species with eDNA requires robust experimental designs, as eDNA analysis carries a risk of contamination at every step of the fieldwork and laboratory process. Several studies focus on rigorous laboratory procedures and processing of sequencing data, but surprisingly little research investigates the process of background input of DNA in the field. For example, airborne DNA from localities outside the study area could potentially contaminate eDNA samples. Here, we use an experimental setup and eDNA metabarcoding to study the diversity and accumulation of airborne eukaryotic eDNA on exposed surfaces in the field. At two different natural locations, a coastal marine site, and a terrestrial grassland site, we placed open containers each filled with 0.5 litres of water, which was then sampled at eight successive time points after exposure to the surroundings. We found an accumulation of detected species richness in the samples, which reached its maximum at the end of the experiment, 24 hours after exposure. This result was consistent across both sites and across two markers (COI for eukaryotes and 12S for vertebrates). While many of the detected species were contaminants commonly found in eDNA studies, we also detected several other eukaryotic taxa. Most notable were metazoan species such as birds, fish and insects, likely originating from airborne transport of eDNA. We also found that increasing the number of PCR cycles tended to have a positive impact on richness for the unfiltered reads but a negative impact on the richness after bioinformatic filtering. Our results add to the sparse evidence that metazoan eDNA can be transported by air, which has wide implications for eDNA research and calls for increased implementation of field control samples.

We suggest placing the files in 14 separate folders before demultiplexing. The files should be arranged as follows:

Folder 1: Aarhus CO1 PCR replicate one

M1_MD5.txt (to check sums)
M1_FKDL202559049-1a_HJ77HDRXX_L1_1.fq.gz
M1_FKDL202559049-1a_HJ77HDRXX_L1_2.fq.gz
M1_tags.list

Folder 2: Aarhus CO1 PCR replicate two

M2_MD5.txt (to check sums)
M2_FKDL202559050-1a_HJ77HDRXX_L1_1.fq.gz
M2_FKDL202559050-1a_HJ77HDRXX_L1_2.fq.gz
M2_tags.list

Folder 3: Aarhus CO1 PCR replicate three

M3_MD5.txt (to check sums)
M3_FKDL202559051-1a_HJ77HDRXX_L1_1.fq.gz
M3_FKDL202559051-1a_HJ77HDRXX_L1_2.fq.gz
M3_tags.list

Folder 4: Aarhus CO1 PCR replicate four

M4_MD5.txt (to check sums)
M4_FKDL202559052-1a_HJ77HDRXX_L1_1.fq.gz
M4_FKDL202559052-1a_HJ77HDRXX_L1_2.fq.gz
M4_tags.list

Folder 5: Vertebrate 40 cycles PCR replicate one

M5_MD5.txt (to check sums)
M5_FKDL202564481-1a_H35WWDSXY_L1_1.fq.gz
M5_FKDL202564481-1a_H35WWDSXY_L1_2.fq.gz
M5_tags.list

Folder 6: Vertebrate 40 cycles PCR replicate two

M6_MD5.txt (to check sums)
M6_FKDL202564442-1a_H35WWDSXY_L1_1.fq.gz
M6_FKDL202564442-1a_H35WWDSXY_L1_2.fq.gz
M6_tags.list

Folder 7: Vertebrate 45 cycles PCR replicate one

M7_MD5.txt (to check sums)
M7_FKDL202564482-1a_H35LKDSXY_L4_1.fq.gz
M7_FKDL202564482-1a_H35LKDSXY_L4_2.fq.gz
M7_tags.list

Folder 8: Vertebrate 45 cycles PCR replicate two

M8_MD5.txt (to check sums)
M8_FKDL202564444-1a_H35WWDSXY_L1_1.fq.gz
M8_FKDL202564444-1a_H35WWDSXY_L1_2.fq.gz
M8_tags.list

Folder 9: Vertebrate 50 cycles PCR replicate one

M9_MD5.txt (to check sums)
M9_FKDL202564445-1a_H35WWDSXY_L1_1.fq.gz
M9_FKDL202564445-1a_H35WWDSXY_L1_2.fq.gz
M9_tags.list

Folder 10: Vertebrate 50 cycles PCR replicate two

M10_MD5.txt (to check sums)
M10_FKDL202564483-1a_H35WWDSXY_L1_1.fq.gz
M10_FKDL202564483-1a_H35WWDSXY_L1_2.fq.gz
M10_tags.list

Folder 11: Mols CO1 PCR replicate one

M11_MD5.txt (to check sums)
M11_FKDL202564451-1a_HJVKFDRXX_L1_1.fq.gz
M11_FKDL202564451-1a_HJVKFDRXX_L1_2.fq.gz
M11_tags.list

Folder 12: Mols CO1 PCR replicate two

M12_MD5.txt (to check sums)
M12_FKDL202564452-1a_HJVKYDRXX_L1_1.fq.gz
M12_FKDL202564452-1a_HJVKYDRXX_L1_2.fq.gz
M12_tags.list

Folder 13: Mols CO1 PCR replicate three

M13_MD5.txt (to check sums)
M13_FKDL202564453-1a_HJVKYDRXX_L1_1.fq.gz
M13_FKDL202564453-1a_HJVKYDRXX_L1_2.fq.gz
M13_tags.list

Folder 14: Mols CO1 PCR replicate four

M14_MD5.txt (to check sums)
M14_FKDL202564454-1a_HJVKYDRXX_L1_1.fq.gz
M14_FKDL202564454-1a_HJVKYDRXX_L1_2.fq.gz
M14_tags.list

With this folder structure, each folder contains two sequence data files (paired-end sequencing), a barcode/tag file and an MD5 file for verifying checksums.
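
The folder layout and checksum verification described above can be sketched in Python. This is a minimal sketch under stated assumptions, not the procedure used in the study: the folder names (folder_01 to folder_14) are hypothetical, and the MD5 files are assumed to follow the common md5sum layout of one "<checksum>  <file name>" pair per line. If the file names listed in an MD5 file differ from the deposited names, the matching will need adjusting.

```python
import hashlib
import re
import shutil
from pathlib import Path

# Deposited files start with "M<number>_", e.g. "M7_tags.list".
PREFIX = re.compile(r"^M(\d+)_")


def folder_for(file_name):
    """Return the (hypothetical) folder name for a file, or None if it has no prefix."""
    match = PREFIX.match(file_name)
    return f"folder_{int(match.group(1)):02d}" if match else None


def sort_into_folders(download_dir):
    """Move every 'M<number>_*' file into its own numbered subfolder."""
    download_dir = Path(download_dir)
    for path in sorted(download_dir.iterdir()):
        target = folder_for(path.name) if path.is_file() else None
        if target:
            (download_dir / target).mkdir(exist_ok=True)
            shutil.move(str(path), str(download_dir / target / path.name))


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def read_expected_sums(md5_file):
    """Parse lines of the ASSUMED form '<md5 hex>  <file name>'."""
    expected = {}
    for line in Path(md5_file).read_text().splitlines():
        parts = line.split()
        if len(parts) >= 2:
            # Keep only the base name, dropping a leading '*' or directory.
            expected[Path(parts[-1].lstrip("*")).name] = parts[0].lower()
    return expected


def verify_folder(folder):
    """Map each listed file name to True/False (checksum match) or None (file missing)."""
    folder = Path(folder)
    results = {}
    for md5_file in sorted(folder.glob("*_MD5.txt")):
        for name, expected in read_expected_sums(md5_file).items():
            target = folder / name
            results[name] = (md5_of_file(target) == expected) if target.exists() else None
    return results
```

Calling sort_into_folders on the download directory and then verify_folder on each resulting subfolder gives a quick overview of which files match their listed checksums and which are missing.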

If you would like to demultiplex this data, all the tag information needed is available in the "list" files. Each tag file lists a number of samples, whose names are explained below:

P2.X.Y_Z: Pooled samples (see associated article). P2 refers to the project number, X denotes the field replicate number, Y denotes the time point at which the sample was taken (Y=C marks control samples taken in the lab) and Z refers to the sequencing library number.

P2.POS(X): Field positive taken at the Mols site (see associated publication for explanation).

Pos_X: Positive PCR control using whale shark DNA. One positive control was run with every PCR setup.

CNE: Extraction blanks (one was included for each round of extraction).

NTC: PCR blanks. Four PCR blanks were run in each PCR setup.
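
The naming scheme above can be turned into a small classifier, which may be handy for separating controls from field samples after demultiplexing. This is only a sketch: the exact spelling of names in the tag files (separators, numbering, any suffixes on CNE and NTC) is an assumption based on the descriptions above, and the example names in the usage note are hypothetical, so check the patterns against the actual "list" files before relying on them.

```python
import re

# ASSUMED pattern for pooled samples:
# P2.<field replicate>.<time point or C>_<sequencing library>
POOLED = re.compile(
    r"^P2\.(?P<replicate>[^._]+)\.(?P<time>[^._]+)_(?P<library>[^_]+)$"
)


def classify_sample(name):
    """Return a dict describing a sample name taken from a tag file."""
    # Check the field positive first, since it also starts with "P2.".
    if name.startswith("P2.POS"):
        return {"type": "field positive"}
    if name.startswith("Pos"):
        return {"type": "PCR positive control"}
    if name.startswith("CNE"):
        return {"type": "extraction blank"}
    if name.startswith("NTC"):
        return {"type": "PCR blank"}
    match = POOLED.match(name)
    if match:
        info = match.groupdict()
        # Y = C marks control samples taken in the lab.
        info["type"] = "lab control" if info["time"] == "C" else "field sample"
        return info
    return {"type": "unknown"}
```

For instance, classify_sample("P2.1.3_5") would report field replicate 1, time point 3 and library 5, while any name that fits none of the assumed patterns is returned as "unknown" rather than guessed at.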

The "list" files include the sample name followed by the PCR replicate number. The two following columns represent the tags used for each sample (both forward and reverse primer were tagged). Tags are consistent across PCR replicates.
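
As a starting point for demultiplexing, the tag files can be read into a lookup table keyed on the forward and reverse tag pair. The sketch below assumes the columns are separated by whitespace and that the last two fields of each row are the forward and reverse tags; everything before them is kept as the sample label, so it should work whether the PCR replicate number sits in its own column or is part of the name. The delimiter and the absence of a header row are assumptions, and the function names are illustrative only.

```python
from pathlib import Path


def parse_tag_lines(lines):
    """Parse tag rows; the last two whitespace-separated fields are taken as the tags."""
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete rows
        *label, forward_tag, reverse_tag = fields
        rows.append(
            {
                "sample": " ".join(label),
                "forward_tag": forward_tag,
                "reverse_tag": reverse_tag,
            }
        )
    return rows


def parse_tag_file(path):
    """Read a tag file such as 'M1_tags.list' from disk."""
    return parse_tag_lines(Path(path).read_text().splitlines())


def tag_lookup(rows):
    """Build {(forward_tag, reverse_tag): sample}, rejecting duplicate tag pairs."""
    lookup = {}
    for row in rows:
        key = (row["forward_tag"], row["reverse_tag"])
        if key in lookup:
            raise ValueError(f"tag pair {key} is used by more than one sample")
        lookup[key] = row["sample"]
    return lookup
```

The duplicate check is a cheap safeguard: within one library, each tag combination should identify exactly one sample, so a repeated pair points to a parsing problem or a mix-up of files.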

After demultiplexing, you should be able to do as you please with the data.

If you want to follow the exact filtering and data analysis done in our study, please refer to the manuscript for details on the steps after demultiplexing. If you have any questions, feel free to send an email to Martin Johannesen Klepke.
