A large scale temporal and spatial environmental DNA biodiversity survey of marine vertebrates in Brazil following the Fundão tailings dam failure

Lines, Rose 1; Juggernauth, Manjeeti 2; Peverley, Georgia 1; Keating, James 2; Simpson, Tiffany 1; Mousavi-Derazmahalleh, Mahsa 1; Bunce, Michael 1; Berry, Tina 1; Taysom, Alice 3; Bernardino, Angelo F. 4; Whittle, Phillip 2

Research facility: Curtin University

Published May 03, 2021; Updated Oct 27, 2023 on Dryad. https://doi.org/10.5061/dryad.sqv9s4n3t

Abstract

Seawater contains a wealth of genetic information, representing the biodiversity of numerous species residing within a particular marine habitat. Environmental DNA (eDNA) metabarcoding offers a cost-effective, non-destructive method for large-scale monitoring of environments, as diverse taxonomic groups are detected using metabarcoding assays. A large-scale eDNA monitoring program of marine vertebrates was conducted across three sampling seasons (Spring 2018, Autumn 2019 and Spring 2019) in coastal waters of Brazil. The program was designed to investigate eDNA as a testing method for long-term monitoring of marine vertebrates following the Fundão tailings dam failure in November 2015. While no baseline samples were available from before the dam failure, there is still value in profiling the taxa that use the impacted area and the trajectory of recovery. A total of 40 sites were sampled around the mouths of eight river systems, covering approximately 500 km of coastline. Metabarcoding assays targeting the mitochondrial genes 16S rRNA and COI were used to detect fish, marine mammals and elasmobranchs. We detected temporal differences between seasons and spatial differences between the rivers/estuaries sampled. Overall, this survey, the largest eDNA survey in Brazil to date, revealed 69 families from Class Actinopterygii (fish), 15 species from Class Chondrichthyes (sharks and rays), 4 species of marine and estuarine mammals and 23 species of conservation significance, including 2 species of endangered dolphin. Our large-scale study reinforces the value eDNA metabarcoding can bring when monitoring the biodiversity of coastal environments and demonstrates the importance of collecting time-stamped environmental samples to better understand the impacts of anthropogenic activities.

Description of the data and file structure

The spreadsheet Sample ID information.xlxs contains the information for each sample, as labelled in the fasta files.
The spreadsheet physical WQ data 2018-2019 20230905.xls contains data on environmental factors. An explanation of the data variables is given in the tab "data description".

Bioinformatic analysis was performed in accordance with Mousavi-Derazmahalleh et al. (2021), eDNAFlow, an automated, reproducible and scalable workflow for analysis of environmental DNA (eDNA) sequences exploiting Nextflow and Singularity. Molecular Ecology Resources. doi: 10.1111/1755-0998.13356. eDNAFlow is available from GitHub (https://github.com/mahsa-mousavi/eDNAFlow).

The two slurm files contain the scripts used to analyse the dataset. They were run at the Pawsey Supercomputing Centre in Kensington, Western Australia, which uses the Slurm workload manager.
run_eDNAFlow_16S.slurm
run_eDNAFlow_COI.slurm

The data were produced from eight sequencing libraries that were quality filtered, demultiplexed and size selected before being concatenated into one 16S fasta file and one COI fasta file, which may be viewed using a text editor:
16snest_merged_FTP260_elib08_10_28_29.fasta
elasmo_merged_FTP260_elib09_10_26_27.fasta
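
Besides opening the fasta files in a text editor, a short script can give a quick overview of their contents. The following is a minimal sketch, not part of the published workflow: it counts the records in a FASTA file and reports the shortest and longest sequence. The function names and the in-memory example data are illustrative assumptions.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect them piece by piece.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def summarise_fasta(handle: TextIO) -> dict:
    """Count records and report the shortest and longest sequence lengths."""
    lengths = [len(seq) for _, seq in read_fasta(handle)]
    return {
        "records": len(lengths),
        "min_length": min(lengths) if lengths else 0,
        "max_length": max(lengths) if lengths else 0,
    }


# Tiny in-memory example; the headers and sequences are made up.
example = io.StringIO(">read_1\nACGTAC\nGT\n>read_2\nACG\n")
print(summarise_fasta(example))
# {'records': 2, 'min_length': 3, 'max_length': 8}
```

To run it on one of the files above, pass an open file handle instead of the in-memory example, for instance `summarise_fasta(open("16snest_merged_FTP260_elib08_10_28_29.fasta"))`.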

Raw sequence files from the Illumina MiSeq for the eight sequencing libraries have also been included. These can be analysed using the two slurm files referenced above, or with bioinformatics scripts of the user's choice.
MSRun330-FTP260_S1_L001_R1_001.fastq.gz
EFMS-Run-11-Elib08_S1_L001_R1_001.fastq.gz
EFMSRun12-Elib09_S1_L001_R1_001.fastq.gz
EFMSRun13-Elib10_S1_L001_R1_001.fastq.gz
EFMSRun32-Elib26_S1_L001_R1_001.fastq.gz
EFMSRun33-Elib27_S1_L001_R1_001.fastq.gz
EFMSRun34_Elib28_S1_L001_R1_001.fastq.gz
EFMSRun35-Elib29_S1_L001_R1_001.fastq.gz
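
As a quick sanity check before running a full analysis, the number of reads in each raw file can be counted directly from the gzipped FASTQ. This is a minimal sketch that assumes the standard four-lines-per-record FASTQ layout; the function name and the demo file are illustrative and not part of the dataset.

```python
import gzip
import os
import tempfile


def count_fastq_reads(path: str) -> int:
    """Count reads in a gzipped FASTQ file, assuming four lines per record."""
    with gzip.open(path, "rt") as handle:
        line_count = sum(1 for _ in handle)
    if line_count % 4 != 0:
        raise ValueError(f"{path}: line count {line_count} is not a multiple of 4")
    return line_count // 4


# Self-contained demo on a made-up two-read file written to a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.fastq.gz")
    with gzip.open(demo_path, "wt") as out:
        out.write("@r1\nACGT\n+\nIIII\n@r2\nTTGA\n+\nIIII\n")
    print(count_fastq_reads(demo_path))  # prints 2
```

The same function can be pointed at any of the files listed above, for example `count_fastq_reads("MSRun330-FTP260_S1_L001_R1_001.fastq.gz")`.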

Two files containing the MID-tag (barcode) combinations for each primer set (16S and COI), pertaining to each DNA extract, have been included.
16S_barcodes_all_libraries.xlsx
COI_elasmo_barcodes_all_libraries.xlsx
There are separate tabs in the spreadsheet for each sequencing library as well as a collated list.
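
To illustrate how a MID-tag combination identifies a sample, the sketch below assigns a read to a sample when it starts with the forward tag and ends with the reverse complement of the reverse tag. This is a simplified illustration only: the tag sequences, sample names and the assumed read orientation are hypothetical, the column layout of the spreadsheets is not reproduced here, and the actual demultiplexing for this dataset was performed by the workflow referenced above.

```python
from typing import Dict, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def assign_sample(read: str, tag_map: Dict[Tuple[str, str], str]) -> Optional[str]:
    """Return the sample whose (forward, reverse) tag pair flanks the read.

    Assumes exact matches, the forward tag at the 5' end and the reverse
    tag, reverse-complemented, at the 3' end. Returns None if no pair fits.
    """
    for (fwd_tag, rev_tag), sample in tag_map.items():
        if read.startswith(fwd_tag) and read.endswith(reverse_complement(rev_tag)):
            return sample
    return None


# Hypothetical tag pairs and sample names, for illustration only.
tag_map = {
    ("ACGTAC", "TGCATG"): "sample_A",
    ("GATTCA", "CCGTAA"): "sample_B",
}
read = "ACGTAC" + "TTTTGGGGCCCC" + reverse_complement("TGCATG")
print(assign_sample(read, tag_map))  # prints sample_A
```

Real demultiplexing tools typically also tolerate sequencing errors in the tags and trim the tags and primers from the read, which this sketch does not attempt.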

Zero-radius operational taxonomic unit (ZOTU) sequences and their relative abundance table were created by applying the unoise3 algorithm of USEARCH with a --minsize of two, followed by post-clustering curation with LULU (Frøslev et al., 2017). The ZOTU sequences were aligned to the GenBank nucleotide database using BLASTN (Altschul et al., 1990). ZOTUs were then assigned to their lowest common ancestor using the LCA script (Mousavi-Derazmahalleh et al., 2021), with qCov 100, percentage identity 97 and Diff 1. The ZOTU files for each assay have been included and may be viewed using a text editor.
16S_zotus.fasta
COI_zotus.fasta
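
The thresholds described above (qCov 100, percentage identity 97, Diff 1) can be illustrated with a simplified lowest-common-ancestor sketch. This is not the eDNAFlow LCA script: the `Hit` structure, the function names and the example hits are assumptions, and the treatment of Diff (keep only hits whose identity is within Diff of the best hit, then take the taxonomy they share) is one plausible reading of that parameter rather than a confirmed description of the script.

```python
from typing import List, NamedTuple, Sequence


class Hit(NamedTuple):
    qcov: float               # query coverage, in percent
    pident: float             # percentage identity
    lineage: Sequence[str]    # ordered from highest rank to lowest


def lowest_common_ancestor(lineages: List[Sequence[str]]) -> List[str]:
    """Return the shared prefix of several root-to-tip lineages."""
    shared = []
    for ranks in zip(*lineages):
        if len(set(ranks)) != 1:
            break
        shared.append(ranks[0])
    return shared


def assign_taxonomy(hits: List[Hit], min_qcov: float = 100.0,
                    min_pident: float = 97.0, diff: float = 1.0) -> List[str]:
    """Assign a ZOTU to the LCA of its best hits (simplified, assumed logic)."""
    passing = [h for h in hits if h.qcov >= min_qcov and h.pident >= min_pident]
    if not passing:
        return []  # no hit meets the thresholds, so no assignment is made
    best = max(h.pident for h in passing)
    # Hits within `diff` of the best identity are treated as indistinguishable.
    close = [h for h in passing if best - h.pident < diff]
    return lowest_common_ancestor([h.lineage for h in close])


# Made-up BLAST hits for a single ZOTU, for illustration only.
hits = [
    Hit(100, 99.5, ("Actinopterygii", "Mugilidae", "Mugil", "Mugil liza")),
    Hit(100, 99.0, ("Actinopterygii", "Mugilidae", "Mugil", "Mugil curema")),
    Hit(100, 97.2, ("Actinopterygii", "Carangidae", "Caranx", "Caranx hippos")),
    Hit(95, 100.0, ("Actinopterygii", "Carangidae", "Caranx", "Caranx hippos")),
]
print(assign_taxonomy(hits))
# ['Actinopterygii', 'Mugilidae', 'Mugil']
```

In this example the two Mugil hits differ by less than one percentage point, so the assignment stops at genus level; the third hit is too far below the best identity and the fourth fails the query coverage threshold.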
