A large scale temporal and spatial environmental DNA biodiversity survey of marine vertebrates in Brazil following the Fundão tailings dam failure

Cite this dataset

Lines, Rose et al. (2023). A large scale temporal and spatial environmental DNA biodiversity survey of marine vertebrates in Brazil following the Fundão tailings dam failure [Dataset]. Dryad. https://doi.org/10.5061/dryad.sqv9s4n3t

Abstract

Seawater contains a wealth of genetic information, representing the biodiversity of numerous species residing within a particular marine habitat. Environmental DNA (eDNA) metabarcoding offers a cost-effective, non-destructive method for large-scale monitoring of environments, as diverse taxonomic groups are detected using metabarcoding assays. A large-scale eDNA monitoring program of marine vertebrates was conducted across three sampling seasons (Spring 2018, Autumn 2019 and Spring 2019) in coastal waters of Brazil. The program was designed to investigate eDNA as a testing method for long-term monitoring of marine vertebrates following the Fundão tailings dam failure in November 2015. While no baseline samples were available prior to the dam failure, there is still value in profiling the taxa that use the impacted area and the trajectory of recovery. A total of 40 sites were sampled around the mouths of eight river systems, covering approximately 500 km of coastline. Metabarcoding assays targeting the mitochondrial genes 16S rRNA and COI were used to detect fish, marine mammals and elasmobranchs. We detected temporal differences between seasons and spatial differences between the rivers/estuaries sampled. Overall, this survey, the largest eDNA survey in Brazil to date, revealed 69 families from Class Actinopterygii (fish), 15 species from Class Chondrichthyes (sharks and rays), 4 species of marine and estuarine mammals and 23 species of conservation significance, including 2 species of endangered dolphin. Our large-scale study reinforces the value eDNA metabarcoding can bring when monitoring the biodiversity of coastal environments and demonstrates the importance of collecting time-stamped environmental samples to better understand the impacts of anthropogenic activities.

README: A large scale temporal and spatial environmental DNA biodiversity survey of marine vertebrates in Brazil following the Fundão tailings dam failure

https://doi.org/10.5061/dryad.sqv9s4n3t

Description of the data and file structure

The spreadsheet Sample ID information.xlsx contains the information for each sample as labelled in the fasta files.
The spreadsheet physical WQ data 2018-2019 20230905.xls contains data on environmental factors. An explanation of the data variables is given in the "data description" tab.

Bioinformatic analysis was performed in accordance with Mousavi-Derazmahalleh et al. (2021), "eDNAFlow, an automated, reproducible and scalable workflow for analysis of environmental DNA (eDNA) sequences exploiting Nextflow and Singularity", Molecular Ecology Resources, doi: 10.1111/1755-0998.13356. eDNAFlow is available from GitHub (https://github.com/mahsa-mousavi/eDNAFlow).

The two slurm files contain the scripts used to analyse the dataset. They were run at the Pawsey Supercomputing Centre in Kensington, Western Australia, which uses the Slurm workload manager:
run_eDNAFlow_16S.slurm
run_eDNAFlow_COI.slurm

The data were produced from eight sequencing libraries, which were quality filtered, demultiplexed and size selected before being concatenated into one 16S fasta file and one COI fasta file. Both may be viewed using a text editor:
16snest_merged_FTP260_elib08_10_28_29.fasta
elasmo_merged_FTP260_elib09_10_26_27.fasta
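
For users who prefer to inspect the fasta files programmatically rather than in a text editor, the following is a minimal sketch using only the Python standard library. The function names read_fasta and summarise are illustrative and are not part of eDNAFlow; the example records are made up.

```python
from typing import Dict, Iterable, Iterator, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)  # sequences may be wrapped over several lines
    if header is not None:
        yield header, "".join(chunks)


def summarise(records: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Return the number of sequences, total bases and mean sequence length."""
    lengths = [len(seq) for _, seq in records]
    n = len(lengths)
    return {
        "n_sequences": n,
        "total_bases": sum(lengths),
        "mean_length": sum(lengths) / n if n else 0.0,
    }


# Made-up records, for illustration only.
example = [">seq_1", "ACGTAC", "GT", ">seq_2", "GGCC"]
print(summarise(read_fasta(example)))
```

To apply this to one of the files above, pass an open file handle instead of the example list, for instance `with open("16snest_merged_FTP260_elib08_10_28_29.fasta") as handle: print(summarise(read_fasta(handle)))`.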

Raw sequence files from the Illumina MiSeq for the eight sequencing libraries have also been included. These can be analysed using the two slurm files referenced above, or with bioinformatics scripts of the user's choice.
MSRun330-FTP260_S1_L001_R1_001.fastq.gz
EFMS-Run-11-Elib08_S1_L001_R1_001.fastq.gz
EFMSRun12-Elib09_S1_L001_R1_001.fastq.gz
EFMSRun13-Elib10_S1_L001_R1_001.fastq.gz
EFMSRun32-Elib26_S1_L001_R1_001.fastq.gz
EFMSRun33-Elib27_S1_L001_R1_001.fastq.gz
EFMSRun34_Elib28_S1_L001_R1_001.fastq.gz
EFMSRun35-Elib29_S1_L001_R1_001.fastq.gz
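
As a quick check before running a full analysis, the raw files can be read directly from their compressed form. The sketch below uses only the Python standard library and assumes the common four-lines-per-record FASTQ layout; the function names read_fastq_gz and count_reads are illustrative.

```python
import gzip
from typing import Iterator, Tuple


def read_fastq_gz(path: str) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) from a gzip-compressed FASTQ file.

    Assumes four lines per record (sequences are not wrapped).
    """
    with gzip.open(path, "rt") as handle:
        while True:
            header = handle.readline().rstrip("\n")
            if not header:
                break  # end of file
            seq = handle.readline().rstrip("\n")
            handle.readline()  # '+' separator line, not needed here
            qual = handle.readline().rstrip("\n")
            if not header.startswith("@") or len(seq) != len(qual):
                raise ValueError(f"malformed FASTQ record near {header!r}")
            yield header[1:], seq, qual


def count_reads(path: str) -> int:
    """Count the records in a gzip-compressed FASTQ file."""
    return sum(1 for _ in read_fastq_gz(path))
```

For example, `count_reads("MSRun330-FTP260_S1_L001_R1_001.fastq.gz")` would return the number of reads in the first library, provided the file is in the working directory.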

Two files containing the MID-tag (barcode) combinations for each primer set (16S and COI), pertaining to each DNA extract, have been included.
16S_barcodes_all_libraries.xlsx
COI_elasmo_barcodes_all_libraries.xlsx
Each spreadsheet has a separate tab for each sequencing library, as well as a collated list.
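
To illustrate how a MID-tag combination identifies a DNA extract, the following simplified sketch assigns a read to a sample when it starts with the forward tag and ends with the reverse complement of the reverse tag. This is an assumption about read orientation made for illustration only; it is not the demultiplexing routine used by eDNAFlow, and the tags and sample names below are hypothetical rather than taken from the barcode spreadsheets.

```python
from typing import Dict, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def assign_sample(read: str, tag_pairs: Dict[Tuple[str, str], str]) -> Optional[str]:
    """Return the sample whose forward/reverse tag pair brackets the read, else None."""
    for (fwd_tag, rev_tag), sample in tag_pairs.items():
        if read.startswith(fwd_tag) and read.endswith(reverse_complement(rev_tag)):
            return sample
    return None


# Hypothetical tags and sample names, for illustration only.
tag_pairs = {
    ("ACGTAC", "TTGACA"): "sample_A",
    ("GATCGA", "TTGACA"): "sample_B",
}

read = "ACGTAC" + "GGGCCC" + reverse_complement("TTGACA")
print(assign_sample(read, tag_pairs))
```

Exact matching is used here for brevity; real demultiplexing tools usually also handle sequencing errors in the tags.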

Zero-radius operational taxonomic unit (ZOTU) sequences and their relative abundance table were created by applying the unoise3 algorithm of USEARCH with a --minsize of two, followed by post-clustering curation with LULU (Frøslev et al., 2017). The ZOTU sequences were aligned to the GenBank nucleotide database using BLASTN (Altschul et al., 1990). ZOTUs were then assigned to their lowest common ancestor using the LCA script (Mousavi-Derazmahalleh et al., 2021), with qCov 100, percentage identity 97 and Diff 1. The ZOTU files for each assay have been included and may be viewed using a text editor.
16S_zotus.fasta
COI_zotus.fasta
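
The idea behind lowest common ancestor assignment can be sketched as follows. This is a simplified illustration and not the eDNAFlow LCA script itself: it assumes that hits are first filtered by query coverage and percentage identity, that hits within `diff` percentage points of the best hit are treated as equally good, and that the assignment is the deepest taxonomic rank those hits share. The Hit type, the function name and the example lineages are all hypothetical.

```python
from typing import Iterable, List, NamedTuple, Sequence


class Hit(NamedTuple):
    lineage: Sequence[str]  # ranks ordered from broad to narrow
    pident: float           # percentage identity of the BLAST hit
    qcov: float             # query coverage of the BLAST hit


def lowest_common_ancestor(
    hits: Iterable[Hit],
    min_qcov: float = 100.0,
    min_pident: float = 97.0,
    diff: float = 1.0,
) -> List[str]:
    """Return the shared lineage prefix of the best hits (empty if none pass)."""
    kept = [h for h in hits if h.qcov >= min_qcov and h.pident >= min_pident]
    if not kept:
        return []
    best = max(h.pident for h in kept)
    # Assumption: hits less than `diff` points below the best hit count as ties.
    close = [h for h in kept if best - h.pident < diff]
    shared = []
    for ranks in zip(*(h.lineage for h in close)):
        if len(set(ranks)) != 1:
            break  # lineages diverge at this rank
        shared.append(ranks[0])
    return shared


# Made-up hits, for illustration only.
hits = [
    Hit(("Class_x", "Family_a", "Genus_a", "Genus_a species_1"), 99.5, 100.0),
    Hit(("Class_x", "Family_a", "Genus_a", "Genus_a species_2"), 99.0, 100.0),
    Hit(("Class_x", "Family_b", "Genus_b", "Genus_b species_1"), 97.2, 100.0),
]
print(lowest_common_ancestor(hits))
```

In this example the first two hits are within one percentage point of each other and differ only at species level, so the assignment stops at genus; the third hit is too far below the best hit to influence the result.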

Methods

Seawater samples were collected at 40 sites along the coastline of Brazil. DNA was extracted using the DNeasy Blood & Tissue Kit (Qiagen). The two PCR assays used in this study were the 16S rRNA primer set Fish_Sygnathid_Short (Nester et al., 2020) and the COI Elasmobranch multiplex (West et al., 2020). Multiplex identifier-tagged amplicons were blended into libraries and sequenced on an Illumina MiSeq instrument.

Environmental factors including turbidity (Tu; NTU), temperature (Tm; °C), dissolved oxygen (DO; mg/L), chlorophyll-a fluorescence (Ch; μg/L), salinity (S; PSU), conductivity (Co; mS/cm) and photosynthetically active radiation (PAR; μmol/m²/s) were measured near the surface using an oceanographic-grade XR Data Logger (RBR XRX-620).

Funding

BHP Brazil