Tidewater goby and estuarine fish records from seining, qPCR and metabarcoding data for Southern California estuaries in 2023

Cite this dataset

Lafferty, Kevin (2024). Tidewater goby and estuarine fish records from seining, qPCR and metabarcoding data for Southern California estuaries in 2023 [Dataset]. Dryad. https://doi.org/10.25349/D9P60T

Abstract

Many studies have shown that environmental DNA (eDNA) sampling can be more sensitive than traditional sampling. For instance, past studies found a specific qPCR probe of a water sample is better than a seine for detecting the endangered tidewater goby, Eucyclogobius newberryi. Furthermore, a metabarcoding sample often detects more fish species than a seine detects. Less consideration has been given to sampling costs. To help managers choose the best sampling method for their budget, I estimated detectability and costs per sample to compare the cost-effectiveness of seining, qPCR and metabarcoding for detecting endangered tidewater gobies as well as the associated estuarine fish community in California. Five samples were enough for eDNA methods to confidently detect tidewater gobies, whereas seining required twice as many samples. Fixed program costs can be high for qPCR and seining, whereas metabarcoding had high per-sample costs, which led to changes in relative cost-effectiveness with the number of locations sampled. Under some circumstances (multiple locations visited or an already validated assay), qPCR was a bit more cost-effective than metabarcoding for detecting tidewater gobies. Under all assumptions, seining was the least cost-effective method for detecting tidewater gobies or other fishes. Metabarcoding was the most cost-effective sampling method for multiple species detection. Despite its advantages, metabarcoding still suffers from gaps in sequence databases, can yield vague results for some species, and can lead novices to serious errors. Seining is still the only way to rapidly assess densities, size distributions, and fine-scale spatial distributions. The manuscript relies on 8 separate data sets and an R file to analyze them.  Each data file has an accompanying metadata file and information file. 
One of these data sets (Schmelzle&Kinziger_occupancy.csv) is a subset of previously published data. It is provided in the data archive so that analyses can be reproduced, but it should be cited as Schmelzle, Molly C.; Kinziger, Andrew P. (2015). Data from: Using occupancy modeling to compare environmental DNA to traditional field methods for regional-scale monitoring of an endangered aquatic species [Dataset]. Dryad. https://doi.org/10.5061/dryad.6rs23

README: Tidewater goby and estuarine fish records from seining, qPCR and metabarcoding data for Southern California Estuaries in 2023

https://doi.org/10.25349/D9P60T

This data archive includes R code and data for reproducing the analyses and figures in Lafferty, Metabarcoding is (usually) more cost effective than seining or qPCR for detecting tidewater gobies and other estuarine fishes.
 
To view the supplementary tables, open the Fig&TableSuppl.docx file. This file also includes the manuscript figures and tables and some explanatory text about how to generate them.

To reproduce the figures, open Fig&TableCode.Rmd in RStudio and make sure the CSV files it needs, all included in the Dryad repository, are in the working directory. The data files include more information than the analyses use and can serve other purposes. The code is not software, nor is it intended as an R package, but it is annotated so that others can understand and modify it.
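Before knitting, it can help to confirm that the data files are where the R markdown file expects them. The following is a minimal sketch in Python (not part of the archive, which uses R). It assumes the eight data CSVs named in this README are the ones needed; whether Fig&TableCode.Rmd reads every one of them is an assumption, and the function name `missing_files` is hypothetical.

```python
from pathlib import Path

# Data CSVs named in this README's file list (metadata and information files omitted).
# Assumption: these are the files the R markdown file needs in the working directory.
REQUIRED_CSVS = [
    "DataToResampleMetabarcodesR.csv",
    "DataToResampleSeinesR.csv",
    "Estuaries2022-MiFishU-metadata.csv",
    "Final_Matches-esv.data.csv",
    "InvestmentTable.csv",
    "JVB1846-qpcr-tabulated-data.csv",
    "New_Estuary_Data.csv",
    "Schmelzle&Kinziger_occupancy.csv",
]


def missing_files(directory="."):
    """Return the required CSV names that are not present in `directory`."""
    folder = Path(directory)
    return [name for name in REQUIRED_CSVS if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing from working directory:", ", ".join(missing))
    else:
        print("All data files found.")
```

Run from the directory that holds Fig&TableCode.Rmd, the script lists any file that still needs to be downloaded.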

For each CSV file there is an associated metadata file that defines entries and columns and an information file that contains an abstract and ownership information.

One of the data files required to reproduce the analyses (Schmelzle&Kinziger_occupancy.csv) was created from previously published data and was not produced by the author.  Please cite it as Schmelzle, Molly C.; Kinziger, Andrew P. (2015). Data from: Using occupancy modeling to compare environmental DNA to traditional field methods for regional-scale monitoring of an endangered aquatic species [Dataset]. Dryad. https://doi.org/10.5061/dryad.6rs23

 

Files included in the archive

| Name | Type | Description |
| --- | --- | --- |
| Fig&TableCode.Rmd | R markdown | Code used to produce analyses and figures from imported data |
| Fig&TableSuppl.docx | Document | Supplementary tables |
| DataToResampleMetabarcodesR_Entity_and_Attribute_Metadata.csv | Metadata | definitions and units |
| DataToResampleMetabarcodesR_Identification_Metadata.csv | Information | location, date and ownership |
| DataToResampleMetabarcodesR.csv | original data | A matrix of species presence absence data by sample. |
| DataToResampleSeinesR_Entity_and_Attribute_Metadata.csv | Metadata | definitions and units |
| DataToResampleSeinesR_Identification_Metadata.csv | Information | location, date and ownership |
| DataToResampleSeinesR.csv | original data | A matrix of species presence absence data by sample. |
| Estuaries2022_MiFishU_metadata_Entity_and_Attribute_Metadata.csv | Metadata | definitions and units |
| Estuaries2022_MiFishU_metadata_Identification_Metadata.csv | Information | location, date and ownership |
| Estuaries2022-MiFishU-metadata.csv | original data | attributes of the sampling sites |
| Final_Matches_esv_data_Entity_and_Attribute_Metadata.csv | Metadata | definitions and units |
| Final_Matches_esv_data_Identification_Metadata.csv | Information | location, date and ownership |
| Final_Matches-esv.data.csv | original data | Consensus taxonomy for metabarcoding data |
| InvestmentTable_Entity_and_Attribute_Metadata.csv | Metadata | definitions and units |
| InvestmentTable_Identification_Metadata.csv | Information | location, date and ownership |
| InvestmentTable.csv | original data | Estimates for efforts needed per method |
| JVB1846_qpcr_tabulated_data_Entity_and_Attribute_Metadata.csv | Metadata | definitions and units |
| JVB1846_qpcr_tabulated_data_Identification_Metadata.csv | Information | location, date and ownership |
| JVB1846-qpcr-tabulated-data.csv | original data | tidewater goby qPCR reads per sample |
| New_Estuary_Data_Entity_and_Attribute_Metadata.csv | Metadata | definitions and units |
| New_Estuary_Data_Identification_Metadata.csv | Information | location, date and ownership |
| New_Estuary_Data.csv | original data | metabarcoding data with consensus taxonomy per sample |
| Schmelzle&Kinziger_occupancy.csv | Schmelzle et al. 2015 (Dryad, 6rs23) | published tidewater goby occupancy data by qPCR and seining |
| SchmelzleandsKinziger_occupancy_Entity_and_Attribute_Metadata.csv | Metadata | definitions and units |
| SchmelzleandsKinziger_occupancy_Identification_Metadata.csv | Information | location, date and ownership |

DataToResampleMetabarcodesR.csv

Abstract

These presence/absence data on fish species detections from metabarcoding at Calleguas Creek are organized as a matrix with samples/sites as rows and species as columns.  An entry of 1 means the species was detected at that site/sample.  A 0 means it was not detected.  The matrix was developed by summarizing information in the New_Estuary_Data.csv file.
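The summarizing step can be pictured with a small sketch. It is written in Python rather than the archive's R, and the input layout (one `(sample, species)` pair per detection), the sample labels, and the function name `presence_absence` are assumptions for illustration, not the actual columns of New_Estuary_Data.csv.

```python
def presence_absence(records):
    """Build a sample-by-species 0/1 matrix from (sample, species) detections.

    Returns the sorted sample labels (rows), the sorted species names (columns),
    and the matrix as a list of rows.
    """
    samples = sorted({sample for sample, _ in records})
    species = sorted({name for _, name in records})
    detected = set(records)
    matrix = [
        [1 if (sample, name) in detected else 0 for name in species]
        for sample in samples
    ]
    return samples, species, matrix


if __name__ == "__main__":
    # Hypothetical detections; the sample labels and "Species B" are made up.
    records = [
        ("CC-01", "Eucyclogobius newberryi"),
        ("CC-01", "Species B"),
        ("CC-02", "Species B"),
    ]
    samples, species, matrix = presence_absence(records)
    for label, row in zip(samples, matrix):
        print(label, row)
```

A sample with no detections at all never appears in the input pairs, so it would have to be added as an all-zero row separately.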

 

Purpose

The data are resampled to plot the cumulative richness detected from a particular sampling effort.  Doing this several times makes it possible to plot confidence limits around the estimate.
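As a rough illustration of this resampling idea, the sketch below (Python, not the archive's R code) shuffles the order of samples many times, tracks cumulative richness along each ordering, and reports the mean with 2.5 and 97.5 percentile limits. Drawing samples without replacement and using percentile limits are assumptions here; the R markdown file may resample differently.

```python
import random


def accumulation_curve(matrix, n_resamples=1000, seed=1):
    """Return (mean, lower, upper) cumulative richness for 1..n samples.

    `matrix` is a list of rows (samples) of 0/1 entries (species).
    Each resample visits all samples in a random order without replacement.
    """
    rng = random.Random(seed)
    n_samples = len(matrix)
    n_species = len(matrix[0])
    richness = [[] for _ in range(n_samples)]  # richness[k]: values after k + 1 samples
    for _ in range(n_resamples):
        order = rng.sample(range(n_samples), n_samples)
        seen = [0] * n_species
        for k, row_index in enumerate(order):
            # A species counts once it has been detected in any sample so far.
            seen = [a | b for a, b in zip(seen, matrix[row_index])]
            richness[k].append(sum(seen))
    curve = []
    for values in richness:
        values.sort()
        lower = values[int(0.025 * (len(values) - 1))]
        upper = values[int(0.975 * (len(values) - 1))]
        curve.append((sum(values) / len(values), lower, upper))
    return curve
```

Plotting the first element of each tuple against the number of samples gives the accumulation curve, and the other two elements give the band around it.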

DataToResampleSeinesR.csv

Abstract

These presence/absence data on fish species detections from seining at Calleguas Creek are organized as a matrix with samples/sites as rows and species as columns.  An entry of 1 means the species was detected at that site/sample.  A 0 means it was not detected.  The matrix was developed by summarizing information in the Estuaries2022-MiFishU-metadata.csv file.

 

Purpose

The data are resampled to plot the cumulative richness detected from a particular sampling effort.  Doing this several times makes it possible to plot confidence limits around the estimate.

 

 

Estuaries2022-MiFishU-metadata.csv

Abstract

Field notes including site attributes like the area sampled, temperature, species observations, and locations. The data also include the label given to eDNA filters, site names, and the qPCR reads per replicate for tidewater gobies and the fish detected by seining.

 

Purpose

The data are used to derive files suited to specific analyses, such as DataToResampleSeinesR.csv.

 

 

Final_Matches-esv.data.csv

Abstract

Metabarcode data offer several hypothesized taxa for each ASV in a sample. Converting these into a single taxon requires a consensus taxonomy. Here, one taxon_name is given per ASV, and a second column indicates whether there are alternative likely names.

 

Purpose

The consensus taxonomy can be used to evaluate the biodiversity captured in a metabarcode sample. It is used here to create the DataToResampleMetabarcodesR.csv file which, in turn, is used to generate species accumulation curves for metabarcoding.

 

InvestmentTable.csv

Abstract

Sampling requires various efforts or investments. This table defines those actions based on author experience and then estimates how they vary among different sampling approaches. Using the common currency of technician hours makes it possible to add up various efforts into a single number.

Purpose

By obtaining a single value for effort, it is possible to plot return on investment for different methods.
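A simple way to see how fixed and per-sample effort trade off is a linear cost model. The sketch below is in Python rather than the archive's R, and every number in it is invented for illustration; none comes from InvestmentTable.csv. The function names are hypothetical.

```python
import math


def total_effort(fixed_hours, hours_per_sample, n_samples):
    """Total technician hours: a fixed program cost plus a cost per sample."""
    return fixed_hours + hours_per_sample * n_samples


def break_even_samples(fixed_a, per_a, fixed_b, per_b):
    """Smallest whole number of samples at which method A costs no more than B.

    Returns 0 if A is already no more costly with zero samples, and None if
    A starts out more costly and never catches up.
    """
    if fixed_a <= fixed_b:
        return 0
    if per_a >= per_b:
        return None
    return math.ceil((fixed_a - fixed_b) / (per_b - per_a))


if __name__ == "__main__":
    # Hypothetical values: method A has high fixed cost, method B high per-sample cost.
    n = break_even_samples(fixed_a=40, per_a=2, fixed_b=10, per_b=5)
    print("Break-even sample count:", n)
    print("Hours at break-even:", total_effort(40, 2, n), total_effort(10, 5, n))
```

Below the break-even count the method with the lower fixed cost requires less effort; above it, the method with the lower per-sample cost does.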

 

JVB1846-qpcr-tabulated-data.csv

Abstract

qPCR detection of tidewater gobies with a species-specific primer returns a number of amplicon reads in each of three qPCR replicates. 

Purpose

Doing qPCR detection in multiple samples and multiple replicates makes it possible to calculate detection probabilities and to then plot how detection increases with replication.
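The way detection grows with replication can be sketched under a simple assumption: each sample detects the species independently with the same probability p when the species is present, so the chance of at least one detection in n samples is 1 - (1 - p)^n. The Python sketch below (not the archive's R code) applies that formula. The independence assumption, the function names, and the example value of p are illustrative, and the R markdown file may estimate detection differently.

```python
import math


def cumulative_detection(p, n):
    """Probability of at least one detection in n independent samples."""
    return 1.0 - (1.0 - p) ** n


def samples_needed(p, confidence=0.95):
    """Smallest n for which cumulative detection reaches `confidence`."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    p = 0.5  # hypothetical per-sample detection probability
    for n in range(1, samples_needed(p) + 1):
        print(n, round(cumulative_detection(p, n), 3))
```

A lower per-sample detection probability raises the number of samples needed, which is how two methods can be compared at a common confidence level.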

 

New_Estuary_Data.csv

Abstract

This file summarizes the consensus taxonomy for each metabarcoding sample per site per estuary, along with the summed reads. It is derived by combining Final_Matches-esv.data.csv and Estuaries2022-MiFishU-metadata.csv. Note that these data include several estuaries in addition to Calleguas Creek.

 

Purpose

These data are used to estimate how well tidewater gobies are detected by metabarcoding (for comparison with qPCR). They are also used to estimate the plausible list of taxa detected by metabarcoding.

 

Fig&TableCode.Rmd

Abstract

An R markdown file with scripts that achieve various aims of the study. Note that this file does not use the raw data directly; rather, it loads several CSV files included in the supplementary data appendix. These must be in the working directory for the Markdown file to work. The Markdown file can be used to produce the report Fig&TableSuppl.docx, which includes the figures and additional analyses, or to evaluate assumptions in the analyses or develop additional analyses. It assumes a basic knowledge of the R language and a strong familiarity with the paper.

 

Purpose

The main purpose of this file is to analyze data and generate figures. The secondary purpose is to collate all analyses so that users can reproduce the analyses in the paper. A third purpose is to provide scripts that others might find useful when working with qPCR, metabarcoding or seining data.

 

Fig&TableSuppl.docx

Abstract

A report in Word generated from Fig&TableCode.Rmd.  The report includes supplementary tables and outputs of the main figures (which may not be identically formatted to those in the journal publication).

 

Purpose

The main purpose of this file was to make the supplementary tables available, but also to guide those interested in working with the code in Fig&TableCode.Rmd.

 

Schmelzle&Kinziger_occupancy.csv

Abstract

This file is derived from previously published data that compared seine detections of tidewater gobies with qPCR detections. These data should be cited as Schmelzle & Kinziger (2015), Dryad, https://doi.org/10.5061/dryad.6rs23.

Purpose

These data were used to compare qPCR with seining for detecting tidewater gobies.

Sharing/Access information

Code/Software

Files were created from Excel and R scripts used to summarize and organize the raw data into tables more easily analyzed by the R markdown file.

Methods

Water samples were taken for eDNA at four sites known to contain tidewater gobies in the past: 16 samples at Calleguas Creek (34° 06' 42" N, 119° 04' 54" W, Ventura County), 21 samples at Ormond Lagoon (34° 8' 23" N, 119° 11' 20" W, just west of NBVC, Naval Base Ventura County), and 17 samples at Santa Clara River Mouth (34° 4' 8" N, 119° 15' 53" W, Ventura County). In addition, two samples were taken at Mugu Lagoon (34° 5' 30" N, 119° 7' 21" W, Ventura County) and 20 samples were taken at Devereux Slough (34° 25' 4" N, 119° 52' 27" W, Santa Barbara County). I used commercial aquatic eDNA kits from Jonah Ventures® at US $90 each (this cost includes supplies, sequencing and bioinformatics). At all sites, nearshore water samples were taken while wearing latex gloves to reduce contamination with human DNA. Samples were then filtered through a 1-micron disk filter by pushing water through the filter with a 60 cc luer-lock syringe until the filter clogged (mean sample volume: 174 cc +/- 0.141 S.D.). Filter capsules were purged of water, filled with preservative (tris-EDTA), and refrigerated until they were express-shipped back to Jonah Ventures® for sequencing.

Seine hauls were taken at each water sampling site in one of the estuaries sampled for eDNA (Calleguas Creek; tidewater goby collection permit #PER0046428), matching the eDNA sites and effort. Seine hauls were on average 2.4 m wide over a distance of 6.4 m in 0.6 m water depth. Temperature was 21.2 °C, conductivity was 32 (close to seawater), and dissolved oxygen (DO) was 8.3 mg/L.

Jonah Ventures'® methods for qPCR and metabarcoding are summarized as follows. After the samples were received, DNA was extracted using the DNeasy Blood & Tissue Kit. Three qPCR replicates were run with a tidewater-goby-specific primer (Schmelzle & Kinziger 2016). For metabarcoding, the mitochondrial 12S ribosomal RNA (rRNA) gene was PCR-amplified from each genomic DNA sample using the MiFishUF and MiFishUR primers with spacer regions. Amplicon size and PCR efficiency were visually inspected, and amplicons were then cleaned by incubating. A second round of PCR was performed to complete the sequencing library construct. Final indexed amplicons from each sample were cleaned and normalized using SequalPrep Normalization Plates and then pooled.

Sample library pools were sent for sequencing on an Illumina NovaSeq 6000 (San Diego, CA) at the Texas A&M Agrilife Genomics and Bioinformatics Sequencing Core facility using the SP Reagent Kit v1.5 (500 cycles) (cat# 20028402). Raw sequence data were then demultiplexed using pheniqs v2.10, primers were removed with Cutadapt v3.4, and read pairs were merged and denoised, and chimeras removed, with vsearch v2.15.2. Exact sequence variants observed more than 7 times were assigned taxonomy with a custom best-hits algorithm and a reference database that combined GenBank and Jonah Ventures voucher sequence records, applying a consensus taxonomy at any taxonomic level with > 90% agreement across top hits. Raw sequences were vouchered through NCBI SRA (SRP# PRJNA924783).
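The 90% agreement rule can be illustrated with a short sketch. This is not the custom best-hits algorithm used by Jonah Ventures, whose details are not given here; it is a minimal Python illustration of one way to choose the finest taxonomic rank at which more than 90% of top hits share a name. The rank list, the input layout (one dictionary per top hit), and the placeholder taxon names are all assumptions.

```python
from collections import Counter

# Assumed rank order, coarsest to finest; the real pipeline may use more ranks.
RANKS = ["family", "genus", "species"]


def consensus_taxon(top_hits, threshold=0.90):
    """Return (rank, name) at the finest rank where one name exceeds `threshold`.

    `top_hits` is a list of dicts mapping rank -> name, one dict per top hit.
    Hits that lack a name at a rank count against agreement at that rank.
    Returns None if no rank reaches the threshold.
    """
    for rank in reversed(RANKS):  # try species first, then coarser ranks
        names = [hit[rank] for hit in top_hits if hit.get(rank)]
        if not names:
            continue
        name, count = Counter(names).most_common(1)[0]
        if count / len(top_hits) > threshold:
            return rank, name
    return None


if __name__ == "__main__":
    # Placeholder names: two top hits that agree on genus but not on species.
    hits = [
        {"family": "FamilyA", "genus": "GenusA", "species": "GenusA alpha"},
        {"family": "FamilyA", "genus": "GenusA", "species": "GenusA beta"},
    ]
    print(consensus_taxon(hits))
```

When top hits disagree at the species level, a rule of this kind falls back to genus or family, which is one way a metabarcoding result can end up less specific than a species name.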

Funding

United States Department of Defense, Award: Q2 DAR-Q N6923222MP001XY, Naval Base Ventura County - Point Mugu