A comparison of biomonitoring methodologies for surf zone fish communities

Gold, Zachary 1

Research facility: University of California, Los Angeles

Published May 05, 2023 on Dryad. https://doi.org/10.5068/D1SQ4Z

Data files

May 05, 2023 version files 2.17 GB

anacapa_20230101.zip

116.30 MB
PB10152019-Dap551.zip

2.05 GB
README.md

19.67 KB

Abstract

Surf zones are highly dynamic marine ecosystems that are subject to increasing anthropogenic and climatic pressures, posing multiple challenges for biomonitoring. Traditional methods such as seines and hook and line surveys are often labor intensive, taxonomically biased, and can be physically hazardous. Emerging techniques, such as baited remote underwater video (BRUV) and environmental DNA (eDNA) are promising nondestructive tools for assessing marine biodiversity in surf zones of sandy beaches. Here we compare the relative performance of beach seines, BRUV, and eDNA in characterizing community composition of bony (teleost) and cartilaginous (elasmobranch) fishes of surf zones at 18 open coast sandy beaches in southern California. Seine and BRUV surveys captured overlapping, but distinct fish communities with 50% (18/36) of detected species shared. BRUV surveys more frequently detected larger species (e.g. sharks and rays) while seines more frequently detected one of the most abundant species, barred surfperch (Amphistichus argenteus). In contrast, eDNA metabarcoding captured 88.9% (32/36) of all fishes observed in seine and BRUV surveys plus 57 additional species, including 15 that frequent surf zone habitats. On average, eDNA detected over 5 times more species than BRUVs and 8 times more species than seine surveys at a given site. eDNA approaches also showed significantly higher sensitivity than seine and BRUV methods and more consistently detected 31 of the 32 (96.9%) jointly observed species across beaches. The four species detected by BRUV/seines, but not eDNA were only resolved at higher taxonomic ranks (e.g. Embiotocidae surfperches and Sygnathidae pipefishes). In frequent co-detection of species between methods limited comparisons of richness and abundance estimates, highlighting the challenge of comparing biomonitoring approaches. Despite potential for improvement, results overall demonstrate that eDNA can provide a cost-effective tool for long-term surf zone monitoring that complements data from seine and BRUV surveys, allowing more comprehensive surveys of vertebrate diversity in surf zone habitats.

Study Sites

To compare the effectiveness of seine, BRUV, and eDNA survey techniques for monitoring surf zone bony (teleost) and cartilaginous (elasmobranch) fish communities, we deployed the three survey techniques contemporaneously at 18 sandy beach sites across southern California, USA (Figure 1;Table S1); 14 on the California Channel Islands and 4 on the mainland. These represent novel fish community surveys for all but three of the mainland sites, providing important baseline data for fish assemblages. To maximize comparability, we surveyed surf zones using all three methods at each location on the same day using the methods described below. All surveys were conducted between August 15, 2018 and November 2, 2018. At one site, Soledad beach, on Santa Rosa Island, we were unable to conduct the BRUV surveys due to hazardous surf conditions.

Beach seine surveys

Beach seine surveys were employed using methods modified from the California Department of Fish and Wildlife (Monterey, CA, USA) (Carlisle, Schott & Abrahamson, 1960) using a 15.3 m (50 ft) x 1.8 m (6 ft) seine net (10 mm knotless nylon mesh, 2 m poles, 20 m leader ropes) with a bag, floats, and weighted lead line. At each site, we conducted seine hauls in the surf zone at four locations spaced haphazardly along the beach. For each seine haul, two researchers opened the beach seine parallel to shore in approximately 1.5 m of water. Keeping the weighted line flush with the bottom, we dragged the seine perpendicular to the shoreline until reaching the beach. Fish were then immediately removed from the seine, placed in aerated 1 m x 0.5 m x 0.5 m live wells, identified, enumerated, measured (total and standard length) on glazed (smooth) fish boards, and released alive at the site of capture in accordance University of California Santa Barbara’s Institutional Animal Care and Use (IACUC) protocol #943. Any fish that appear to be severely injured, moribund, or that did not recover from the stress of trapping were euthanized using Tricaine methanosulfonate (MS-222), a non-inhaled agents approved in the “AVMA Guidelines for the Euthanasia of Animals: 2013 Edition” for finfish [76].

Baited remote underwater video (BRUV) surveys

We conducted BRUV surveys following methods modified from Vargas-Fonseca et al.[77] and Borland et al. [25]. Each BRUV consisted of a GoPro Hero2 camera (GoPro Inc., San Mateo, California, USA, 2020) mounted on a five kg weight with a line and float attached for ease of retrieval. We then attached a bait bag containing ~152 g of frozen squid (Loligo sp.) to the weight with a PVC pipe, positioning it one meter in front of the camera. Snorkelers deployed three haphazardly spaced BRUV units along the outer edge of the surf zone at a depth of greater than two meters within two hours of low tide after conducting sein hauls, except for at sites where sufficient personnel allowed for concurrent sampling. We deployed each BRUV for one hour, producing three hours of video per beach. We reviewed videos to determine fish abundance, species richness, and community composition, using the MaxN statistic, the maximum number of individuals of one species in one frame during the hour-long footage [78].

Environmental DNA (eDNA) surveys

We collected three replicate 0.5 L samples of seawater (herein referred to as sample replicates) using sterile collapsible enteral feeding bags at each site. We then gravity filtered samples through 0.2 µm Sterivex filters following the methods of Gold et al. [79] (See Supplement for detailed methods description), storing filters at -20˚C prior to extraction via a modified Qiagen DNAeasy Blood and Tissue kit (Qiagen Inc., Gernmantown, MD, USA) [80]. Four field blanks consisting of 1 L of tap water were filtered in the field to serve as field and extraction controls. We amplified eDNA samples in triplicate using both 12S MiFish Universal teleost (MiFish-U) and elasmobranch (MiFish-E) primer sets [81], and then prepared sequencing libraries preparation followed Gold et al. [79] using Nextera Unique Dual Indices (Illumina, San Diego, CA, USA). Each unique PCR reaction included both positive and negative PCR controls; negative controls substituted molecular grade water in place of the DNA extraction, and we used either American alligator (Alligator mississippiensis) or dromedary (Camelus dromedarius), species non-native to California, for positive controls. In total, 2 positive controls, 5 PCR negative controls, and 4 field blanks were sequenced. We pooled all samples in equimolar concentrations by primer set except negative controls (5µL were added given no quantifiable DNA), resulting in a MiFish-U and a MiFish-E library which were separately sequenced on NextSeq PE 2 x 150 bp mid-output at the Technology Center for Genomics & Bioinformatics at the University of California – Los Angeles (UCLA) with 20% PhiX added to both sequencing runs.

Reference Barcode Generation

To supplement the California Current Large Marine Ecosystem reference database [79], we generated a MiFish 12S Universal Teleost specific reference sequence for white seabass (Atractoscion nobilis). Tissue was acquired from the California Current Cooperative Fisheries Investigation collections and extracted using a Qiagen DNAeasy Blood and Tissue kit (Qiagen Inc., Gernmantown, MD, USA). Tissue was amplified following the same PCR protocol for MiFish 12S Universal Teleost primer set described above. PCR products were purified using ExoSAP-IT (Affymetrix, Cleveland, OH, USA) and Sanger sequenced in both directions using BigDye chemistry (Applied Biosystems Inc, Foster City, CA, USA) at Laragen Inc., (Culver City, CA, USA) following Gold et al. [79].

eDNA bioinformatics

We processed the resulting eDNA metabarcoding sequences using the Anacapa Toolkit (version 1) (Curd et al., 2019), conducting quality control, amplicon sequence variant (ASV) parsing, and taxonomic assignment. Taxonomy was assigned using the Bayesian Lowest Common Ancestor classifier [82] and a curated reference database composed of fishes from the California Current Large Marine Ecosystem supplemented with the generated Atractoscion nobilis sequence following [79]; See detailed description in supplement). We processed each sequencing library twice using two different barcoding reference libraries. First, to assign taxonomy to marine mammalian and avian species, we used the CRUX-generated-12S database, comprised of reference barcodes for all publicly available 12S barcodes. Second, we used a curated metabarcoding database specific to California coastal marine fish to generate taxonomic assignments for fishes. We employed a Q score cutoff of 30 and Bayesian taxonomic cutoff score of 60 following the methods of Gold et al. [79]. The resulting taxonomic tables were transferred into R for further processing (R Core Team, 2020a). Reads from multiple ASVs with identical assigned taxonomic paths were summed together following the methods of Gold et al. [79]. For example, both ASV 1 and ASV 3 were assigned to Atherinops affinis with 1.5 million reads and 1.4 million reads respectively and were thus all reads assigned to both ASVs were summed.

We employed a multifaceted decontamination approach described in Gold et al. [68] developed by Kelly et al. [65] to remove field contamination, lab contamination, and index hopping [65,83–85]. Through the decontamination process we implemented a hierarchical site occupancy modeling framework to distinguish occupancy rates across multiple sample bottle and technical replicate detections [65]. Only ASVs detected in at least two technical replicates in a site were kept. From these processes, we obtained decontaminated eDNA species-by-sample community tables with counts of total sequence reads. We then summed the total reads of ASVs by assigned taxonomy including multiple ASVs from the two MiFish markers employed (e.g. summed 41 ASVs assigned to Black surfperch Embiotica jacksoni). From these processes, we obtained decontaminated eDNA species-by-sample community tables with counts of total sequence reads. We note that this approach does not use sequenced negative controls or field blanks to correct reads as previous work has demonstrated that this frequently removes the most abundant ASVs which arise as a result of index hopping [65].

We transformed the eDNA read counts into eDNA index scores according to Kelly et al. [65], which normalizes the read count per technical PCR replicate per species. This index was computed by first calculating the proportional relative abundance of each species in each technical PCR replicate. The relative abundance was then divided by the maximum relative abundance for a given species across all samples (e.g. highest observed abundance for Northern Anchovy is 15% of all total reads which serves as the denominator), yielding the eDNA index score, which ranges from 0 to 1 and allows for comparisons of relative abundance for specific taxa across samples. The goal of applying this transformation is to account for the effects of amplification bias across taxa [86–88].

We acknowledge that such an approach loses information about relative abundance between taxa in a sample. In particular, this approach likely over-inflates the abundance of rare taxa [88]. However, recent metabarcoding frameworks have highlighted that the compositional nature of metabarcoding [89] alongside species-specific amplification biases [87,90], impair our ability to make adequate inferences from raw sequence counts or proportions between taxa without ancillary information on either the underlying target taxa abundances or amplification efficiencies [87,90–93]. Thus, in the absence of such information, we follow the conservative methods advised by Kelly, Shelton, and Gallego [88] and employ the eDNA index, erring on the side of amplification efficiency bias being the largest contributor to observed differences in read counts across fish species from eDNA samples [72,87,94]

Data analysis

To explore the relative efficacy of seines, BRUV, and eDNA surveys for characterizing surf zone fish communities, we compared the total number of teleost and elasmobranch species identified by each method using the phyloseq (version 1.28.0) and vegan packages (version 2.5-7) [95,96] in R (version 3.6.1 [97]). Comparisons were made in two ways: 1) all detected fish taxa and 2) only surf zone fish taxa. Surf zone taxa were determined using habitat descriptions from FishBase.org and the literature [4,98–100] (Table S2). We determined and visualized the overlapping and unique fish species detected by each survey method across all 18 sites using the VennDiagram package (version 1.6.20) [101], comparing species richness of each method using Analysis of Variance (ANOVA) and post-hoc Tukey tests using the vegan package [96].

To examine survey method performance on a site-by-site basis, we calculated and compared the overlap of presence/absence site-species detections [97,102]. Here, we define a site-species detection as the detection of a species at a given site (e.g., Top smelt detected at Bechers Bay). Comparisons of site-species detections were conducted for both the surf zone fishes and all fishes, observed by seine, BRUV, and eDNA, respectively. We estimated sample coverage, the fraction of the total incidence probabilities of the discovered species for a set of sampling units, from rarefaction and extrapolation models for species richness (Hill number q=0) for each method using the iNext package (version 2.0.20)[103].

To determine whether the presence or absence of a species is a true reflection of biological reality or due to issues in the sampling process. [104,105], we also conducted a site-occupancy analysis of species detections at each site following the methods of Chambert et al. [85] as implemented by Kelly et al. [65,68]. We note this site occupancy analysis is separate from the method implemented in the decontamination process and was conducted on the final quality controlled data set. The binomial model yields the likelihood that a taxon detected is truly present in the sample. The model, implemented in Stan for R [104], depends upon three parameters: 1) the commonness of a taxon in the dataset (denoted Psi), 2) the probability of a detection given that the taxon is truly present (true positive; denoted P11), and 3) the probability of a detection given that the taxon was not truly present (false positive; denoted P10).

Probability of Ocurrence: (psi*p11^N*(1-p11)^(K-N)) / (psi*p11^N*(1-p11)^(K-N)+(1-psi)*p10^N*(1-p10)^(K-N)

Where K is the number of samples taken within a site and N is the number of species detections within a site (See Supplemental methods for detailed description). For each species we calculated the number of detections out of the number of replicate surveys taken at each site.

We emphasize that the true occupancy of any given species at any site is unknown. Here, we are estimating the true positive (P11) and false positive (P10) of species being present at a given site using detections across repeated sampling events. Importantly, we make the explicit assumption that any detection of a species by a method is a real detection of that species.

In addition to probability of occurrence we also calculated the mean sensitivity, the proportion of true positive detections correctly identified as positive using the following equation for each species:

Sensitivity = p11 / (p11+p10)

We also calculated the mean specificity, the proportion of true negative detections correctly identified as negative, using the following equation for each species:

Specifcity = 1-p10 / ((1-p10)+(1-p11))

We then compared the probability of occupancy, mean sensitivity, and mean specificity of each method across all species detected [106]. We further compared differences in the eDNA-derived probability of occurrence of surf zone and non-surf zone associated species to test if occupancy rates are a potential function of transport dynamics.

To analyze differences in the composition of surf zone fish detected among methods and across sites, we conducted a PERMANOVA and companion multivariate homogeneity of group dispersions on Jaccard-Binary dissimilarity indices based on presence/absence data using the adonis and betadisper functions in the vegan package [96]. The PERMANOVA was conducted using the following model:

Detection ~ Survey Method + Site.

We excluded the Soledad site on Santa Rosa Island given the lack of a BRUV survey. We further visualized community beta diversity among sampling methods using a constrained canonical analysis of principal components (CAP) through the vegan package [95,97].

Lastly, to assess the ability of eDNA to capture relative abundance, we compared mean eDNA index scores to both the average catch counts per seine as well as average MaxN counts per BRUV station using species-specific linear regressions. Similarly, we compared BRUV-derived average MaxN counts against average seine counts. We focused our analyses on species detected jointly by each method at three or more sites. We note that comparing uncorrected compositional results (eDNA metabarcoding data) to estimates of absolute abundance (BRUV MaxN and seine counts) is inherently flawed [31,87]. We present such results here to highlight the caveats of such approaches and discuss their merits (See Discussion).

A comparison of biomonitoring methodologies for surf zone fish communities

Data files

Abstract

Methods

Usage notes

Works referencing this dataset