Skip to main content

Data for: Two is better than one: Coupling DNA metabarcoding and stable isotope analysis improves dietary characterizations for a riparian-obligate, migratory songbird

Cite this dataset

Hoenig, Brandon et al. (2022). Data for: Two is better than one: Coupling DNA metabarcoding and stable isotope analysis improves dietary characterizations for a riparian-obligate, migratory songbird [Dataset]. Dryad.


While an increasing number of studies are adopting molecular and chemical methods for dietary characterization, these studies often employ only one of these laboratory-based techniques; an approach which may yield an incomplete, or even biased, understanding of diet due to each method’s inherent limitations. To explore the utility of coupling molecular and chemical techniques for dietary characterizations, we applied DNA metabarcoding alongside stable isotope analysis to characterize the dietary niche of breeding Louisiana waterthrush (Parkesia motacilla), a migratory songbird hypothesized to preferentially provision their offspring with pollution-intolerant, aquatic arthropod prey. While DNA metabarcoding was unable to determine if waterthrush provision aquatic and terrestrial prey in different abundances, we found that specific aquatic taxa were more likely to be detected in successive seasons than their terrestrial counterparts, thus supporting the aquatic specialization hypothesis. Our isotopic analysis added greater context to this hypothesis by concluding that breeding waterthrush provisioned Ephemeroptera and Plecoptera, two pollution-intolerant, aquatic orders, in higher quantities than other prey groups, and expanded their functional trophic niche when such prey were not abundantly provisioned. Finally, we found that the dietary characterizations from each approach were often uncorrelated, indicating that the results gleaned from a diet study can be particularly sensitive to the applied methodologies. Our findings contribute to a growing body of work indicating the importance of high-quality, aquatic habitats for both consumers and their pollution-intolerant prey, while also demonstrating how the application of multiple, laboratory-based techniques can provide insights not offered by either technique alone.


Sample Collection 

Waterthrush nests were located by monitoring behavioral cues of adults on four headwater streams in the Laurel Highlands of Pennsylvania (Camp Run, Linn Run, Loyalhanna Creek, and Powdermill Run; see Mulvihill et al., 2008 for site descriptions) from mid-April to mid-July in the 2016 and 2017 breeding seasons. In both breeding seasons, nestling fecal samples were collected on up to two occasions and stored in 20 mL of absolute ethanol at -20°C until DNA extraction up to 3 months later (2016, n = 101; 2017, n = 72). In the 2017 breeding season, blood samples (n = 72) were collected from the brachial vein in 100 µL heparinized capillary tubes when nestlings were ~7 days post-hatch to ensure the nestlings were large enough to safely collect blood (>10g; Fair & Jones, 2010; McGuill & Rowan, 1989). Whole blood samples were kept chilled on ice packs (< 3 hours) until centrifugation to separate red blood cells from plasma, and red blood cells were kept frozen at -20°C until stable isotope analysis. Commonly detected dietary aquatic arthropod taxa were collected directly from Powdermill Run via D-net sampling in February 2020, while terrestrial arthropods were collected opportunistically from 2017 through 2019 using light traps stationed ~100 m from Powdermill Run. Aquatic arthropods were stored at -20°C while terrestrial arthropods were pinned and stored at room temperature before being prepared for stable isotope analysis. 

DNA-based Methods

DNA was extracted from individual nestling fecal samples following Trevelline et al. (2016), and the samples from individual nests were processed separately after thorough bleaching of equipment and bench surfaces to reduce the potential for contamination among nests (Trevelline, Nuttle, Hoenig, et al., 2018). Resulting total DNA extracts were subjected to a polymerase chain reaction (PCR) protocol following Hoenig, Trevelline, Nuttle, and Porter (2021) to amplify a hyper-variable region of the arthropod cytochrome c oxidase I gene using the general arthropod mini-barcoding primers ZBJ-ArtF1c and ZBJ-ArtR2c (Zeale, Butlin, Barker, Lees, & Jones, 2011); primers were first appended with 5’ adapter sequences complementary to the Nextera XT indexing primers required for sequencing on the Illumina MiSeq Platform (Trevelline et al., 2016). PCRs for each sample were repeated two additional times, and the resulting triplicate reactions were pooled and indexed following Illumina’s recommended indexing reaction protocol. Positive (Acroneuria carolinensis; Order: Plecoptera) and negative (DNA-free water) controls were included in all PCR batches and were pooled and indexed in the same manner as the nestling samples. Indexed amplicons were pooled to equimolar concentrations and sequenced on 250bp paired-end V2 Illumina MiSeq sequencing runs at the Genomics Facility of Biotechnology Resource Center at Cornell University (Ithaca, New York, USA; 2016 nestling samples) and the Department of Biological Sciences at Duquesne University (Pittsburgh, Pennsylvania, USA; 2017 nestling samples). Sequencing was performed with loading concentrations of 8pM (2016 samples) and 12pM (2017 samples) and included a 15% (2016 samples) or 20% (2017 samples) PhiX ‘spike-in’ to increase the complexity of the libraries and improve the quality of the sequencing runs. 

Resulting DNA sequencing reads were demultiplexed into separate FastQ files and only forward reads were retained for further analysis as the 250bp reads fully encompassed the DNA barcoding region and flanking primer sequences (~212bp). Using QIIME2 (Bolyen et al., 2019), sequencing reads were deposited into a shared file using the tools import function and forward and reverse primer sequences were removed using the cutadapt trim-single function; reads not containing both primer sequences in their entirety were removed from downstream analyses. The remaining sequencing reads were denoised and dereplicated using the DADA2 plug-in’s dada2 denoise-single function, leaving only unique, non-chimeric exact sequence variants (ESVs; Callahan et al., 2016). ESVs that appeared in fewer than two distinct samples (i.e. singletons) or were represented by fewer than 10 sequencing reads across all samples were removed to reduce the possibility of erroneous sequencing reads being retained in downstream analyses. A naïve Bayes classifier was trained on all North American arthropods within the Barcode of Life Data System (BOLD; Ratnasingham & Hebert, 2007) using the feature-classifier fit-classifier-naïve-bayes function. Using the classifier and the scikit classify-sklearn function with default arguments, ESVs were matched to the lowest possible taxonomic unit, and if an ESV matched multiple taxa with high confidence (e.g. two species), the lowest shared taxonomic classification was used to classify this sequence (i.e. genus-level); ESVs found in either the negative or positive control were removed from downstream analyses. Resulting taxonomic classifications were associated with the sequences and samples from which they were derived, and the community dataset was transformed to a presence-absence dataset. 

Stable Isotope Methods

Nestling blood samples were analyzed at the Cornell University Isotope Laboratory (COIL; Ithaca, New York, USA) where they were freeze-dried, ground, and homogenized before ~1mg of the sample was loaded into tin capsules. Representative samples from the five most commonly detected dietary prey groups in this study (e.g. Ephemeroptera, Plecoptera, Aquatic Diptera, Terrestrial Diptera and Lepidoptera; Fig. 1) were dried in open glass vials at 60°C for 72 hours, ground with a mortar and pestle, and sent to COIL before being loaded into tin capsules. Samples were analyzed for δ13C (an indicator of basal nutrient source; DeNiro & Epstein, 1978) and δ15N (an indicator of trophic position; DeNiro & Epstein, 1981) on a Thermo Scientific Delta V Advantage Isotope Ratio Mass Spectrometer coupled with a PN150 autosampler and a Carlo Erba NC2500 elemental analyzer. Isotope ratios were expressed in parts per mil deviations from each standard (‰) which were Vienna PeeDee Belemnite and atmospheric air for δ13C and δ15N, respectively. Percent composition of carbon and nitrogen were included and used in downstream isotopic mixing models (Pearson, Levey, Greenberg, & Del Rio, 2003; Phillips & Koch, 2002). COIL also analyzed deer tissue as an internal standard for quality assurance and reported the standard deviation for measurement as 0.06‰ and 0.07‰ for δ13C and δ15N, respectively. 

Prior to downstream mixing model analyses, nestling isotopic ratios were corrected with trophic discrimination factors (TDFs) to account for the difference in isotopic signature between predator and prey resulting from biochemical processes during nutrient assimilation (Macko, Estep, Engel, & Hare, 1986). To determine appropriate TDFs, we used Bayesian linear models (Goodrich, Gabry, Ali, & Brilleman, 2020; function stan_lm, package rstanarm, version 2.21.3) which included the isotope ratio of one element (i.e., δ13C or δ15N) as the response variable and sample type (i.e. nestling or arthropod) as the predictor variable. The TDFs for each element were defined as the slope coefficients from each model. Finally, to ensure that the findings of this study were robust and not heavily influenced by any single TDF value, we calculated 90% credible intervals around each slope coefficient and integrated these intervals into TDF estimates in downstream mixing model analysis. Models were run with four chains each with 2000 iterations, standard uniform priors specified by the stan_lm function, and the target average proposal acceptance probability set to 0.999 to eliminate divergent transitions. The TDF for δ13C was 2.35‰ [1.90‰ – 2.81‰], similar to the expected value based on the literature (Pearson et al., 2003; Post, 2002) and was used in all downstream analyses. The TDF for δ15N was 0.645‰ [0.012‰ – 1.278‰], about 2.7‰ lower than what is generally found in studies of birds (Post, 2002). However, all downstream analyses used the modeled δ15N TDF value for four reasons: 1) using a typical δ15N TDF value (3.4‰) suggested, impossibly, that the trophic level of nestling waterthrush was below that of their known dietary taxa; 2) in over 4100 nest site observations (B. Hoenig, unpublished data), very rarely (15 occurrences) did nestlings consume prey other than arthropods (i.e. salamanders), but never primary producers; 3) all nestling isotope values were within the minimum convex polygon outlined by the prey isotope values; and 4) the slopes for each sample type (i.e. nestling vs. arthropod) were statistically similar; all suggesting that shared trends in predator and prey isotopic signature were the result of prey consumption and not erroneous TDFs. Furthermore, studies have shown that ontogeny has an effect on δ15N TDFs, with younger, rapidly-developing birds exhibiting lower TDFs in controlled studies (Bearhop, Teece, Waldron, & Furness, 2000; Micklem, Connan, Stander, & McQuaid, 2021; Sears, Hatch, & O’Brien, 2009), suggesting that such an effect would be expected in this study of nestling songbirds.

Because the isotopes found in consumer tissues constitute a mixture of the isotopes derived from their prey (“animals are what they eat…”; DeNiro & Epstein, 1976), stable isotope mixing models can be used to assess the relative contributions of each prey source to a consumer’s assimilated diet (Phillips, 2001). Using the arthropod δ13C and δ15N values as sources, nestling values as mixtures, and the calculated TDFs and their credible intervals as mean correction factors and correction errors, respectively, we applied Bayesian stable isotope mixing models (BSIMMs) to determine the relative contribution of each potential prey group to the diets of all nestlings in the 2017 season (i.e. community-level), all nestlings found on the same stream (i.e. stream-level), and all nestlings within the same brood (i.e. nest-level). Because Ephemeroptera and Plecoptera had highly overlapping δ13C and δ15N values and share many life history traits (Brittain, 1990), they were grouped into ‘Ephemeroptera and Plecoptera’. BSIMMs were performed using the simmr_mcmc function within the SIMMR package (Parnell, Inger, Bearhop, & Jackson, 2010) with diffuse priors and varied iterations, burn-in, and thinning based on the groups of interest (community-wide, iterations = 1,000,000, burn-in = 10,000, thinning = 100; stream-wide, iterations = 1,000,000, burn-in = 100,000, thinning = 1,000; nest-wide, iterations = 100,000, burn-in = 10,000, thinning = 100). The standard ellipse area (a proxy for isotopic niche breadth; Newsome, Martinez del Rio, Bearhop, & Phillips, 2007) of each nest was calculated using the functions within the SIBER package (Jackson, Inger, Parnell, & Bearhop, 2011), which follows a similar Bayesian framework as SIMMR (diffuse priors, iterations = 100,000, burn-in = 1,000, thinning = 10). 

Statistical Analyses

All statistical analyses were performed in R (Version 4.1.0; R Core Team, 2021) and summary statistics are presented as the mean +/- one standard deviation. Frequency of occurrence (FOO) for the DNA-based methods are the result of dividing the number of diets in which a particular taxon was detected by the total number of diets sampled in a group of interest (e.g. all birds within a season). Tests of equal or given proportions (function prop.test package base) were performed to determine differences in FOO between breeding seasons, and permutational multivariate analysis (PERMANOVA; function adonis, package vegan; Oksanen et al., 2013) and non-metric multidimensional scaling (NMDS; function metaMDS, package vegan; Oksanen et al., 2013) using the Jaccard index (Jaccard, 1908) were used to identify annual differences in prey composition; one sample from the 2016 breeding season did not return any species-level prey classifications and was therefore removed from this analysis. For samples collected in 2017, stream-by-stream comparisons of family-level taxonomic richness of different prey groups were performed using an analysis of variance (ANOVA) for each major prey group followed by a post-hoc Tukey’s test to determine the degree to which streams differed. Similar comparisons using dietary contribution of each prey group from our isotopic analyses were performed by comparing probabilities of one prey group contributing more than the other being compared (function compare_groups package SIMMR; Parnell et al., 2010). We calculated the Shannon diversity index (Shannon, 1948) and the standardized Levin’s diversity index (Levins, 2020; Trevelline, Nuttle, Hoenig, et al., 2018) using all identified prey found in the samples collected from the 2017 breeding season. Using Bayesian linear models with standard uniform priors and target average proposal acceptance probability set to 0.999 (Goodrich, Gabry, Ali, & Brilleman, 2020; function stan_lm, package rstanarm, version 2.21.3), we compared these values to the mean isotopic niche breadth of each nest containing at least 3 nestling isotopic samples (n = 11). To determine the relationship between isotopic niche breadth and the relative contribution of the four common prey groups for each nest, we used Bayesian linear mixed effect models with default priors, the mean isotopic niche breadth as the response variable, the mean contribution estimate of each prey group as the predictor variable, and random intercepts among the streams (LMMs; Goodrich et al., 2020; function stan_lmer, package rstanarm, version 2.21.3). Effect sizes are presented with either 95% confidence intervals or 90% credible intervals for Bayesian statistics, and test values and p-values were calculated at 𝛼=0.05. 


American Ornithological Society, Award: Hesse Award

The National Aviary, Award: Avian Conservation Endowment