Skip to main content

Data from: Changes in Invertebrate Food Web Structure Between High- and Low-productivity Environments are Driven by Intermediate but Not Top Predator Diet Shifts

Cite this dataset

Miller-ter Kuile, Ana et al. (2022). Data from: Changes in Invertebrate Food Web Structure Between High- and Low-productivity Environments are Driven by Intermediate but Not Top Predator Diet Shifts [Dataset]. Dryad.


Predator-prey interactions shape ecosystem stability and are influenced by changes in ecosystem productivity. However, because multiple biotic and abiotic drivers shape the trophic responses of predators to productivity, we often observe patterns, but not mechanisms, by which productivity drives food web structure. One way to capture mechanisms shaping trophic response is to quantify trophic interactions among multiple trophic groups and by using complementary metrics of trophic ecology. In this study, we combine two diet-tracing methods: diet DNA and stable isotopes, for two trophic groups (top predators and intermediate predators) in both low- and high-productivity habitats to elucidate where in the food chain trophic structure shifts in response to changes in underlying ecosystem productivity. We demonstrate that while top predators show increases in isotopic trophic position (δ15N) with productivity, neither their isotopic niche size nor their DNA diet composition changes. Conversely, intermediate predators show clear turnover in DNA diet composition towards a more predatory prey base in high-productivity habitats. Taking this multi-trophic approach highlights how predator identity shapes responses in predator-prey interactions across environments with different underlying productivity, building predictive power for understanding the outcomes of ongoing anthropogenic change.


These are data and code associated with the terrestrial top and intermediate predator diets from Palmyra Atoll (2009-2017). These data include both stable isotope and diet DNA metabarcoding data for top predators (Araneae: Heteropoda venatoria) and diet DNA metabarcoding data for intermediate predators (Araneae: Neoscona theisi, Scytodes longipes, Keija mneon). These datasets include the final compiled datasets for diet DNA data (originally from Miller-ter Kuile et al. 2022) and raw data for stable isotopes for top predators. The code includes code to reproduce all data cleaning, merging, statistical analyses, and visualizations. 


Field site and collections

We conducted this work on Palmyra Atoll National Wildlife Refuge, Northern Line Islands (5º53’ N, 162º05’W). Palmyra Atoll has a well-characterized species list, and like many atolls, is relatively species poor, allowing for detailed characterization of potential diet items (Handler et al. 2007). Predator individuals used for diet DNA and stable isotope analyses were collected across two habitat types representative of "high" and "low" productivity (Pisonia grandis and Cocos nucifera, respectively; Young et al. 2010).

Isotope sample collection:

All individuals were collected individually and frozen. Then organisms for which only isotope data were derived, we used individual body parts (usually legs, but sometimes other body parts that did not include digested material). We initially froze samples and then they were dried at 55 degrees C, and finally ground into a powder. We submitted samples to the University of California Davis, Stable Isotope Facility (SIF, Davis, California, USA), which processes samples on a PDZ Europa ANCA-GSL elemental analyzer interfaced to a PDZ Europa 20-20 isotope ratio mass spectrometer (Sercon Ltd., Cheshire, UK). We corrected raw δ15N values using a mixing model specific to baseline sources on Palmyra Atoll (Young et al. 2013). This correction meant subtracting a δ15N_base from δ15N_consumer, following an equation to account for multiple baseline sources (marine and terrestrial in this system): 

δ15N_base = (δ15N_plants * alpha + δ15N_marine * (1-alpha)) / Delta. 


alpha = (δ13C_consumer - δ13C_marine)/(δ13C_plant - δ13C_marine)


Delta = 0.0034

DNA sample collection:

For diet DNA metabarcoding samples, we used a combination of methods, including individual collection during visual surveys for understory, and soil collections and canopy fogging with insecticide onto collection sheets for canopy individuals. All individuals were collected individually with sterilized implements (ethanol-burned forceps) in sterilized collection containers containing 95% EtOH to avoid contamination  (Greenstone et al. 2011). All individuals were stored in 95% EtOH at -20ºC before DNA extraction. 

DNA extraction, PCR amplification, library preparation, sequencing, and denoising

We individually measured the length of each predator (mm) and separated the thorax, opisthosoma, or trunk (depending on predator species, (Krehenwinkel et al. 2017, Macías-Hernández et al. 2018)) for DNA extraction following a modified CTAB extraction protocol (Fulton et al. 1995). While most individuals were run in separate samples (70%, n = 121/173), some individuals were too small to extract ample DNA from only one individual (mean size of 4.04 ± 0.12 mm in total length), and so we combined these individuals with other individuals from the same species, size range (within ± 0.5 mm in length), and sampling period. For these combined samples, we aimed for a minimum total sample weight of 5mg, and ideal sample weights of 10-20mg, a range we had previously determined to be sufficient for downstream DNA extraction and cleaning protocols. This resulted in a maxiumum of 12 individuals in one sample (SI Figure 6). Following methods in (Krehenwinkel et al. 2017), we standardized concentrations of 40uL of each sample to 20ng/ul and used Ampure XP (Agencourt, Beverly, MA, USA) beads to remove higher molecular weight predator DNA prior to PCR steps. We then amplified the CO1 gene, which is well-represented in online databases (Porter and Hajibabaei 2018) with general metazoan primers (mlCOIintF/Fol-degen-rev; (Yu et al. 2012, Leray et al. 2013, Krehenwinkel et al. 2017)). We ran total reaction volumes per sample of 25μL, with 9μL nuclease free water, 12.5μL GoTaq Green Master Mix (Promega Corp., Madison, WI, USA), 1.25μL of each primer (at 10mM), and 1μL of DNA template (at 10ng/μL) and ran a duplicate for each sample. We followed a PCR protocol as follows: 3 minutes at 95ºC, 35 cycles of: 95ºC for 30 seconds, 46ºC for 30 seconds, 72ºC for one minute; ending with 72ºC for five minutes. We removed reaction dimer with Ampure XP beads at 0.8x bead-to-DNA ratio. We then attached Illumina index primers (Nextera XT Index Kit v2) with 5μL of PCR product per reaction and the recommended PCR protocol for these primers (Illumina 2009). We combined and cleaned successfully amplified duplicate samples using Ampure XP beads (0.7x beads-to-DNA) and diluted each sample to 5nM in 10mM TRIS, using 1uL of each sample for sequencing. For sequencing runs, we multiplexed all samples along with one negative control and two PCR4-TOPO TA vectors (Invitrogen, Carlsbad, CA, USA) containing the internal transcribed spacer 1 region from two fungal species as positive controls (GenBank accession numbers: MG840195 and MG840196;  (Toju et al. 2012, Clark et al. 2016, Apigo and Oono 2018)). We submitted multiplexed samples for sequencing at the University of California, Santa Barbara Biological Nanostructures Laboratory Genetics Core. Samples were run on an Illumina MiSeq platform (v2 chemistry, 500 cycles, paired-end reads) with a 15% spike-in of PhiX. Following sequencing, samples were demultiplexed using Illumina’s bcl2fastq conversion software (v2.20) at the Core facility.  

We merged, filtered (max ee  = 1.0), and denoised (clustered) our sequences around amplicon sequence variants (ASVs) using the DADA2 algorithm in R (dada2 package version; Callahan et al., 2016). Prior to denoising with DADA2, we used cutadapt (version 1.18, (Martin 2011)) to remove primers from each sequence. We removed samples from analysis that had not been sequenced to sufficient depth using iNEXT (Hsieh et al. 2016) and a lower quantile cutoff. We rarefied remaining samples (McKnight et al. 2019) based on the sample with the lowest sequencing depth which had been sequenced with 95%+ sampling completeness based on iNEXT (version 2.0.20) interpolation and extrapolation methods (Hsieh and Chao 2017). We rarefied using the rrarefy() function in the vegan (version 2.5.6) package in R to 15,954 reads per sample.

ASV taxonomic assignment with BLAST and BOLD

From the output of the DADA2 algorithm, we created a list of unique ASVs which we matched to taxonomies both in the GenBank and BOLD databases. For GenBank, we used BLAST (version 2.7.1) with the blastn command for taxonomic assignment of each ASV using the computing cluster at UC Santa Barbara, comparing against the GenBank nucleotide database with an evalue of 0.01 (downloaded on November 20, 2019). We visualized and exported taxonomic alignment using MEGAN Community Edition (version 6.18.0, (Huson et al. 2016)), using default settings and selecting the subtree with all possible diet items for this species (Kingdom: Animalia, Clade: Bilateria). For BOLD taxonomic assignment, we used the BOLD IDEngine of the CO1 gene with Species Level Barcode Records (accessed May 21, 2020; 4,070,029 Sequences, 225,114 Species, and 104,607 Interim Species in database) to match each ASV list to taxonomies. We combined taxonomic assignments from both programs and discarded taxonomic assignments that were mismatched at the family level or higher (Elbrecht et al. 2017). We chose to combine prey taxonomies at the order level by summing the cumulative read abundances across the ASVs that corresponded to each diet order in each sample. We corrected for potential sequence jumping (‘cross-talk’) across samples by removing reads across samples that emerged in negative controls (Oono et al. 2020) and all DNA matching any predator family present on an individual sequencing run was removed as a conservative method to account for potential sequence jumping (‘cross-talk’) (van der Valk et al. 2020). We verified ASV specificity based on positive control samples.

Data analyses:

To examine how stable isotope-based trophic niche shifts with environmental context, we calculated two common trophic niche metrics (standard ellipse area: Layman et al. 2012, kernel utilization density: Eckrich et al. 2019). We calculated the 95% confidence interval for both metrics and used a generalized linear model to examine how habitat context shapes isotopic niche space. We also examined how each isotopic signature (?13C and ?15N) shifted individually with environmental context using a set of linear mixed effects models. We used Gaussian error distributions for all linear models and random effects of islet and year to account for spatial and temporal non-independence. All models included abiotic context (categorical variable: high vs. low productivity) as fixed effects (n = 88 individuals from high-; 64 from low-productivity habitats).

To examine how diet DNA shifts with habitat context for both top and secondary predators, we determined shifts in DNA diet niche (beta diversity) between the two environmental contexts using distance based redundancy analyses (Jupke and Schafer 2020). We ran one model for each predator category (n = 21 and 13 individuals for the top predator species in high- and low-productivity habitat, respectively; n = 23 and 8 secondary predators from each habitat, respectively) and used the Jaccard dissimilarity index based on the presence-absence nature of our data. In the event of dissimilarity in diet composition with environmental context (p-value <= 0.05), we determined whether dissimilarity (beta diversity) was based on turnover (shifting to new prey items) or nestedness (one prey community is a subset of the other). 

We ran all statistical analyses in R (version 4.0.2; R Core Team 2020) and cleaned data with the here (version 1.0.1, Müller 2020) and tidyverse packages (version 1.3.0, Wickham et al. 2019). We computed isotopic niches using the rKIN package (version 0.1, Albeke 2017), ran mixed effects models in the glmmTMB package (version 1.1.2, Brooks et al. 2017), and ran model diagnostics using the DHARMa (version 0.3.3, Hartig 2020) and effects (version 4.2-0, Fox 2003) packages. We ran distance based redundancy analyses using the vegan (version 2.5-7, Oksanen et al. 2020) and betapart (version 1.5.4, Baselga et al. 2021) packages. 

Usage notes

Programs required to open the data files in this project include R computing language and any spreadsheet software that can open .csv files.

Related Dataset: 

Miller-ter Kuile, Ana et al. (2021), Data from: Predator-prey Interactions of Terrestrial Invertebrates are Determined by Predator Body Size and Species Identity, Dryad, Dataset,


National Science Foundation of Sri Lanka, Award: 0639185

National Science Foundation of Sri Lanka, Award: 1457371

National Geographic Society, Award: 8574-08

National Geographic Society, Award: 9698-15

University of California, Santa Barbara

Stanford University