
Data and code to replicate: Diet analysis using generalized linear models derived from foraging processes using R package mvtweedie

Citation

Thorson, James T.; Arimitsu, Mayumi L.; Levi, Taal; Roffler, Gretchen H. (2021), Data and code to replicate: Diet analysis using generalized linear models derived from foraging processes using R package mvtweedie, Dryad, Dataset, https://doi.org/10.5061/dryad.08kprr53h

Abstract

Diet analysis integrates a wide variety of visual, chemical and biological identification of prey.  Samples are often treated as compositional data, where each prey is analyzed as a continuous percentage of the total.  However, analyzing compositional data results in analytical challenges, e.g., highly parameterized models or prior transformation of data.  Here, we present a novel approximation involving a Tweedie generalized linear model (GLM).  We first review how this approximation emerges from considering predator foraging as a thinned and marked point process (with marks representing prey species and individual prey size).  This derivation can motivate future theoretical and applied developments.  We then provide a practical tutorial for the Tweedie GLM using new package mvtweedie that extends capabilities of widely used packages in R (mgcv and ggplot2) by transforming output to calculate prey compositions.  We demonstrate this approach and software using two examples. Tufted puffins (Fratercula cirrhata) provisioning their chicks on a colony in the northern Gulf of Alaska show decadal prey switching among sand lance and prowfish (1980-2000) and then Pacific herring and capelin (2000-2020), while wolves (Canis lupus ligoni) in Southeast Alaska forage on mountain goats and marmots in northern uplands and marine mammals in seaward island coastlines. 

Usage Notes

File list 

Reproducible_script_R1.R

Wolf.csv

Seabird.csv

MDO.seabirdforagingarea.SST.csv

Description

Reproducible_script_R1.R -- R script used to replicate all analyses and figures in the main text and appendices. See the comments at the top of the script for directions before running it.

Wolf.csv -- CSV file containing four columns used in the wolf metabarcoding case study in Fig. 3 of the main text:

  1. “Latitude” -- Latitude of scat sample in decimal degrees;
  2. “Longitude” -- Longitude of scat sample;
  3. “group” -- prey taxonomic group used in analysis;
  4. “Response” -- metabarcoding read count used as response variable.

Seabird.csv -- CSV file containing three columns used in the seabird bill-load case study in Fig. 2 of the main text:

  1. “Year” -- Year AD for bill-load sample;
  2. “group” -- prey taxonomic group used in analysis;
  3. “Response” -- bill-load count used as response variable.
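
The long layout described above (one row per Year and prey group) can be turned into yearly prey proportions with a few lines of code. The following is a minimal sketch in Python, independent of the deposited R script and not the Tweedie GLM used in the paper: it only computes raw shares of the summed Response. It assumes the CSV header names match the column names listed above exactly. The function name prey_proportions is hypothetical, and the sample rows are made-up values, not data from Seabird.csv.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the documented Seabird.csv layout (NOT values from the dataset).
SAMPLE = """Year,group,Response
1980,sand lance,6
1980,prowfish,2
2010,capelin,3
2010,Pacific herring,1
"""


def prey_proportions(handle):
    """Return {year: {group: share of that year's summed Response}}."""
    totals = defaultdict(lambda: defaultdict(float))
    for row in csv.DictReader(handle):
        totals[int(row["Year"])][row["group"]] += float(row["Response"])

    result = {}
    for year, groups in totals.items():
        year_total = sum(groups.values())
        if year_total > 0:  # skip years with no recorded prey
            result[year] = {g: v / year_total for g, v in groups.items()}
    return result


proportions = prey_proportions(io.StringIO(SAMPLE))
for year in sorted(proportions):
    print(year, proportions[year])
```

To run the same function on the deposited file, pass an open file handle instead of the sample string, for example open("Seabird.csv", newline="").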

MDO.seabirdforagingarea.SST.csv -- CSV file containing two columns of additional data used in the seabird bill-load case study in Fig. 2 of the main text:

  1. “Year” -- Year AD, covering all years used in Fig. 2;
  2. “SST_mean” -- average sea surface temperature near Middleton Island.
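
Because the sea surface temperature file and Seabird.csv both carry a “Year” column, the temperature can be attached to each bill-load row by matching on year. The sketch below illustrates that join in Python under stated assumptions: header names match the column names listed above, every value parses as a number, and the helper name attach_sst is hypothetical. The sample rows are made-up values, not data from either file, and the deposited R script may combine the files differently.

```python
import csv
import io

# Made-up rows in the documented layouts (NOT values from the dataset).
SEABIRD = """Year,group,Response
1980,sand lance,6
2010,capelin,3
"""
SST = """Year,SST_mean
1980,9.5
2010,10.25
"""


def attach_sst(seabird_handle, sst_handle):
    """Add an SST_mean field to each seabird row, matched on Year."""
    sst_by_year = {
        int(row["Year"]): float(row["SST_mean"])
        for row in csv.DictReader(sst_handle)
    }
    merged = []
    for row in csv.DictReader(seabird_handle):
        year = int(row["Year"])
        merged.append({
            "Year": year,
            "group": row["group"],
            "Response": float(row["Response"]),
            # None marks a year missing from the temperature file.
            "SST_mean": sst_by_year.get(year),
        })
    return merged


merged = attach_sst(io.StringIO(SEABIRD), io.StringIO(SST))
for row in merged:
    print(row)
```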