Model for evaluating seabirds preferences for hake offal in Patagonia
Data files
Jul 18, 2024 version (files: 41.62 KB)
- log_sea.xlsx
- README.md
Abstract
This data set was used to build a model to evaluate the seabirds' preferences for hake offal derived from an artisanal fishery in Patagonia.
The main results are as follows. Fishers contribute to seabirds mainly by offering them the offal from hake catches. Seabirds consumed hake liver 99% of the time and stomach much less often (24%). Southern giant petrels and black-browed albatrosses consumed more liver, while kelp gulls ate more stomach. The liver comprises 51.6% fat, which is essential for high trophic level marine predators such as black-browed albatrosses.
README: Model for evaluating seabirds preferences for hake offal in Patagonia
Dataset Description
We investigated the probability that the seabird assemblage consumed specific hake offal items: gonad, liver, and stomach. For this study, we defined a seabird assemblage attending a fishing boat in a single sampling period as the seabird species present, together with their abundances. In each sampling period, we randomly threw the offal items one by one from the boat into the sea. Consumption was coded as “0” if no seabird consumed an item and “1” if one or more seabirds consumed it. We conducted twenty-four sampling periods, totaling 1298 observations of item consumption. The binary response variable was “offal consumption by seabirds” (consumed or not consumed), and the explanatory variables were ‘types of offal’ (fixed factor), ‘seasons’ (fixed factor), ‘sampling periods’ (random factor), and ‘abundance of seabird assemblage’ (random factor).
This dataset (log_sea.xlsx) contains observations of hake offal consumption by seabirds. It includes the response variable consum and the explanatory variables items, seasons, times, and abun.
Column Descriptions
- consum: Binary variable indicating consumption by seabirds (0 = not consumed, 1 = consumed).
- items: Categorical variable indicating the type of hake offal (gonad, liver, stomach).
- times: Integer variable indicating the number of the experimental survey (sampling period).
- abun: Integer variable indicating total seabird abundance around a fishing boat.
- seasons: Categorical variable indicating the season of sampling (winter, spring, summer).
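The column types above can be enforced before modelling. The following is a minimal sketch only: the helper name prepare_log_sea is hypothetical, the toy rows are invented for illustration, and reading the file with the readxl package is an assumption, since the README does not say how the spreadsheet was loaded.

```r
# Hypothetical helper (not part of the original analysis): coerce the
# columns described above to the types the models expect.
prepare_log_sea <- function(raw) {
  required <- c("consum", "items", "seasons", "times", "abun")
  missing_cols <- setdiff(required, names(raw))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  out <- as.data.frame(raw)
  out$consum  <- as.integer(out$consum)  # 0 = not consumed, 1 = consumed
  out$items   <- factor(out$items)       # type of hake offal
  out$seasons <- factor(out$seasons)     # season of sampling
  out$times   <- factor(out$times)       # sampling period (grouping factor)
  out$abun    <- as.integer(out$abun)    # seabird abundance around the boat
  # The response must be strictly binary
  stopifnot(all(out$consum %in% c(0L, 1L)))
  out
}

# Assumed usage; needs the readxl package and the file in the working directory:
# log_sea <- prepare_log_sea(readxl::read_excel("log_sea.xlsx"))

# Toy rows, invented for illustration (not taken from the dataset)
toy <- data.frame(
  consum  = c(1, 0, 1),
  items   = c("liver", "stomach", "gonad"),
  seasons = c("winter", "winter", "spring"),
  times   = c(1, 1, 2),
  abun    = c(35, 35, 12)
)
toy_clean <- prepare_log_sea(toy)
str(toy_clean)
```

Converting times to a factor is optional, since glmer treats grouping variables as factors anyway; doing it up front simply makes the intent explicit.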
Data Preparation and Libraries Used
The following libraries were used for the analysis:
library("lme4")
library("nlme")
library("ggplot2")
library("MuMIn")
library("emmeans")
library("pROC")
library("ggeffects")
Models and Analysis
Logistic Regression Models
Six candidate logistic regression models, with and without random effects, were fitted to determine the effect of different variables on consumption:
model_1 <- glmer(consum ~ items + seasons + (1 | times) + (1 | abun), data = log_sea, family = binomial)
model_2 <- glmer(consum ~ items + seasons + (1 | times), data = log_sea, family = binomial)
model_3 <- glm(consum ~ items + seasons, data = log_sea, family = binomial)
model_4 <- glmer(consum ~ items + (1 | times), data = log_sea, family = binomial)
model_5 <- glm(consum ~ items, data = log_sea, family = binomial)
model_6 <- glmer(consum ~ 1 + (1 | times), data = log_sea, family = binomial)
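As a reading aid, the structure of model_4 (offal type as a fixed effect, sampling period as a random intercept) can be written out as below. The symbols are illustrative and do not appear in the original README: $y_{ij}$ is the consum value of observation $i$ in sampling period $j$, $\beta_0$ is the intercept for the reference offal type, $\beta_{\text{item}[i]}$ is the offset for the offal type of observation $i$ (zero for the reference level under R's default treatment contrasts), and $u_j$ is the random intercept for the sampling period.

```latex
y_{ij} \sim \mathrm{Bernoulli}(p_{ij}), \qquad
\operatorname{logit}(p_{ij}) = \log\frac{p_{ij}}{1 - p_{ij}}
  = \beta_0 + \beta_{\text{item}[i]} + u_j, \qquad
u_j \sim \mathcal{N}(0, \sigma_u^2)
```

The other candidates add a seasons term or a second random intercept for abun, or drop terms, following the same pattern.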
Model Selection and Evaluation
The models were evaluated using Akaike Information Criterion (AIC) and corrected AIC (AICc):
# Calculate AIC
aic_values <- data.frame(
  Model = c("model_1", "model_2", "model_3", "model_4", "model_5", "model_6"),
  AIC = c(AIC(model_1), AIC(model_2), AIC(model_3), AIC(model_4), AIC(model_5), AIC(model_6))
)
aic_values <- aic_values[order(aic_values$AIC), ]
print(aic_values)
# Calculate AICc
aic_values <- data.frame(
  Model = c("model_1", "model_2", "model_3", "model_4", "model_5", "model_6"),
  AICc = c(AICc(model_1), AICc(model_2), AICc(model_3), AICc(model_4), AICc(model_5), AICc(model_6))
)
aic_values <- aic_values[order(aic_values$AICc), ]
aic_values$Delta_AICc <- aic_values$AICc - min(aic_values$AICc)
aic_values$Akaike_Weight <- exp(-0.5 * aic_values$Delta_AICc) / sum(exp(-0.5 * aic_values$Delta_AICc))
aic_values$Cumulative_Akaike_Weight <- cumsum(aic_values$Akaike_Weight)
print(aic_values)
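The quantities computed in the code above correspond to the standard small-sample correction and Akaike weight formulas, restated here for reference. The notation is an assumption of this sketch: $k$ is the number of estimated parameters in a model, $n$ is the number of observations, and $i$, $j$ index the six candidate models.

```latex
\mathrm{AICc} = \mathrm{AIC} + \frac{2k(k+1)}{n - k - 1}, \qquad
\Delta_i = \mathrm{AICc}_i - \min_{j} \mathrm{AICc}_j, \qquad
w_i = \frac{\exp(-\Delta_i / 2)}{\sum_{j=1}^{6} \exp(-\Delta_j / 2)}
```

The weights $w_i$ sum to one, so the Cumulative_Akaike_Weight column reaches 1 at the last row of the sorted table.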
Final Model Selection
The final model selected based on AIC and AICc values is model_4:
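# Final model (same specification as model_4 above)
model_4 <- glmer(consum ~ items + (1 | times), data = log_sea, family = binomial)
summary(model_4)
# Compare against model_2, which additionally includes seasons
anova(model_2, model_4, test = "Chi")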
Post-hoc Analysis
Post-hoc analysis was conducted using the emmeans package:
lsmeans(model_4, pairwise ~ items)
emmeans(model_4, pairwise ~ items, adjust = "tukey")
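For a binomial model, emmeans reports estimates on the logit (log-odds) scale by default, so they usually need converting before interpretation. The sketch below shows the conversion in base R. The numeric values and the names item_a, item_b, and item_c are invented for illustration and are not results from this dataset.

```r
# Hypothetical logit-scale estimates, invented for illustration only
logit_est <- c(item_a = 0, item_b = 2, item_c = -1)

# Inverse logit: probability = 1 / (1 + exp(-x))
prob_est <- plogis(logit_est)
print(round(prob_est, 3))

# A pairwise difference on the logit scale is a log odds ratio;
# exponentiating it gives the odds ratio between the two items.
log_or_b_vs_a <- logit_est[["item_b"]] - logit_est[["item_a"]]
odds_ratio_b_vs_a <- exp(log_or_b_vs_a)
print(odds_ratio_b_vs_a)
```

With emmeans itself, passing type = "response" to the call requests the back-transformed scale directly.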
Model Fit and Visualization
The model fit was evaluated using ROC curves and predicted probabilities:
# ROC curve
roc_curve <- roc(log_sea$consum, predict(model_4, type = "response"))
plot(roc_curve)
# Predicted probabilities
ggpredict(model_4, c("items"), type = "fe")
df <- data.frame(
  items = c("gonads", "liver", "stomach"),
  predicted = c(0.55, 1.00, 0.24),
  lower_ci = c(0.40, 0.99, 0.15),
  upper_ci = c(0.68, 0.99, 0.37)
)
ggplot(df, aes(x = items, y = predicted)) +
  geom_bar(stat = "identity", fill = "blue") +
  geom_errorbar(aes(ymin = lower_ci, ymax = upper_ci), width = 0.4) +
  labs(x = "Item", y = "Predicted probability", title = "Predicted probabilities of consumption") +
  theme_classic()
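A ROC curve is commonly summarised by the area under it (AUC): the probability that a randomly chosen consumed item receives a higher predicted value than a randomly chosen unconsumed one. As a sketch of what that number means, the base R function below computes it from the rank-sum (Mann-Whitney) identity. The function name auc_rank and the toy vectors are hypothetical and are not part of the original analysis.

```r
# AUC via the rank-sum (Mann-Whitney) identity; tied predictions get average ranks
auc_rank <- function(observed, predicted) {
  stopifnot(length(observed) == length(predicted),
            all(observed %in% c(0, 1)))
  n_pos <- sum(observed == 1)
  n_neg <- sum(observed == 0)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(predicted)  # rank() averages ties by default
  (sum(r[observed == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Toy values, invented for illustration (not taken from the dataset)
obs  <- c(0, 0, 1, 1, 1, 0)
pred <- c(0.10, 0.40, 0.35, 0.80, 0.90, 0.20)
toy_auc <- auc_rank(obs, pred)
print(toy_auc)
```

Applied to the real data, auc_rank(log_sea$consum, predict(model_4, type = "response")) should agree with the value reported by pROC's auc(roc_curve), provided higher predicted values correspond to consumption.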
Conclusion
The logistic regression models help determine the effect of offal type on consumption by seabirds; model_4 was the best-supported model based on AIC and AICc values.