Replicate analysis from: Measuring complexity for hierarchical models using effective degrees of freedom
Data files
Apr 22, 2024 version files 5.42 MB
Abstract
Hierarchical models can express ecological dynamics using a combination of fixed and random effects, and measurement of their complexity (effective degrees of freedom, EDF) requires estimating how much random effects are shrunk towards a shared mean. Estimating EDF is helpful to (1) penalize complexity during model selection and (2) improve understanding of model behavior. I apply the conditional Akaike Information Criterion (cAIC) to estimate EDF from the finite-difference approximation to the gradient of model predictions with respect to each datum. I confirm that this has similar behavior to widely used Bayesian criteria, and I illustrate ecological applications using three case studies. The first compares model parsimony with or without time-varying parameters when predicting density-dependent survival, where cAIC favors time-varying demographic parameters more than conventional AIC. The second estimates EDF in a phylogenetic structural equation model, and identifies a larger EDF when predicting longevity than mortality rates in fishes. The third compares EDF for a species distribution model (SDM) fitted for twenty bird species and identifies those species requiring more model complexity. These case studies highlight the ecological and statistical insight from comparing EDF among experimental units, models, and data partitions, using an approach that can be broadly adopted for nonlinear ecological models.
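The finite-difference idea described above can be illustrated with a minimal sketch. It is not taken from the archived R/TMB scripts: it assumes a balanced random-intercept model whose residual variance (`sigma2`) and among-group variance (`tau2`) are treated as known, and the function names `fit_predict` and `edf_finite_difference` are hypothetical. Each datum is perturbed in turn, the model is refitted, and the change in that datum's own prediction is summed to give the EDF.

```python
def fit_predict(y, groups, sigma2, tau2):
    """Predict each datum from a random-intercept model with known variances.

    Assumption: groups are balanced, so the grand mean is the estimate of
    the shared mean and each group mean is shrunk towards it.
    """
    mu = sum(y) / len(y)
    shrunk = {}
    for g in set(groups):
        y_g = [y_i for y_i, g_i in zip(y, groups) if g_i == g]
        group_mean = sum(y_g) / len(y_g)
        # Shrinkage weight: 0 = fully pooled, 1 = separate mean per group
        weight = tau2 / (tau2 + sigma2 / len(y_g))
        shrunk[g] = mu + weight * (group_mean - mu)
    return [shrunk[g] for g in groups]


def edf_finite_difference(y, groups, sigma2, tau2, delta=1e-4):
    """Sum the finite-difference gradient of each prediction w.r.t. its datum."""
    base = fit_predict(y, groups, sigma2, tau2)
    edf = 0.0
    for i in range(len(y)):
        y_perturbed = list(y)
        y_perturbed[i] += delta  # perturb one datum, then refit
        perturbed = fit_predict(y_perturbed, groups, sigma2, tau2)
        edf += (perturbed[i] - base[i]) / delta
    return edf


# Synthetic example: 4 groups with 5 observations each
groups = [g for g in range(4) for _ in range(5)]
y = [0.5 * g + 0.1 * ((7 * i) % 5) for i, g in enumerate(groups)]

edf_half = edf_finite_difference(y, groups, sigma2=1.0, tau2=0.2)
print(round(edf_half, 6))
```

With these values the shrinkage weight is 0.5, so the EDF is about 2.5, between 1 (a single shared mean, reached as `tau2` goes to 0) and 4 (one separate mean per group, approached as `tau2` grows large). In this simplified setting the result matches 1 + w(J - 1) for shrinkage weight w and J groups.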
README: Replicate analysis from: Measuring complexity for hierarchical models using effective degrees of freedom
https://doi.org/10.5061/dryad.tmpg4f54z
This Open Science archive contains data and scripts to replicate one simulation and three case-study explorations of a generic approach to estimate effective degrees of freedom for widely used ecological models.
Description of the data and file structure:
This includes four R scripts that can each be run separately. It also includes three directories:
* R for shared R functions;
* TMB for TMB scripts that specify hierarchical models;
* Data for data used in case-study demonstrations.
Sharing/Access information:
* The case study involving fish life-history includes two files:
  * fish_traits.csv is extracted from Mlifehist_ver1.0.csv, which is from the Then et al. (2015) natural mortality database and was copied from https://www.vims.edu/research/departments/fisheries/programs/mort_db/, with blank cells replaced with NA (as requested by Dryad). It has columns logK (log von Bertalanffy growth rate), logLinf (log asymptotic length), logM (log instantaneous natural mortality rate per year), and logtmax (log maximum age in years). See Then et al. (2015) for details.
  * fish_phylogeny.tre is a dated phylogenetic tree including fishes and sharks, reduced to species in fish_traits.csv
* The case study involving bird densities includes three files:
  * NDVI.tif is the Normalized Difference Vegetation Index
  * population_density.csv is human population density for different locations
  * Top20_Samples.csv is sampled bird densities for twenty species from the Breeding Bird Survey (Sauer et al. 1997)
* The case study involving fish stock-recruit information uses the RAM Legacy database (Ricard et al. 2012): RAMLDB v4.495 (assessment data only) was copied from https://zenodo.org/records/4824192, and the tab timeseries_values_views was then filtered to stocklong=="Herring ICES 3a-4-7d" and to columns S (spawning biomass) and R (recruitment)
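The stock-recruit extraction described in the last item can be sketched as follows. This is an illustration only, not the procedure used to build the archive: it assumes the timeseries_values_views tab has been exported to CSV, the `year` column and all numeric values in `SAMPLE` are synthetic placeholders, and `extract_stock` is a hypothetical helper name.

```python
import csv
import io

# Synthetic stand-in for a CSV export of the timeseries_values_views tab.
# The values and the "year" column are placeholders, not real RAM Legacy data.
SAMPLE = """stocklong,year,S,R
Herring ICES 3a-4-7d,2000,100.0,2000.0
Herring ICES 3a-4-7d,2001,,2500.0
Some other stock,2000,50.0,900.0
"""


def extract_stock(handle, stock_name, columns=("S", "R")):
    """Keep rows for one stock and only the requested columns.

    Blank cells are returned as None so that missing values stay explicit.
    """
    rows = []
    for record in csv.DictReader(handle):
        if record["stocklong"] != stock_name:
            continue
        rows.append(
            {c: (float(record[c]) if record[c] != "" else None) for c in columns}
        )
    return rows


herring = extract_stock(io.StringIO(SAMPLE), "Herring ICES 3a-4-7d")
for row in herring:
    print(row)
```

Passing an open file handle instead of `io.StringIO(SAMPLE)` would apply the same filter to a real export.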
Code/Software:
All models are run in R version 4.3.2 (R Core Team 2023), with hierarchical models compiled and fitted using R-package TMB (Kristensen et al. 2016). The fish trait case study also uses R-package phylosem (Thorson and van der Bijl 2023).
Citations:
Kristensen, K., A. Nielsen, C. W. Berg, H. Skaug, and B. M. Bell. 2016. TMB: Automatic differentiation and Laplace approximation. Journal of Statistical Software 70:1–21.
R Core Team. 2023. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Ricard, D., C. Minto, O. P. Jensen, and J. K. Baum. 2012. Examining the knowledge base and status of commercially exploited marine species with the RAM Legacy Stock Assessment Database. Fish and Fisheries 13:380–398.
Sauer, J. R., J. E. Hines, G. Gough, I. Thomas, and B. G. Peterjohn. 1997. The North American breeding bird survey results and analysis. Eastern Ecological Science Center, Laurel, MD.
Then, A. Y., J. M. Hoenig, N. G. Hall, and D. A. Hewitt. 2015. Evaluating the predictive performance of empirical estimators of natural mortality rate using information on over 200 fish species. ICES Journal of Marine Science 72:82–92.
Thorson, J. T., and W. van der Bijl. 2023. phylosem: A fast and simple R package for phylogenetic inference and trait imputation using phylogenetic structural equation models. Journal of Evolutionary Biology 36:1357–1364.
Methods
See README for details.