Predictive multi-scale occupancy models at range-wide extents: effects of habitat and human disturbance on distributions of wetland birds

Published Oct 01, 2020 on Dryad. https://doi.org/10.5061/dryad.2z34tmpgk

Abstract

Aim: Predicting distributions is fundamental to ecology, yet hindered by spatially-restricted sampling, scale-dependent relationships, and detection error associated with field surveys. Predictive species distribution models (SDMs) are nonetheless vital for conservation of many species. We developed a framework for building predictive SDMs with multi-scale data, and used it to develop range-wide breeding-season SDMs for 14 marsh bird species of concern.

Location: USA.

Methods: We built SDMs using data from range-wide surveys conducted over 14 years, and habitat and disturbance covariates measured at multiple spatial scales. We built hierarchical occupancy models that included heterogeneity in detectability during sampling, and used Bayesian model selection to regulate model complexity (covariates and scales) based explicitly on spatial predictive abilities. We thus integrated model selection for optimizing out-of-sample prediction, range-wide sampling over broad conditions, multi-scale analyses and scale-optimization, and species-specific detectability for a suite of wide-ranging species.

Results: Distributions of marsh birds were affected by local wetland conditions, but also by agricultural, urban, and hydrologic disturbances operating from local scales (100 – 500 m) to the watershed level. Variables measuring human disturbances improved prediction for most species, and every species was affected by attributes at > 1 scale. Five species showed evidence for continental-scale range contraction during the study.

Main conclusions: We demonstrate how hierarchical occupancy models can be optimized for prediction across a species’ range at the extent of a continent while also accounting for imperfect detection, and thus describe a generalizable approach that can be used for any species. We provide the first data-driven, empirical SDMs built at the range-wide extent for most of our 14 study species and demonstrate that previous studies focused on local distributions and the effects of fine-scale wetland vegetation missed important broad-scale drivers of occupancy for marsh birds.

###################################################################################
#Metadata file for data used to fit hierarchical multi-scale occupancy models for
#Stevens, B.S., & C.J. Conway. 2019. Predictive multi-scale occupancy models at
#range-wide extents: effects of habitat and human disturbance on distributions of
#wetland birds.
###################################################################################

###################################################################################
#Files are R objects that are lists containing all of the data used to fit models
#and obtain final posterior distributions for inferences as described in the
#manuscript text. Data for each individual species are indicated by a four letter
#species code as follows:
#
#pbgr = pied-billed grebe
#ambi = American bittern
#lebi = least bittern
#amco = American coot
#coga = common gallinule
#puga = purple gallinule
#limp = limpkin
#kira = king rail
#clra = clapper rail
#rwra = Ridgway's rail
#sora = sora
#vira = Virginia rail
#blra = black rail
#yera = yellow rail
#
#In addition, the spatial extent of covariate observations (i.e., measured using
#moving window analyses around each site as described in text) is indicated by
#the three digit number in the object name: 100, 224, and 500 for extents of
#100 m, 224 m, and 500 m, respectively. Note also that within a species all of the
#survey-level data (i.e., detection and non-detection records and detection
#covariates) are the same among these files, as the only thing that changes is
#the scale of the covariates hypothesized to affect occupancy probability.
#
#All of the R object files containing the data are structured identically among species,
#and all contain the following R objects stored inside the list:
#
#n = vector of the number of individual surveys conducted for each year of sampling.
#
#T = integer representing the total number of years for which field data were collected.
#
#detection = matrix of 1's (detection) and 0's (non-detection) indicating the result of
#each individual marsh bird survey (row) for each year of sampling (column). Thus
#data within a year go down a column, and the other survey-level variables described below
#are structured identically and contain data corresponding to these detection-non-detection
#records (i.e., the same position in each matrix contains data for the survey in the
#same position of the detection matrix). For this and all of the other survey-level variables
#described below, the NA values are simply used as fillers to obtain a matrix of data, and
#were necessary because the sampling was unbalanced with different numbers of surveys (and sites)
#each year, as described in text.
#
#sites = matrix of site number identifiers for the locations where each individual survey (row) was
#conducted in each year (column).
#
#visit = matrix of visit numbers corresponding to each individual survey (row) by year (column).
#Thus, 1 indicates the first visit at that site, 2 indicates the second visit at that site during the
#same breeding season, etc.
#
#s.time = matrix of the time at the start of sampling (standardized) for each individual site visit (row)
#over each year (column).
#
#s.time.q = s.time variable squared (for quadratic effects).
#
#j.date = matrix of julian date (standardized) of each survey for each individual site visit (row)
#over each year (column).
#
#j.date.sq = j.date variable squared (for quadratic effect).
#
#bcl = matrix of the length of the call broadcast sequence (standardized) used during each survey (row)
#over each year (column).
#
#call = matrix of 1's and 0's indicating if the call of the species in question was included in the
#specific call broadcast sequence for each survey (rows) over each year (column).
#
#occ.cov = matrix of covariate values (each covariate is a unique column) at each site (row) in
#ascending order of site numbers. Note that the matrix is structured the same for covariate data measured
#at each spatial extent; only the spatial extent at which covariate measurements were made differs.
#
#Finally, note that each list itself has a name of the form:
#
#ambi100.model.data
#
#for example, considering data for the American bittern with covariates observed over a 100-m spatial extent.
###################################################################################
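
#The naming pattern described above can be sketched as follows. This is only an
#illustration: list.name is a hypothetical helper (not part of the data files), and
#whether every species-by-extent combination is present in the archive is not stated
#in this file.

```r
#Four letter species codes and spatial extents (m) listed in this metadata file.
species <- c("pbgr", "ambi", "lebi", "amco", "coga", "puga", "limp",
             "kira", "clra", "rwra", "sora", "vira", "blra", "yera")
extents <- c(100, 224, 500)

#Hypothetical helper: builds a list name of the form ambi100.model.data.
list.name <- function(sp, extent) {
  paste0(sp, extent, ".model.data")
}

#All species-by-extent combinations (14 species x 3 extents).
combos <- expand.grid(sp = species, extent = extents, stringsAsFactors = FALSE)
all.names <- list.name(combos$sp, combos$extent)
head(all.names, 3)
```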

#Example R script for loading and showing the attributes of the list objects provided:

load("ambi.100.alldata.RData", envir = .GlobalEnv)  #loads the list into the workspace
str(ambi100.model.data)  #shows the structure of each object stored in the list
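
#A minimal sketch of how the NA-padded survey matrices described above could be reshaped
#into one row per survey. The list "toy" and its values are made up for illustration and
#only mimic the documented layout (a subset of the objects is shown). The sketch also
#assumes that site numbers run consecutively from 1, so that row i of occ.cov is site i.

```r
#Toy list with the documented layout (made-up values; 2 years, 3 sites).
toy <- list(
  n = c(4, 3),
  T = 2,
  detection = cbind(c(0, 1, 0, 1), c(1, 0, 0, NA)),
  sites     = cbind(c(1, 1, 2, 3), c(1, 2, 2, NA)),
  visit     = cbind(c(1, 2, 1, 1), c(1, 1, 2, NA)),
  occ.cov   = cbind(cov1 = c(-0.5, 0.2, 1.1), cov2 = c(0.3, -1.0, 0.4))
)

#Drop the NA fillers and keep one row per survey; year is the column index.
keep <- !is.na(toy$detection)
surveys <- data.frame(
  year      = col(toy$detection)[keep],
  site      = toy$sites[keep],
  visit     = toy$visit[keep],
  detection = toy$detection[keep]
)

#The number of non-NA records in each year should match n.
stopifnot(all(as.vector(table(surveys$year)) == toy$n))

#Attach site-level covariates (assumes row i of occ.cov corresponds to site i).
surveys <- cbind(surveys, toy$occ.cov[surveys$site, , drop = FALSE])

#Naive summary, not corrected for detection error: any detection at a site in a year.
naive <- aggregate(detection ~ site + year, data = surveys, FUN = max)
print(naive)
```

#The naive summary ignores imperfect detection and is shown only to illustrate how the
#matrices line up; it is not the hierarchical occupancy model described in the manuscript.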
