
Data from: Improving inferences and predictions of species environmental responses with occupancy data

Cite this dataset

Morán-López, Teresa; Ruiz-Suarez, Sofia; Aldabe, Joaquín; Morales, Juan Manuel (2022). Data from: Improving inferences and predictions of species environmental responses with occupancy data [Dataset]. Dryad.


Occupancy models are a useful tool for estimating species distributions across the landscape. Among them, the model of MacKenzie et al. (2002, MC) is frequently used to infer species environmental responses. However, its assumption that detection probability is homogeneous or fully explained by covariates may limit its performance, because species should be more easily observed at sites with a higher number of individuals. We simulated data following the Royle and Nichols (2003) occupancy model (RN), which accounts for abundance-driven heterogeneous detection, and two variants with overdispersion in detection probability and local abundances. Then, we compared the performance of the MC model against that of RN.

In addition to model misspecifications, insufficient information in data (i.e. infrequent detections) can limit our ability to detect existing effects with affordable sampling designs. To deal with this source of error, we extended the RN approach to a community-level joint species model (RN-JSM), where species responses and detectability depended on their traits and phylogeny. Then, we tested RN-JSM performance on simulated and out-of-sample field data.

High abundance-driven heterogeneity in detection (i.e. common and secretive species) limited the ability of the MC model to quantify covariate effects, especially when the number of visits was low. Both models (MC and RN) often failed to detect existing effects when data were overdispersed. Moreover, the RN model consistently lacked sufficient power when analyzing data from uncommon species (even when simulations and model specifications perfectly matched). This problem was solved by our RN-JSM, which yielded more precise and accurate estimates of species environmental responses. Increased accuracy in rare species held when the RN-JSM was tested with real and out-of-sample datasets.

In light of our results, we propose: (i) for common and secretive species, analyze occupancy data with the RN model and prioritize revisiting sites; (ii) for species that may have overdispersed detectability or local abundances (e.g. with correlated behaviors or occurring in clusters), apply RN extensions that account for this extra variation (e.g. Poisson-beta or zero-inflated models); and (iii) for uncommon species (mean abundances < 1), whenever possible, gather data at the community level and apply joint-species modeling techniques.


The dataset is divided into three sections:

1) Literature search of field studies using occupancy models to estimate species responses to covariate effects. We performed our search using the Web of Knowledge database. For a full description, please read Appendix A of the paper.

2) MC_RN_single_spp_Simulations.R -- Scripts for simulating data following a Royle-Nichols (2003) approach (RN), in which site-specific species detectability depends on local abundances, and two variants with overdispersed local abundances and detectability of individuals. The scripts also fit the models (and calculate errors) for Royle and Nichols (2003), MacKenzie et al. (2002) and a MacKenzie variant in which species detectability is modeled as a function of local covariates.

RN_single_JSM_Simulations.R -- Scripts for simulating data from a multiple-species RN model and fitting the RN and JSM-RN models.

3) Illustration of single-species Royle-Nichols (2003) and JSM-RN model performance with a dataset from bird communities in Uruguayan grasslands. A full description of data collection can be found in section 3.2 of the Materials and Methods.

In the case of RN, we also performed simulations of a community following a joint-species distribution approach.
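The abundance-driven detection behind the RN simulations described in item 2 can be illustrated with a minimal sketch. It is written in Python with the standard library only and is not taken from the R scripts in this dataset. Under the Royle-Nichols formulation, a site holding N individuals, each detected with probability r, yields at least one detection per visit with probability 1 - (1 - r)^N. The function names, the single covariate x, the log link for expected abundance and the parameters beta0 and beta1 are all assumptions made for illustration.

```python
import math
import random


def rn_detection_prob(r, n):
    """Per-visit probability of detecting the species at a site with n
    individuals, each detected independently with probability r."""
    return 1.0 - (1.0 - r) ** n


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method
    (adequate for the small means used here)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_rn(n_sites, n_visits, beta0, beta1, r, seed=1):
    """Simulate detection histories under a basic RN model.

    Assumed (hypothetical) structure: one standardized covariate x per
    site and log(lambda) = beta0 + beta1 * x for expected abundance.
    """
    rng = random.Random(seed)
    sites = []
    for _ in range(n_sites):
        x = rng.gauss(0.0, 1.0)                 # site covariate
        lam = math.exp(beta0 + beta1 * x)       # expected local abundance
        n = poisson_draw(lam, rng)              # latent local abundance
        p = rn_detection_prob(r, n)             # abundance-driven detection
        y = [1 if rng.random() < p else 0 for _ in range(n_visits)]
        sites.append({"x": x, "N": n, "p": p, "y": y})
    return sites


if __name__ == "__main__":
    data = simulate_rn(n_sites=100, n_visits=3, beta0=0.0, beta1=1.0, r=0.3)
    detected = sum(1 for s in data if any(s["y"]))
    occupied = sum(1 for s in data if s["N"] > 0)
    print(f"occupied sites: {occupied}, sites with a detection: {detected}")
```

Sites with N = 0 have detection probability zero and therefore never produce a detection, while detection probability rises with local abundance. This is the heterogeneity that a constant-detection model does not capture.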

Usage notes

READ_ME.txt contains an overview of all files in this dataset.

Description_Simulations.txt contains detailed information about the scripts comparing RN vs MC model performance, as well as RN vs RN-JSM model performance, in simulated datasets.

Uruguay_grassland_data.txt contains detailed information about files related to field data analyses.


Ministry of Science, Technology and Productive Innovation, Award: PICT-2018-01566

United States Department of Agriculture

Southern Cone Grassland Alliance Program

Aves Uruguay

BirdLife International

Ministry of Science, Technology and Productive Innovation, Award: PICT-2015-0815