Data from: Improving inferences and predictions of species environmental responses with occupancy data
Data files
Apr 22, 2022 version files 70.22 KB
- Description_Simulations.txt (34.36 KB)
- field_sampling_designs_literature_search.xlsx (34.36 KB)
- README.txt (590 B)
- Uruguay_grassland_data.txt (899 B)
Abstract
Occupancy models are a useful tool for estimating species distributions across a landscape. Among them, the model of MacKenzie et al. (2002; MC) is frequently used to infer species environmental responses. However, its assumption that detection probability is homogeneous or fully explained by covariates may limit its performance. Species should be more easily observed at sites with a higher number of individuals. We simulated data following the Royle and Nichols (2003) occupancy model (RN), which accounts for abundance-driven heterogeneous detection, and two variants with overdispersion in detection probability and local abundances. We then compared the performance of the MC model against that of RN.
In addition to model misspecification, insufficient information in the data (i.e. infrequent detections) can limit our ability to detect existing effects with affordable sampling designs. To deal with this source of error, we extended the RN approach to a community-level joint species model (RN-JSM), in which species responses and detectability depend on their traits and phylogeny. We then tested RN-JSM performance on simulated and out-of-sample field data.
High abundance-driven heterogeneity in detection (i.e. common and secretive species) limited the ability of the MC model to quantify covariate effects, especially when the number of visits was low. Both models (MC and RN) often failed to detect existing effects when data were overdispersed. Moreover, the RN model consistently lacked sufficient power when analyzing data from uncommon species (even when simulations and model specifications matched perfectly). This problem was solved by our RN-JSM, which yielded more precise and accurate estimates of species environmental responses. The increased accuracy for rare species held when the RN-JSM was tested with real and out-of-sample datasets.
In light of our results, we propose: (i) for common and secretive species, analyze occupancy data with the RN model and prioritize revisiting sites; (ii) for species that may have overdispersed detectability or local abundances (e.g. with correlated behaviors or occurring in clusters), apply RN extensions that account for this extra variation (e.g. Poisson-beta or zero-inflated models); and (iii) for uncommon species (mean abundances < 1), whenever possible, gather data at the community level and apply joint-species modeling techniques.
The dataset is divided into three sections:
1) Literature search of field studies that use occupancy models to estimate species responses to covariate effects. We performed our search using the Web of Knowledge database. For a full description, please read Appendix A of the paper.
2) MC_RN_single_spp_Simulations.R -- Scripts for simulating data following a Royle-Nichols (2003) approach (RN), in which site-specific species detectability depends on local abundance, and two variants with overdispersed local abundances and overdispersed detectability of individuals. The scripts also fit the models (and calculate errors) for Royle and Nichols (2003), MacKenzie et al. (2002), and a MacKenzie variant in which species detectability is modeled as a function of local covariates.
RN_single_JSM_Simulations.R -- Scripts for simulating a multiple-species RN model and fitting the RN and RN-JSM models.
3) Illustration of single-species Royle-Nichols (2003) and RN-JSM model performance with a dataset from bird communities in Uruguayan grasslands. A full description of data collection can be found in section 3.2 of the Materials and Methods.
In the case of RN, we also performed simulations of a community following a joint-species distribution approach.
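As a rough illustration of the single-species RN simulations described in section 2, the sketch below generates detection histories in which site-level detection probability rises with latent local abundance, p = 1 - (1 - r)^N, with N drawn from a Poisson distribution whose mean depends on one site covariate. It is not taken from the deposited R scripts: the function names (`rpois`, `simulate_rn`), the parameter names (`beta0`, `beta1`, `r`) and all numeric values are assumptions chosen for illustration, and Python is used only to keep the example dependency-free.

```python
import math
import random


def rpois(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_rn(n_sites, n_visits, beta0, beta1, r, seed=1):
    """Simulate detection histories under a basic Royle-Nichols process.

    beta0, beta1: hypothetical intercept and slope of log expected abundance.
    r: hypothetical per-individual detection probability.
    """
    rng = random.Random(seed)
    sites = []
    for _ in range(n_sites):
        x = rng.gauss(0.0, 1.0)             # standardized site covariate
        lam = math.exp(beta0 + beta1 * x)   # expected local abundance (log link)
        n = rpois(lam, rng)                 # latent local abundance
        p = 1.0 - (1.0 - r) ** n            # site-level detection probability
        y = [1 if rng.random() < p else 0 for _ in range(n_visits)]
        sites.append({"x": x, "N": n, "p": p, "y": y})
    return sites


# Illustrative values only; they do not reproduce the paper's scenarios.
sites = simulate_rn(n_sites=200, n_visits=3, beta0=0.0, beta1=1.0, r=0.2)
naive_occupancy = sum(any(s["y"]) for s in sites) / len(sites)
true_occupancy = sum(s["N"] > 0 for s in sites) / len(sites)
print(f"naive occupancy: {naive_occupancy:.2f}, true occupancy: {true_occupancy:.2f}")
```

Because detections can only occur at sites with at least one individual, the naive occupancy (share of sites with any detection) can never exceed the true occupancy, and the gap widens when `r` or the number of visits is low.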
README.txt contains an overview of all files in this dataset.
Description_Simulations.txt contains detailed information about the scripts that compare RN vs MC model performance, as well as RN vs RN-JSM model performance, in simulated datasets.
Uruguay_grassland_data.txt contains detailed information about files related to field data analyses.
