Using machine learning to model nontraditional spatial dependence in occupancy data

Citation

Mohankumar, Narmadha; Hefley, Trevor (2021), Using machine learning to model nontraditional spatial dependence in occupancy data, Dryad, Dataset, https://doi.org/10.5061/dryad.4xgxd259g

Abstract

Spatial models for occupancy data are used to estimate and map the true presence of a species, which may depend on biotic and abiotic factors as well as spatial autocorrelation. Traditionally, researchers have accounted for spatial autocorrelation in occupancy data by using a correlated, normally distributed site-level random effect, which may be incapable of modeling nontraditional spatial dependence such as discontinuities and abrupt transitions. Machine learning approaches have the potential to model nontraditional spatial dependence, but these approaches do not account for observer errors such as false absences. By combining the flexibility of Bayesian hierarchical modeling and machine learning approaches, we present a general framework to model occupancy data that accounts for both traditional and nontraditional spatial dependence as well as false absences. We demonstrate our framework using six synthetic occupancy data sets and two real data sets. Our results demonstrate how to model both traditional and nontraditional spatial dependence in occupancy data, which enables a broader class of spatial occupancy models that can be used to improve predictive accuracy and model adequacy.

Methods

The file 'Serengeti.csv' contains the Thomson’s gazelle data used in our study. The original data file was obtained from Hepler et al. (2018), who reported the presence and absence of Thomson’s gazelle at 195 sites within Serengeti National Park, Tanzania. The sites were sampled using a network of 179 motion-sensitive and thermally activated cameras.

The file 'Sugarglider.csv' contains the sugar glider data used in our study. The original data file was obtained from Stojanovic (2019), who reported the presence and absence of sugar gliders. The data were collected during four or five site visits made to each of 100 sites in the Southern Forest region of Tasmania.

The zip file 'Serengeti.zip' contains the associated shapefile for the sampling grid in Serengeti National Park, Tanzania, where the Thomson’s gazelle data were collected.

The zip file 'Sugarglider.zip' contains the associated shapefile for the Southern Forest region of Tasmania, where the sugar glider data were collected.
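
For readers unfamiliar with how repeated site visits help separate true absence from false absence, the following is a minimal sketch of the log-likelihood for a basic site-occupancy model with constant occupancy probability (psi) and constant detection probability (p). It is an illustration only and is not the spatial framework described in the abstract. The function names and the input layout (one list of 1, 0, or None per site, with None marking a visit that was not made) are assumptions and do not reflect the column layout of the CSV files.

```python
import math


def occupancy_log_likelihood(histories, psi, p):
    """Log-likelihood of a constant-psi, constant-p site-occupancy model.

    histories: one list per site holding 1 (detected), 0 (not detected),
               or None (visit not made). This layout is hypothetical.
    psi:       probability that a site is occupied (0 < psi < 1).
    p:         probability of detection on a visit to an occupied site (0 < p < 1).
    """
    total = 0.0
    for visits in histories:
        made = [v for v in visits if v is not None]  # drop visits that were not made
        detections = sum(made)
        misses = len(made) - detections
        # Probability of this history if the site is occupied.
        occupied = psi * (p ** detections) * ((1 - p) ** misses)
        if detections == 0:
            # An all-zero history arises either from an unoccupied site or
            # from an occupied site that was missed on every visit.
            total += math.log(occupied + (1 - psi))
        else:
            total += math.log(occupied)
    return total


def naive_occupancy(histories):
    """Share of sites with at least one detection (ignores false absences)."""
    detected = sum(1 for visits in histories if any(v == 1 for v in visits))
    return detected / len(histories)
```

The naive proportion treats every all-zero history as a true absence, whereas the likelihood also allows for an occupied site that went undetected on every visit. Sites with four visits and sites with five visits are handled in the same way because visits marked None are skipped.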

Funding

National Science Foundation, Award: DEB 1754491