Data from: Integrated species distribution models: combining presence-background data and site-occupancy data with imperfect detection


Koshkina, Vira et al. (2018), Data from: Integrated species distribution models: combining presence-background data and site-occupancy data with imperfect detection, Dryad, Dataset,


Two main sources of data for species distribution models (SDMs) are site-occupancy (SO) data from planned surveys and presence-background (PB) data from opportunistic surveys and other sources. SO surveys give high-quality data about presences and absences of the species in a particular area. However, due to their high cost, they often cover a smaller area than PB data and are usually not representative of the geographic range of a species. In contrast, PB data is plentiful and covers a larger area, but is less reliable due to the lack of information on species absences, and is usually characterised by biased sampling. Here we present a new approach for species distribution modelling that integrates these two data types. We have used an inhomogeneous Poisson point process as the basis for constructing an integrated SDM that fits both PB and SO data simultaneously. It is the first implementation of an Integrated SO–PB Model which uses repeated survey occupancy data and also incorporates detection probability. The Integrated Model's performance was evaluated using simulated data and compared to approaches using PB or SO data alone. It was found to be superior, improving the predictions of species spatial distributions, even when SO data is sparse and collected in a limited area. The Integrated Model was also found effective when environmental covariates were significantly correlated. Our method was demonstrated with real SO and PB data for the Yellow-bellied glider (Petaurus australis) in south-eastern Australia, with the predictive performance of the Integrated Model again found to be superior. PB models are known to produce biased estimates of species occupancy or abundance. The small sample size of SO datasets often results in poor out-of-sample predictions. Integrated models combine data from these two sources, providing superior predictions of species abundance compared to using either data source alone.
Unlike conventional SDMs, which have restrictive scale-dependence in their predictions, our Integrated Model is based on a point process model and has no such scale-dependency. It may be used for predictions of abundance at any spatial scale while still maintaining the underlying relationship between abundance and area.
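
The link between abundance, area and occupancy described above can be illustrated with a minimal Python sketch. It is not the authors' code and does not reproduce their parameterisation. It assumes a log-linear intensity that is constant within a site, and it uses the standard Poisson result that a region with expected count mu is occupied with probability 1 - exp(-mu). All function names, coefficients and covariate values are hypothetical.

```python
import math


def log_linear_intensity(beta, covariates):
    """Intensity (expected individuals per unit area), assumed log-linear in the covariates."""
    return math.exp(sum(b * x for b, x in zip(beta, covariates)))


def expected_abundance(intensity, area):
    """Expected count in a site, assuming the intensity is constant over the site."""
    return intensity * area


def occupancy_probability(intensity, area):
    """Probability that a Poisson count with mean intensity * area is at least one."""
    return 1.0 - math.exp(-intensity * area)


def prob_detected_at_least_once(intensity, area, p_detect, n_surveys):
    """Probability of one or more detections over repeat surveys.

    Assumes each survey detects the species independently with
    probability p_detect when the site is occupied.
    """
    psi = occupancy_probability(intensity, area)
    return psi * (1.0 - (1.0 - p_detect) ** n_surveys)


# Hypothetical coefficients (intercept, slope) and covariates (constant, habitat score).
lam = log_linear_intensity([-1.0, 0.5], [1.0, 2.0])
for area in (0.5, 1.0, 2.0):
    print(area, expected_abundance(lam, area), occupancy_probability(lam, area))
```

Because the intensity is defined per unit area, the same value gives predictions for a site of any size: expected abundance scales linearly with area, while occupancy probability rises towards one as the site grows.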

Usage notes