Dryad

Data from: Reliable species distributions are obtainable with sparse, patchy and biased data by leveraging over species and data types

Cite this dataset

Peel, Samantha L. et al. (2019). Data from: Reliable species distributions are obtainable with sparse, patchy and biased data by leveraging over species and data types [Dataset]. Dryad. https://doi.org/10.5061/dryad.2226v8m

Abstract

1. New methods for species distribution models (SDMs) utilise presence‐absence (PA) data to correct the sampling bias of presence‐only (PO) data in a spatial point process setting. These have been shown to improve species estimates when both data sets are large and dense. However, is a PA data set that is smaller and patchier than hitherto examined able to do the same? Furthermore, when both data sets are relatively small, is there enough information contained within them to produce a useful estimate of species’ distributions? These attributes are common in many applications.

2. A stochastic simulation was conducted to assess the ability of a pooled data SDM to estimate the distribution of species from increasingly sparse and patchy data sets. The simulated data sets were varied by changing the number of PA sample locations, the degree of patchiness of these locations, the number of PO observations, and the level of sampling bias within the PO observations. The performance of the pooled data SDM was compared to a PA SDM and a PO SDM to assess the strengths and limitations of each SDM.

3. The pooled data SDM successfully removed the sampling bias from the PO observations even when the PA data were sparse and patchy, and the PO observations formed the majority of the data. The pooled data SDM was, in general, more accurate and more precise than either the PA SDM or the PO SDM. All SDMs were more precise for the species responses than they were for the covariate coefficients.

4. The emerging SDM methodology that pools PO and PA data will facilitate more certainty around species’ distribution estimates, which in turn will allow more relevant and concise management and policy decisions to be enacted. This work shows that it is possible to achieve this result even in relatively data‐poor regions.
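The simulation design summarised in the abstract can be illustrated with a small sketch. This is not the authors' code. It is a minimal illustration that assumes a gridded unit-area landscape, a log-linear species intensity, PO records obtained by thinning true individuals with a bias that depends on a second covariate, and PA sites drawn from a few square patches. All names and parameters (`simulate`, `alpha`, `beta`, `bias_strength`, `n_pa_sites`, `n_patches`, `radius`) are hypothetical.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def simulate(n_side=20, alpha=-1.0, beta=1.5, bias_strength=2.0,
             n_pa_sites=30, n_patches=3, radius=3, seed=1):
    """Return (abundance, po, pa) dictionaries keyed by grid cell."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(n_side) for j in range(n_side)]

    # Assumed covariates: x drives the species, z drives observer effort.
    x = {c: c[0] / (n_side - 1) for c in cells}
    z = {c: c[1] / (n_side - 1) for c in cells}

    # True abundance per cell from a log-linear intensity (unit cell area).
    abundance = {c: poisson(rng, math.exp(alpha + beta * x[c])) for c in cells}

    # PO data: each individual is reported with probability exp(-b * (1 - z)),
    # so bias_strength = 0 means every individual is reported (no bias).
    report = {c: math.exp(-bias_strength * (1.0 - z[c])) for c in cells}
    po = {c: sum(rng.random() < report[c] for _ in range(abundance[c]))
          for c in cells}

    # PA data: survey sites are restricted to square patches around a few
    # random centres; fewer patches means a patchier survey design.
    centres = rng.sample(cells, n_patches)
    candidates = [c for c in cells
                  if any(max(abs(c[0] - p[0]), abs(c[1] - p[1])) <= radius
                         for p in centres)]
    sites = rng.sample(candidates, min(n_pa_sites, len(candidates)))
    pa = {c: int(abundance[c] > 0) for c in sites}

    return abundance, po, pa
```

Lowering `n_pa_sites` or `n_patches` gives sparser or patchier PA data, and raising `bias_strength` concentrates the PO records where `z` is high, which mirrors the factors the abstract says were varied.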

Usage notes