Positional errors in species distribution modelling are not overcome by the coarser grains of analysis

Citation

Gábor, Lukáš et al. (2022), Positional errors in species distribution modelling are not overcome by the coarser grains of analysis, Dryad, Dataset, https://doi.org/10.5061/dryad.79cnp5hx3

Abstract

The performance of species distribution models is known to be affected by the analysis grain and the positional error of species occurrences. Coarsening of the spatial analysis grain has been suggested to compensate for positional errors. Nevertheless, this way of dealing with positional errors has never been thoroughly tested. With the increasing use of fine-scale environmental data in predictive models developed for conservation and climate change studies, it is increasingly important to test this assumption. Species distribution models using fine-scale environmental data are more likely to be negatively affected by positional error, because inaccurately located species occurrences are more likely to fall in unsuitable environments, which can result in inappropriate conservation actions.

Here, we examine the trade-offs between positional error and analysis grain and provide recommendations for best practice. We generated virtual species using tree canopy height, topographic wetness index (TWI), and altitude derived from LiDAR point clouds at a fine resolution of 5 x 5 m. We simulated the positional error in the range of 5 m to 99 m and evaluated the effects of several spatial grains in the range of 5 m to 500 m. In total, we assessed 49 combinations of positional accuracy and analysis grain. We used three common modelling techniques (MaxEnt, BRT and GLM) and four discrimination metrics to evaluate model performance (Sørensen index, overprediction and underprediction rate, AUC and TSS).

We found that model performance decreased with increasing positional error in species occurrences and coarsening of the analysis grain. Most importantly, we showed that coarsening the analysis grain to compensate for positional error did not improve model performance. Our results reject coarsening of the analysis grain as a solution to address the negative effects of positional error on model performance.

We recommend fitting models with the finest possible analysis grain (i.e., depending on data availability) even when available species occurrences suffer from positional errors. If there are significant positional errors in species occurrence data, users are unlikely to benefit from making additional efforts to obtain higher resolution environmental data unless they also minimize the positional errors of species occurrences.

Methods

We used the virtualspecies package (ver. 1.5) in the statistical software R (R Core Team 2021) to generate virtual species. To begin, we defined the response of virtual species to the environmental gradient at a resolution of 5 x 5 m (i.e. the finest resolution at which environmental variables were available). We used a normal distribution with the following parameters: (i) mean canopy height of 9 m and standard deviation of 4 m; (ii) mean altitude of 846 m and standard deviation of 100 m; and (iii) mean TWI of 8 and standard deviation of 0.4 (TWI is a dimensionless index). These parameters allowed us to simulate virtual species with a narrow niche breadth, as it has been suggested that SDMs of such species are more prone to positional error. We then multiplied the responses to obtain an environmental suitability raster. We applied the probabilistic approach (logistic function with α = −0.05 and β = 0.3) to convert the environmental suitability raster into probabilities of occurrence, which were subsequently used to sample binary presence-absence rasters. We developed both presence-only and presence-absence models (see below), using 99 presence sites and 200 absence sites (i.e. sample prevalence of 0.33), and a uniform random distribution for sampling species presences and absences.
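
The generation steps above can be illustrated for a single raster cell with a minimal Python sketch. It is not the R code used for the dataset. The means, standard deviations, α, β and site counts come from the text. Everything else is an assumption made for illustration: the function names, the scaling of each Gaussian response so that the optimum scores 1, the logistic form 1 / (1 + exp((suitability − β) / α)) that we believe virtualspecies uses for its probabilistic conversion, and the two example cells.

```python
import math
import random

# Response parameters quoted in the text: (mean, standard deviation).
RESPONSES = {
    "canopy_height": (9.0, 4.0),   # metres
    "altitude": (846.0, 100.0),    # metres
    "twi": (8.0, 0.4),             # dimensionless index
}
ALPHA, BETA = -0.05, 0.3  # logistic conversion parameters from the text


def gaussian_response(x, mean, sd):
    """Bell-shaped response, scaled so the optimum scores 1 (assumption)."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2)


def suitability(cell):
    """Multiply the per-variable responses into one value in [0, 1]."""
    result = 1.0
    for name, (mean, sd) in RESPONSES.items():
        result *= gaussian_response(cell[name], mean, sd)
    return result


def occurrence_probability(suit, alpha=ALPHA, beta=BETA):
    """Logistic conversion of suitability to probability (assumed form)."""
    return 1.0 / (1.0 + math.exp((suit - beta) / alpha))


def sample_presence(suit, rng):
    """Bernoulli draw: the cell is occupied with the converted probability."""
    return rng.random() < occurrence_probability(suit)


# Two hypothetical cells: one at the optimum, one a standard deviation away
# from the optimum on every variable.
optimum = {"canopy_height": 9.0, "altitude": 846.0, "twi": 8.0}
marginal = {"canopy_height": 13.0, "altitude": 946.0, "twi": 8.4}

rng = random.Random(42)  # fixed seed so the draws are repeatable
for label, cell in (("optimum", optimum), ("marginal", marginal)):
    s = suitability(cell)
    p = occurrence_probability(s)
    print(label, round(s, 3), round(p, 3), sample_presence(s, rng))

# Sample prevalence implied by 99 presence and 200 absence sites.
sample_prevalence = 99 / (99 + 200)
print(round(sample_prevalence, 2))
```

With a negative α, the probability of occurrence increases with suitability and equals 0.5 where suitability equals β. Because the three responses are multiplied, a cell that is moderately off the optimum on every variable already has low suitability, which is how these parameters produce a narrow niche.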

Funding

Internal Grant Agency of Faculty of Environmental Sciences, Award: 2020B0009