Data from: Integrated species distribution models fitted in INLA are sensitive to mesh parameterisation
Data files
Apr 04, 2023 version files: 27.50 KB
- data_serotine1.RData (24.22 KB)
- README.md (3.28 KB)
Abstract
The ever-growing popularity of citizen science, as well as recent technological and digital developments, has allowed the collection of data on species’ distributions at an extraordinary rate. To take advantage of these data, information of varying quantity and quality needs to be integrated. Point process models have been proposed as an elegant way to achieve this for estimates of species distributions. These models can be fitted efficiently using Bayesian methods based on integrated nested Laplace approximations (INLA) with stochastic partial differential equations (SPDE). This approach models spatial autocorrelation efficiently using a Gaussian random field and a triangular mesh over the spatial domain. The mesh is constructed from user-defined parameters, so it effectively represents a free parameter in the model. However, there is a lack of understanding about how to set these mesh parameters and about their effect on model performance. Here, we assess how mesh parameters affect predictions and model fit when estimating the distribution of the serotine bat, Eptesicus serotinus, in Great Britain. A Bayesian INLA model was fitted using five meshes of varying densities to a dataset comprising both structured observations from a national monitoring programme and opportunistic records. We demonstrate that mesh density affected spatial predictions, with a general loss of accuracy as mesh coarseness increased. However, we also show that even the finest mesh was unable to overcome spatial biases in the data. In addition, the magnitude of the covariate effects differed markedly between meshes. This confirms that mesh parameterisation is an important and delicate process with implications for model inference. We discuss how species distribution modellers might adapt their use of INLA in light of these findings.
Methods
We used two sources of Eptesicus serotinus data for our study. The first was the Field Survey, which is part of the National Bat Monitoring Programme (NBMP) run by the UK’s Bat Conservation Trust (BCT). The Field Survey is a structured mobile acoustic survey in which trained volunteers walk an approximately 3-kilometre-long transect within a randomly allocated 1-kilometre grid square. Counts of the number of bat passes are made at 12 points along the transect. For our study, we reduced the data to presences and absences (PA) per site (n = 666) because, given the species’ foraging behaviour, the counts are likely to reflect bat activity (a combination of species abundance and time spent in the area) rather than true abundance. There is also a risk of recording the same bat multiple times, which would add uncertainty to the analysis. The second dataset was from the National Biodiversity Network (NBN) Atlas, which combines presence-only (PO) data from multiple sources. We excluded records that had not been verified by experts, that came from the NBMP Field Survey, or that had a coordinate uncertainty of more than 1 kilometre (remaining data n = 1374). For both datasets, we used data from 2005-2015, which maximised the number of data points while allowing us to assume that the species’ range was stable over this period.
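The two data-preparation steps described above can be sketched in Python as follows. This is a minimal illustration only, not the authors' code: the record fields (`verified`, `source`, `coord_uncertainty_m`, `year`), the source label "NBMP Field Survey", and both function names are hypothetical, and the year range is assumed to be inclusive.

```python
from dataclasses import dataclass


@dataclass
class PoRecord:
    """One presence-only record; all field names are hypothetical."""
    verified: bool
    source: str
    coord_uncertainty_m: float
    year: int


def counts_to_presence_absence(site_counts):
    """Collapse the per-point pass counts of each site to a single 0/1 value."""
    return {
        site: int(any(count > 0 for count in counts))
        for site, counts in site_counts.items()
    }


def keep_po_record(rec, first_year=2005, last_year=2015, max_uncertainty_m=1000.0):
    """Apply the exclusion rules from the text to one presence-only record."""
    return (
        rec.verified                                   # expert-verified only
        and rec.source != "NBMP Field Survey"          # avoid duplicating the PA data
        and rec.coord_uncertainty_m <= max_uncertainty_m  # at most 1 km uncertainty
        and first_year <= rec.year <= last_year        # assumed inclusive range
    )
```

A site with at least one non-zero count among its 12 points becomes a presence; a site with only zeros becomes an absence.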
We chose the following environmental covariates for analysis of the impact of mesh dimensions: mean annual temperature (°C) averaged across our study period, and percentage cover of arable land, broadleaf woodland, and improved grassland. These covariates were chosen because E. serotinus roosting sites are known to be associated with arable land, improved grassland, and broadleaf woodland. The species’ foraging sites are generally determined by the habitat available around the roosting site, and it is able to exploit a large variety of habitats for foraging. All covariate values were scaled and centred (mean = 0, SD = 1). Empirical variograms were calculated for all four covariates to explore the spatial autocorrelation in each.
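As a rough illustration of the last two steps, the sketch below standardises a covariate and computes a binned empirical semivariogram with the classical (Matheron) estimator. It is an assumption-laden example rather than the authors' procedure: the text does not say which SD or which variogram estimator was used, so the sample SD and the function names `standardise` and `empirical_variogram` are choices made here for illustration.

```python
import math
from statistics import mean, stdev


def standardise(values):
    """Centre to mean 0 and scale to SD 1 (sample SD assumed)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def empirical_variogram(coords, values, bin_width, n_bins):
    """Classical semivariance per distance bin.

    Bin k covers distances in [k * bin_width, (k + 1) * bin_width).
    Returns None for bins that contain no point pairs.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            d = math.dist(coords[i], coords[j])
            k = int(d // bin_width)
            if k < n_bins:
                sums[k] += (values[i] - values[j]) ** 2
                counts[k] += 1
    # Semivariance is half the mean squared difference within each bin.
    return [s / (2 * c) if c else None for s, c in zip(sums, counts)]
```

Semivariance that rises with distance before levelling off indicates spatial autocorrelation in the covariate, which is what the variograms were used to explore.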