Data from: Assessing the usefulness of Citizen Science Data for habitat suitability modelling: opportunistic reporting versus sampling based on a systematic protocol

Citation

Henckel, Laura et al. (2021), Data from: Assessing the usefulness of Citizen Science Data for habitat suitability modelling: opportunistic reporting versus sampling based on a systematic protocol, Dryad, Dataset, https://doi.org/10.5061/dryad.8w9ghx3jj

Abstract

Aim: To evaluate the potential of models based on opportunistic reporting (OR) compared to models based on data from a systematic protocol (SP) for modelling species distributions. We compared model performance for eight forest bird species with contrasting spatial distributions, habitat requirements, and rarity. Differences in the reporting of species were also assessed. Finally, we tested potential improvement of models when inferring high quality absences from OR based on questionnaires sent to observers.

Location: Both datasets cover the same large area (Sweden) and time period (2000-2013).

Methods: Species distributions were modelled using logistic regression. The performance of OR models in predicting SP data was assessed using AUC. We quantified the congruence in spatial predictions using Spearman’s rank correlation coefficient. We related these results to species characteristics and the reporting behaviour of observers. We also assessed the gain in predictive performance of OR models from adding inferred absences. Finally, we investigated the potential impact of sampling bias in OR.

Results: For all species, and despite the sampling biases, results from OR agreed well overall with those of SP, both in the nationwide spatial congruence of habitat suitability maps and in the selection and direction of species-environment relationships. The OR models also performed well in predicting the SP data. The predictive performance of the OR models increased with species rarity, and the OR model even outperformed the SP model for the rarest species. No significant impact of observer behaviour was found.

Main Conclusions: Relatively simple analyses with inferred absences could produce reliable spatial predictions of habitat suitability. This was especially true for rare species. OR data should be seen as a complement to SP, as the weakness of one is the strength of the other, and OR may be especially useful at large spatial scales or where no systematic data collection protocols exist.
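The two evaluation measures named in the Methods (AUC of OR-model predictions against SP observations, and Spearman’s rank correlation between the two sets of predictions) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and all numbers are invented for illustration, and AUC is computed here through the rank-sum identity as one possible implementation.

```python
# Toy illustration only: the values below are invented, not taken from the dataset.

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney) identity; labels are 0 or 1."""
    ranks = average_ranks(scores)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def spearman(a, b):
    """Spearman's rho: the Pearson correlation of the average ranks."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    var_a = sum((x - ma) ** 2 for x in ra)
    var_b = sum((y - mb) ** 2 for y in rb)
    return cov / (var_a * var_b) ** 0.5


# Hypothetical SP presences/absences and model predictions at the same six sites.
sp_observed = [1, 0, 1, 0, 0, 1]
or_predicted = [0.9, 0.2, 0.7, 0.4, 0.1, 0.3]
sp_predicted = [0.8, 0.1, 0.9, 0.3, 0.2, 0.4]

auc_or_on_sp = auc(sp_observed, or_predicted)  # how well OR predictions rank SP data
rho = spearman(or_predicted, sp_predicted)     # congruence of the two prediction sets
```

In practice an established statistics library would normally be used for both measures; the plain-Python version only makes the definitions explicit.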

Usage Notes

Grey-headed woodpecker_Opportunistic reporting: Presence observations and inferred absences of grey-headed woodpecker 2000-2013 (downloaded from https://analysisportal and processed according to Appendix S3) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Grey-headed woodpecker_Systematic protocol: Presence and absence observations of grey-headed woodpecker 2000-2013 (from http://www.fageltaxering.lu.se/ and processed according to Appendix S2) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Hazel grouse woodpecker_Opportunistic reporting: Presence observations and inferred absences of hazel grouse 2000-2013 (downloaded from https://analysisportal and processed according to Appendix S3) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Hazel grouse woodpecker_Systematic protocol: Presence and absence observations of hazel grouse 2000-2013 (from http://www.fageltaxering.lu.se/ and processed according to Appendix S2) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Lesser spotted woodpecker_Opportunistic reporting: Presence observations and inferred absences of lesser spotted woodpecker 2000-2013 (downloaded from https://analysisportal and processed according to Appendix S3) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Lesser spotted woodpecker_Systematic protocol: Presence and absence observations of lesser spotted woodpecker 2000-2013 (from http://www.fageltaxering.lu.se/ and processed according to Appendix S2) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Long-tailed tit_Opportunistic reporting: Presence observations and inferred absences of long-tailed tit 2000-2013 (downloaded from https://analysisportal and processed according to Appendix S3) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Long-tailed tit_Systematic protocol: Presence and absence observations of long-tailed tit 2000-2013 (from http://www.fageltaxering.lu.se/ and processed according to Appendix S2) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Red-breasted flycatcher_Opportunistic reporting: Presence observations and inferred absences of red-breasted flycatcher 2000-2013 (downloaded from https://analysisportal and processed according to Appendix S3) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Red-breasted flycatcher_Systematic protocol: Presence and absence observations of red-breasted flycatcher 2000-2013 (from http://www.fageltaxering.lu.se/ and processed according to Appendix S2) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Siberian jay_Opportunistic reporting: Presence observations and inferred absences of Siberian jay 2000-2013 (downloaded from https://analysisportal and processed according to Appendix S3) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Siberian jay_Systematic protocol: Presence and absence observations of Siberian jay 2000-2013 (from http://www.fageltaxering.lu.se/ and processed according to Appendix S2) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Siberian tit_Opportunistic reporting: Presence observations and inferred absences of Siberian tit 2000-2013 (downloaded from https://analysisportal and processed according to Appendix S3) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Siberian tit_Systematic protocol: Presence and absence observations of Siberian tit 2000-2013 (from http://www.fageltaxering.lu.se/ and processed according to Appendix S2) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Three-toed woodpecker_Opportunistic reporting: Presence observations and inferred absences of three-toed woodpecker 2000-2013 (downloaded from https://analysisportal and processed according to Appendix S3) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

Three-toed woodpecker_Systematic protocol: Presence and absence observations of three-toed woodpecker 2000-2013 (from http://www.fageltaxering.lu.se/ and processed according to Appendix S2) and environmental data computed in a 1km square buffer around each observation (see manuscript for details). Spatial reference for x and y coordinates: EPSG 3021.

 

For all species and both protocols (opportunistic reporting and systematic protocol), environmental data include:

-the distance to the nearest city or village

In a 1km square buffer around each observation (see manuscript for details):

-the mean forest age (in years)

-the mean total volume of forest (in m³/ha)

-the mean elevation (in m)

-the percentage of different tree species: beech, oak, birch, deciduous trees, spruce, pine (excluding contorta), contorta pine and coniferous trees

-climate data: spring and winter temperature (in °C), spring and winter precipitation (in mm)

-the percentage of forest
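As a rough illustration of how a buffer mean such as the mean forest age could be derived, the Python sketch below averages raster cell values inside a square centred on an observation. It rests on assumptions that are not confirmed here: that "1km square buffer" means a square with 1 km sides centred on the observation, and that the x and y coordinates are in metres (as they are in EPSG 3021). The function names, cell layout and values are hypothetical; the manuscript remains the reference for the actual procedure.

```python
def square_buffer(x, y, side_m=1000.0):
    """Bounds (xmin, ymin, xmax, ymax) of a square centred on (x, y).

    Assumption: side_m is the side length of the buffer, in metres.
    """
    half = side_m / 2
    return (x - half, y - half, x + half, y + half)


def mean_in_buffer(cells, bounds):
    """Mean value of the cells whose centres fall inside the bounds."""
    xmin, ymin, xmax, ymax = bounds
    inside = [v for cx, cy, v in cells if xmin <= cx <= xmax and ymin <= cy <= ymax]
    return sum(inside) / len(inside) if inside else None


# Hypothetical cell centres (x, y in metres) with a forest age value in years.
cells = [
    (1500100.0, 6600100.0, 40.0),
    (1500400.0, 6599800.0, 60.0),
    (1500900.0, 6600000.0, 120.0),  # 900 m from the centre: outside the square
]

bounds = square_buffer(1500000.0, 6600000.0)
mean_age = mean_in_buffer(cells, bounds)
```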

 

For opportunistic data only:

 

  • An estimate of the sampling effort, corresponding to the number of observations (i.e. one observation per observer and date) or the number of birds (potentially several birds listed per observation) for the location and time period (see manuscript for details)
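The two effort measures described above can be sketched in Python as follows. The record layout (location, observer, date, number of birds) and the function name are hypothetical and chosen only for illustration; the manuscript describes how effort was actually derived.

```python
from collections import defaultdict

# Hypothetical records: (location_id, observer, date, number_of_birds).
records = [
    ("A", "obs1", "2005-04-12", 2),
    ("A", "obs1", "2005-04-12", 1),  # same observer and date: counts as the same observation
    ("A", "obs2", "2005-04-12", 3),
    ("B", "obs1", "2006-05-01", 1),
]


def sampling_effort(rows):
    """Return {location: (number of observations, number of birds)}.

    An observation is counted once per unique (observer, date) pair,
    while the bird count sums every listed bird.
    """
    visits = defaultdict(set)
    birds = defaultdict(int)
    for location, observer, date, n_birds in rows:
        visits[location].add((observer, date))
        birds[location] += n_birds
    return {loc: (len(visits[loc]), birds[loc]) for loc in visits}


effort = sampling_effort(records)
```

With these invented records, location A has two observations covering six birds, which shows how the two measures can diverge for the same location.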

 

For systematic data only:

 

  • The number of times the transect was sampled during the period (“MatchYear”); see manuscript for details

 

Funding

Svenska Forskningsrådet Formas