Reliably predicting pollinator abundance: challenges of calibrating process-based ecological models

Citation

Gardner, Emma et al. (2020), Reliably predicting pollinator abundance: challenges of calibrating process-based ecological models, Dryad, Dataset, https://doi.org/10.5061/dryad.9cnp5hqfw

Abstract

1. Pollination is a key ecosystem service for global agriculture but evidence of pollinator population declines is growing. Reliable spatial modelling of pollinator abundance is essential if we are to identify areas at risk of pollination service deficit and effectively target resources to support pollinator populations. Many models exist which predict pollinator abundance but few have been calibrated against observational data from multiple habitats to ensure their predictions are accurate.

2. We selected the most advanced process-based pollinator abundance model available and calibrated it for bumblebees and solitary bees using survey data collected at 239 sites across Great Britain. We compared three versions of the model: one parameterised using estimates based on expert opinion, one where the parameters are calibrated using a purely data-driven approach and one where we allow the expert opinion estimates to inform the calibration process.

3. All three model versions showed significant agreement with the survey data, demonstrating this model's potential to reliably map pollinator abundance. However, there were significant differences between the nesting/floral attractiveness scores obtained by the two calibration methods and from the original expert opinion scores.

4. Our results highlight a key universal challenge of calibrating spatially-explicit, process-based ecological models. Notably, the desire to reliably represent complex ecological processes in finely mapped landscapes necessarily generates a large number of parameters, which are challenging to calibrate with ecological and geographical data that is often noisy, biased, asynchronous and sometimes inaccurate. Purely data-driven calibration can therefore result in unrealistic parameter values, despite appearing to improve model-data agreement over initial expert opinion estimates. We therefore advocate a combined approach where data-driven calibration and expert opinion are integrated into an iterative Delphi-like process, which simultaneously combines model calibration and credibility assessment. This may provide the best opportunity to obtain realistic parameter estimates and reliable model predictions for ecological systems with expert knowledge gaps and patchy ecological data.

Methods

TransectSurveyData.csv

This file contains counts of bees observed along walked transects conducted between 2011 and 2016 at 239 sites across Great Britain. It is a processed composite dataset derived from multiple studies. The references for these original studies are listed in Table S1 of the supplementary material of Gardner et al. (2020; the publication in 'Methods in Ecology and Evolution' associated with this dataset), and section 2.1 of the main text of that publication details how the original observational data from those studies were processed to obtain this composite dataset. Briefly, where multiple transects were walked on a given site survey, we report the total transect length and total number of bees observed. The bees are separated into four guilds (ground nesting bumblebees, tree nesting bumblebees, ground nesting solitary bees, cavity nesting solitary bees) according to the nesting preference of the species recorded (see Gardner et al. 2020 for details).
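The survey-level aggregation described above can be sketched as follows. This is an illustration only, not the processing code used by Gardner et al. (2020): the input structure, the `length_m` key and the function name `aggregate_survey` are assumptions made for the example. The four guild names match the column headers of TransectSurveyData.csv, and the length-to-duration conversion uses the 50 m per 10 min assumption stated in the description of the TransectDuration_min column.

```python
GUILDS = (
    "GroundNestingBumblebees",
    "TreeNestingBumblebees",
    "GroundNestingSolitaryBees",
    "CavityNestingSolitaryBees",
)


def aggregate_survey(transects):
    """Sum transect lengths and guild counts for one site survey.

    `transects` is a list of dicts (a hypothetical input format), each with
    a 'length_m' key and optional per-guild counts.
    """
    total_length_m = sum(t["length_m"] for t in transects)
    counts = {g: sum(t.get(g, 0) for t in transects) for g in GUILDS}
    # Assumed conversion: a 50 m transect takes 10 min to survey.
    duration_min = total_length_m * 10 / 50
    return {"TransectDuration_min": duration_min, **counts}


# Hypothetical example: two transects walked during the same site survey.
example = [
    {"length_m": 100, "GroundNestingBumblebees": 4, "TreeNestingBumblebees": 1},
    {"length_m": 150, "GroundNestingBumblebees": 2, "GroundNestingSolitaryBees": 3},
]
survey = aggregate_survey(example)
print(survey)
```

Note that the file description says the duration was obtained from the transect length in many, but not all, cases, so this conversion should not be assumed to hold for every row.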

 

Model parameter csvs (attract.csv; av.csv; distances.csv; floralCover.csv; growth.csv; lfn.csv; poll_names.csv)

These files contain the model parameter values used to run the process-based pollinator model used in Gardner et al. (2020). The model itself is publicly available at https://github.com/yclough/ecodeal and is derived from Haussler et al. (2017) and Lonsdorf et al. (2009). See Gardner et al. (2020) for details of how these parameter values are derived and reference sources for their values. The parameter values contained in the files 'attract.csv' and 'floralCover.csv' are the values derived from the expert opinion questionnaire (nmax=10, see Gardner et al. 2020 for details).

Usage Notes

TransectSurveyData.csv

We strongly recommend using the original source datasets (see Gardner et al. 2020 for references), rather than our processed composite dataset, since these are far more detailed, not amalgamated to survey-level totals, generally include higher taxonomic resolution and often include ancillary data on survey weather conditions etc.

The column contents of the file are as follows:

1. Site - name of survey site

2. X - British National Grid x co-ordinate for survey site

3. Y - British National Grid y co-ordinate for survey site

4. Method - survey methodology (transect in all cases)

5. TransectDuration_min - total duration of transect survey in minutes. In many (but not all) cases, this is obtained by converting the reported transect length into a duration, assuming a 50 m transect takes 10 min to survey.

6. Year - the year the survey was carried out

7. Week - the week of the year the survey was carried out, given as an integer from 1 to 52, where the week beginning 1st January is week 1.

8. Day - the day of the year the survey was carried out, given as an integer from 1 to 365, where 1st January is day 1.

9. GroundNestingBumblebees - the total number of ground nesting bumblebees observed during the survey

10. TreeNestingBumblebees - the total number of tree nesting bumblebees observed during the survey

11. GroundNestingSolitaryBees - the total number of ground nesting solitary bees observed during the survey

12. CavityNestingSolitaryBees - the total number of cavity nesting solitary bees observed during the survey

NB: Columns 9-12 can contain non-integer values. This occurs when bees were recorded as 'Bombus unknown' or 'solitary unknown'. In such cases, the unidentified bees were apportioned between the nesting guilds according to the proportion of known species observed from each guild on that survey, as described in Gardner et al. (2020).
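To illustrate why non-integer counts arise, here is a minimal sketch of proportional apportioning. It is not the authors' code. It assumes that 'Bombus unknown' records are split between the two bumblebee guilds (and 'solitary unknown' between the two solitary guilds) in proportion to the number of identified individuals in each guild on that survey. The function name `apportion_unknown` and the handling of surveys with no identified bees are hypothetical; see Gardner et al. (2020) for the actual procedure.

```python
def apportion_unknown(known_counts, n_unknown):
    """Distribute unidentified bees across guilds in proportion to known counts.

    known_counts: dict mapping guild name -> number of identified bees.
    n_unknown: number of bees recorded without a species identification.
    """
    total_known = sum(known_counts.values())
    if total_known == 0:
        # Assumption: with no identified bees there is no proportion to use,
        # so the counts are returned unchanged.
        return dict(known_counts)
    return {
        guild: count + n_unknown * count / total_known
        for guild, count in known_counts.items()
    }


# Hypothetical survey: 6 ground nesting and 2 tree nesting bumblebees
# identified, plus 3 bees recorded as 'Bombus unknown'.
bumblebees = {"GroundNestingBumblebees": 6, "TreeNestingBumblebees": 2}
apportioned = apportion_unknown(bumblebees, 3)
print(apportioned)
```

The total number of bees is preserved: the apportioned counts sum to the identified bees plus the unidentified ones.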

 

lfn.csv

This model input file lists the numeric code assigned to each landcover class (lu).

poll_names.csv

This model input file lists the species code assigned to each bee guild, its life history (solitary or social) and the floral periods (P1, P2, P3) during which its reproductive females (q) and workers (w; if social) are actively foraging.

av.csv

This model input file lists the maximum nest density used for each species code. Units = nests per ha.

distances.csv

This model input file lists the foraging and dispersal (activity = 'nesting') distances used for each species code. Units = m.

growth.csv

This model input file lists the growth parameters used for each species code. See Gardner et al. (2020) for parameter descriptions.

floralCover.csv

This model input file lists the expert opinion floral cover scores used for each landcover code in each floral period (P1, P2, P3). Columns with *_b represent mean scores across all experts. Columns with *_l represent mean value minus standard error. Columns with *_u represent mean value plus standard error. The permitted range for scores is 0.0 - 100.0. 

attract.csv

This model input file lists the expert opinion floral attractiveness scores (Flor_*) and nesting attractiveness scores (Nest_*) used for each species code with each landcover code in each floral period (P1, P2, P3). Columns with *_b represent mean scores across all experts. Columns with *_l represent mean value minus standard error. Columns with *_u represent mean value plus standard error. The permitted range for floral attractiveness scores is 0.0 - 20.0. The permitted range for nesting attractiveness scores is 0.0 - 1.0.
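The relationship between the *_b, *_l and *_u columns can be sketched as below. This is an illustration under stated assumptions, not the calculation used to produce the files. It takes the standard error to be the sample standard deviation divided by the square root of the number of experts, and the function names `summarise_scores` and `within_range` are hypothetical.

```python
import math
import statistics


def summarise_scores(scores):
    """Return the mean (b), mean minus SE (l) and mean plus SE (u) of expert scores."""
    mean = statistics.mean(scores)
    if len(scores) > 1:
        # Assumption: standard error = sample standard deviation / sqrt(n).
        se = statistics.stdev(scores) / math.sqrt(len(scores))
    else:
        se = 0.0
    return {"b": mean, "l": mean - se, "u": mean + se}


def within_range(value, lower, upper):
    """Check a score against a permitted range, e.g. 0.0 - 20.0 for Flor_* scores."""
    return lower <= value <= upper


# Hypothetical floral attractiveness scores from three experts.
summary = summarise_scores([5.0, 10.0, 15.0])
print(summary)
print(within_range(summary["b"], 0.0, 20.0))
```

A check such as `within_range` can be useful when editing the parameter files by hand, since the permitted ranges differ between floral cover (0.0 - 100.0), floral attractiveness (0.0 - 20.0) and nesting attractiveness (0.0 - 1.0).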

Funding

Global Food Security 'Food System Resilience' Programme, Award: BB/R00580X/1
