The distribution of wild bee species along a Latitudinal gradient in northern Europe depends on their flower preferences
Data files
May 05, 2025 version files 21.10 GB
-
BeePresenceAbsence.xlsx
12.43 MB
-
CovariateRasters.tif
16.41 GB
-
ELC10_Agg_10K.tif
30.67 KB
-
ELC10_Agg_10K.tif.aux.xml
126 B
-
ElevationCropped.tif
4.68 GB
-
PCA1map.tif
36.27 KB
-
PCA2map.tif
38.07 KB
-
Rasmussen2021_plants_visited.xlsx
57.90 KB
-
README.md
12.96 KB
Abstract
Aim
The functional diversity of bees contributes to the maintenance of plant biodiversity because different species of wild bees prefer and pollinate different plants. Many bees, in particular species with narrow flower preferences or specialized habitat requirements are threatened by landscape homogenization and climate change. Nonetheless, we still lack an understanding of large-scale impacts of anthropogenic stressors on the distribution of wild bee species with different flower preferences.
Location
Northern Europe: Norway, Denmark and Germany.
Methods
We combine a dataset comprising ~30 000 observations of presences or absences of bee occurrences from structured surveys at 269 sites in northern Europe to investigate if flower preferences modulate species distributions across multiple environmental gradients. Bees were assigned a continuous functional trait separating preference for short vs tubular flowers.
Results
We observe that bee flower preference for either tubular flowers (Fabaceae) or plants with shallow flowers (including Apiaceae and Brassicaceae) can be described by a continuous flower preference trait score. The likelihood of observing a bee along a latitudinal gradient – encompassing variation in temperature, atmospheric N deposition and elevation – is dependent on its flower preference trait score. Specifically, bees with preferences for tubular flowers has a higher likelihood of occurrence with higher latitudes, while bees with preference for non-tubular flowers increase towards the south.
Main conclusions
Our results improve our understanding of how species-specific variation in flower preferences drives community-wide shifts in diversity and can therefore help devise region-specific conservation strategies.
The dataset contains wild bee survey data from four sources: Sydenham et al. 2022a (32 sites) Sydenham el al. 2022b-c (115 sites), Sydenham et al. 2023 (31 sites) and Sydenham et al. 2024 (72 sites). The raw data from the cited publications were transformed to occurrence data (presence/ absence) used in the analysis of the paper of Torvanger et al., 2025 “The Distribution of Wild Bee Species Along a Latitudinal Gradient in Northern Europe Depends on Their Flower Preferences” published in Diversity and Distributions.
Using generalized linear mixed effect models, we observe that bee flower preference for either tubular flowers (Fabaceae) or plants with shallow flowers (including Apiaceae and Brassicaceae) can be described by a continuous flower preference trait score. The likelihood of observing a bee along a latitudinal gradient—encompassing variation in temperature, atmospheric N deposition and elevation—is dependent on its flower preference trait score. Specifically, bees with preferences for tubular flowers have a higher likelihood of occurrence with higher latitudes, while bees with preference for non-tubular flowers increase towards the south. How the analysis was carried out is found in the first script listed below.
The dataset also contains environmental data as tif files (raster layers), used as covariates in the GLMMs (CovariateRasters.tif, ELC10_Agg_10K.tif, ELC10_Agg_10K.tif.aux.xml, ElevationCropped.tif, PCA1map.tif, and PCA2map.tif) and a plant-bee network dataset gathered from Rasmussen et al., 2021 used to calculate the flower preference niche scores (DCA1) of each species. A more detailed description of each data file is found below.
Description of the data and file structure
BeePresenceAbsence.xlsx - contains the complete wild bee presence-absence dataset used in the analyses, containing 44 columns including environmental variables for all sites and functional traits of the wild bee species. Environmental variables extracted from each site using function extract from package terra in R (Hijmans 2024). Bioclimatic variables downloaded from the WorldClim database (Fick & Hijmans. 2017). Elevation data downloaded from the Global and European Digital Elevation Model (Copernicus DEM). Nitrogen deposition data downloaded from the MET Norway database. The environmental variables were combined on to two main axes of variation using PCA. They can be downloaded as raster maps found in the supplementary (“PCA1map.tif” and “PCA2map.tif”).
- Site - Site name
- Species - Currently valid scientific name of the specimen
- Latitude - Latitude coordinate of the site
- Longitude - Longitude coordinate of the site
- Method - Method of collection, three categories:
- Pan trap - the specimen was collected using a pan trap (sprayed in white, blue or yellow UV color)
- Netting - the specimen was collected using butterfly net along a 50 meter transect
- Point - the specimen was collected on one of two selected families of flowers: two plant tribes in Asteraceae and two tribes of Fabaceae
- Year - The year the specimen was collected
- Country - The country where the specimen was collected, three categories:
- NO - Norway
- DK - Denmark,
- GER - Germany
- datasetID - the name of the dataset
- Phenoscan2017 - dataset from Sydenham et al., 2022c
- PolliLand2019 - dataset from Sydenham et al., 2022a,b
- PolliLand2020 - dataset from Sydenham et al., 2022a,b
- PolliLand2021 - dataset from Sydenham et al., 2022a,b
- MetaComNet2021 - dataset from Sydenham et al., 2024
- PolliEffekt2022 - dataset from Sydenham et al., 2023
- Occurrence - whether the species is present (1) or absent (0) at the site
- WDEP_OXN - wet deposition of oxidised nitrogen deposition (N mg N/m2)
- WDEP_RDN - wet deposition of reduced nitrogen deposition (N mg N/m2)
- *DDEP_OXN_m2Grid -*dry deposition of oxidised nitrogen deposition (N mg N/m2)
- DDEP_RDN_m2Grid - dry deposition of reduced nitrogen deposition (N mg N/m2)
- ShannonLandscape - landscape diversity (Shannon index) within the 1 km radius around the site.
- UrbanCells - the proportion of landscape class: urban within 1 km of sampling locations according to the ELC10 map (Venter and Sydenham 2021).
- CropCells- the proportion of landscape class: cropland within 1 km of sampling locations according to the ELC10 map (Venter and Sydenham 2021).
- ForestCells- the proportion of landscape class: forest within 1 km of sampling locations according to the ELC10 map (Venter and Sydenham 2021).
- ShrubCells- the proportion of landscape class: shrubland within 1 km of sampling locations according to the ELC10 map (Venter and Sydenham 2021).
- GrassCells- the proportion of landscape class: grassland within 1 km of sampling locations according to the ELC10 map (Venter and Sydenham 2021).
- BarelandCells- the proportion of landscape class: bareland within 1 km of sampling locations according to the ELC10 map (Venter and Sydenham 2021).
- WaterCells- the proportion of landscape class: water within 1 km of sampling locations according to the ELC10 map (Venter and Sydenham 2021).
- WetlandCells- the proportion of landscape class: wetland within 1 km of sampling locations according to the ELC10 map (Venter and Sydenham 2021).
- bio1 - mean annual temperature
- bio12 - mean annual precipitation
- Total_Ndep - (N mg N/m2) - was calculated as the sum of reduced nitrogen deposition (=DDEP_RDN_m2Grid + WDEP_RDN) and oxidised nitrogen deposition (=DDEP_OXN_m2Grid + WDEP_OXN).
- Elevation - elevation (meters above sea level) of the site location
- DCA1 - The plant preference niche score of the wild bee species, extracted from the first axis from the detrended correspondence analysis (DCA) running the decorana function from package ‘vegan’ (Oksanen et al., 2022) on the plant-bee network dataset.
- DCA2 - second axis score from the detrended correspondence analysis (DCA).
- Family - the specimens family
- Sociality - whether the specimen is eusocial ('Eusocial'), solitary or communal ('Solitary') or unknown sociality ('NA')
- Lecty: - level of speciality in two categories: Oligolectic or Polylectic
- GroupTL - whether the specimen is a bumblebee with short tongue ('BombusST'), or a bumblebee with a long tongue ('BombusLT'), or not a bumblebee with a short tongue ('SolitaryST'), or not a bumblebee with a long tongue ('SolitaryLT').
- isBombus - if the specimen is a bumblebee (i.e., belongs to the genus Bombus) (TRUE) or not (FALSE)
- TotalOccurrenceSite - the number of occurrences (presences) of wild bee species at each site, i.e., the species richness at each site.
- PC1_climate, PC2_climate, PC3_climate, PC4_climate, PC5_climate - the PCA axes for the environmental data composition
- PC1_ELC10, PC2_ELC10, PC3_ELC10, PC4_ELC10, PC5_ELC10 - the PCA axes for the landscape composition
Rasmussen2021_plants_visited.xlsx - matrix dataset containing the plant-bee network gathered from Rasmussen et al., (2021) and Wood et al., (2021), used to calculate the plant preference trait score (DCA1) for the bee species. How the DCA analysis was carried out is found in the main script "S1_Flower_pref.R".
- The rows of this matrix consisted of different bee species with the columns consisting of plant families, with values recording the number of plant genera within each plant family the bee species have been observed to visit (Rasmussen et al. 2021; Wood et al. 2021).
- For example, Andrena humilis (Imhoff, 1832) has been recorded visiting seven genera within the family Compositae (Crepus, Hieracium, Hypochaeris, Leontondon, Pilosella, Taraxacum and Tragopogon) and was therefore assigned the score of 7 for
Compositae in the matrix.
CovariateRasters.tif - a stacked raster containing the covariates, which includes N deposition (MET Norway) and climate variables extracted from the Worldclim database (Fick & Hijmans. 2017).
ELC10_Agg_10K.tif - the land cover raster file - containing land cover data from ELC10 (Venter & Sydenham 2021) - was aggregated to 10km grids. We extracted all ELC10 raster values in a 1 km buffer around each site. How the values were extracted is found in a separate script "S2_ELC10_per_1K_buffer.R".
ELC10_Agg_10K.tif.aux.xml - the land cover raster file was aggregated to 10km grids. This file is a supplementary to "ELC10_Agg_10K.tif".
ElevationCropped.tif - elevation data downloaded from the Global and European Digital Elevation Model (Copernicus DEM), and cropped to the area of the study region.
PCA1map.tif - the environmental variables were combined onto two main axes of variation using PCA, and then rasterized. PCA1map contains the first PC axis. The PCA analysis is found in a separate script "S1_Flower_pref.R".
PCA2map.tif - the environmental variables were combined onto two main axes of variation using PCA, and then rasterized. PCA2map contains the second PC axis. The PCA analysis is found in a separate script "S1_Flower_pref.R".
Sharing/Access information
Data was derived from the following sources:
- Copernicus DEM. Global and European Digital Elevation Model (COP-DEM). EEA-10: European coverage.
https://spacedata.copernicus.eu/collections/copernicus-digital-elevation-model - Fick & Hijmans. 2017. S. E. Fick and R. J. Hijmans. WorldClim 2: new 1‐km spatial resolution climate surfaces for global land areas. International journal of climatology 2017 Vol. 37 Issue 12 Pages 4302-4315. https://www.worldclim.org/data/bioclim.html
- Hijmans R (2024). terra: Spatial Data Analysis. R package version 1.7-71, https://CRAN.R-project.org/package=terra - MET Norway. The Norwegian Meteorological institute (MET). EMEP MSC-W modelled air concentrations and depositions. https://emep.int/mscw/mscw_moddata.html.
- MET Norway. The Norwegian Meteorological institute. EMEP MSC-W modelled air concentrations and depositions. https://emep.int/mscw/mscw_moddata.html
- Oksanen et al., 2022. vegan: Community Ecology Package_. R package version 2.6-4.
- Rasmussen et al. 2021. Evaluating competition for forage plants between honey bees and wild bees in Denmark. Plos one 2021 Vol. 16 Issue 4 Pages e0250056.
- Sydenham et al. 2022 (a). Priority maps for pollinator habitat enhancement schemes in semi-natural grasslands. Landscape and Urban Planning 2022 Vol. 220 Pages 104354
- Sydenham et al. 2022 (b). High resolution prediction maps of solitary bee diversity can guide conservation measures. Landscape and Urban Planning 2022 Vol. 217 Pages 104267
- Sydenham et al. 2022 (c). Neutral processes related to regional bee commonness and dispersal distances are important predictors of plant–pollinator networks along gradients of climate and landscape conditions. Ecography 2022 Vol. 2022 Issue 12 Pages e06379.
- Sydenham et al. 2023. The contributions of flower strips to wild bee conservation in agricultural landscapes can be predicted using pollinator habitat suitability model. Ecological Solutions and Evidence 2023 Vol. 4 Issue 4 Pages e12283.
- Sydenham et al. 2024. Climatic conditions and landscape diversity predict plant-bee interactions and pollen deposition in bee-pollinated plants. Ecography. 2024(9), e07138.
- Venter & Sydenham 2021. Continental-scale land cover mapping at 10 m resolution over Europe (ELC10). Remote Sensing 2021 Vol. 13 Issue 12 Pages 2301
- Wood et al., 2021. Global patterns in bumble bee pollen collection show phylogenetic conservation of diet. Journal of Animal Ecology, 90(10), 2421-2430.
Code/Software
The scripts were created using R version 4.1.2
S1_Flower_pref.R - contains code for the modelling process and results (DCA, PCA, glmer) and figures used in the published paper. The occurrence data has been used to model predicted probability of wild bee occurrence along a latitudinal gradient in northern Europe, using generalized linear mixed effect models using package lme4 (Douglas Bates, Martin Maechler, Ben Bolker, Steve Walker (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), 1-48. doi:10.18637/jss.v067.i01.) The model output is found in a separate script “Appendix_Table_glmer” and the model details in the main script “S1_Flower_pref”.
S2_ELC10_per_1K_buffer.R - contains code for extracting land cover data (ELC10 values). The script extracts ELC10 values in a 1 km buffer zone around each site.
We combined datasets from six different wild bee surveys conducted within northern Europe. The combined dataset consisted of a total of 269 sites sampled between 2017–2022, located in roadsides (Sydenham et al., 2024; Sydenham et al., 2023) and semi-natural grasslands (Sydenham et al., 2022a; Sydenham et al., 2022b; Sydenham et al., 2022c).
We used binomial GLMMs (R package ‘lme4’) with the occurrence of a species (presence/ absence) as response variable to test for interactions between flower preference niche score (DCA1) and the bioclimatic environmental gradients (PC1 and PC2) while controlling for landscape composition (PC1 and PC2).
