Influence of environmental covariates on pollinator community occupancy, detection, and richness across urban gardens in Richmond, Virginia (U.S.A.)
Data files
Oct 17, 2025 version files (288.15 KB total)
- README.md (8.89 KB)
- Ruppel_et_al_2024_archive.zip (279.26 KB)
Abstract
Pollination is an essential ecosystem service that supports reproduction and propagation of most of the world’s flowering plants. The dramatic decline in pollinators, especially insect pollinators, due to climate change and pesticide use threatens not only our food supply, but also the diversity of native plants. Urban areas, if well managed, can serve as corridors and reserves for pollinator species and benefit agricultural and natural ecosystems well beyond the urban environment. In this study, we assessed Mid-Atlantic (U.S.A.)-region urban garden plant-pollinator interactions, focusing on activity associated with two regionally-native plants: dense blazing star (Liatris spicata; Asteraceae) and clustered mountain mint (Pycnanthemum muticum; Lamiaceae). We conducted 350 visual surveys across 52 gardens and identified 14 taxa in 361 detection events, with 5 taxa dominating at 331 detections. We built multi-species occupancy models (MSOMs) in a Bayesian framework using site and survey covariates to evaluate variables that influenced species occupancy, detection, and richness. We found little influence of any variables on occupancy, and the intercept-only models resulted in species-specific occupancy that ranged from 0.04 (Halyomorpha halys) to 0.86 (Halictidae). For detection, we found that plant species and survey start time (or the interaction between the two) influenced detection of a majority of pollinators at the community level, while Julian date and urban distance (interaction) influenced a small number of species. Comparisons between the two plant species indicated that honey bees (Apis mellifera) and wasps (Vespoidea) were more likely to be detected on P. muticum than on L. spicata, while the reverse was true for Atalopedes campestris. All taxa became more detectable later in the day. A. mellifera and Bombus spp. had higher detection earlier in the year. Halictidae detections increased closer to the urban areas, while Bombus spp. detection increased farther from urban areas. The posterior medians of the number of taxa per site ranged from 5 to 8 and showed little evidence of differences across sites, but the composition did vary. The estimated number of taxa occurring across all sampled sites was 18, indicating that ~25% of taxa present at our study sites went completely undetected. Our study demonstrates that MSOMs can be an effective tool for monitoring and investigating the pollinator community. We were able to estimate occupancy for 14 observed insect taxa, 9 of which were detected fewer than 8 times. We also estimated effects of detection covariates that impacted multiple taxa and provide insight into ways to improve future pollinator monitoring efforts. These findings further our understanding of how plant species and the urban setting may variably influence pollinator activity and highlight the importance of urban gardens in supporting diverse insect communities.
Data are the result of visual pollinator surveys conducted in the greater Richmond, VA area at 50 gardens (sites) from June 28 to July 27, 2021. At each garden, observations were conducted on Liatris spicata, Pycnanthemum muticum, or both. For a given plant type, 5 separate inflorescences on different plants (spatial replicates) were each observed for 5 minutes. Pollinator visits were tallied and identified to the lowest taxonomic level possible. Twenty gardens were surveyed on both plant types (on different days for each plant type), for a total of 10 spatial replicates per garden, while 20 other gardens were surveyed only on P. muticum (5 spatial replicates each) and 10 gardens were surveyed only on L. spicata (also 5 spatial replicates each).
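As a quick check on the survey design described above, the following minimal R sketch tallies gardens and spatial replicates by group. The counts come from the text; the object and column names are illustrative only and do not appear in the archive.

```r
# Survey design as described in the text: number of gardens in each group
# and number of spatial replicates observed per garden in that group.
design <- data.frame(
  group      = c("both plant types", "P. muticum only", "L. spicata only"),
  gardens    = c(20, 20, 10),
  replicates = c(10, 5, 5)
)

# Each spatial replicate is one 5-minute visual survey.
design$surveys <- design$gardens * design$replicates

total_gardens <- sum(design$gardens)
total_surveys <- sum(design$surveys)
print(design)
```

The total number of spatial replicates computed this way agrees with the 350 visual surveys reported in the abstract.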
Potential explanatory variables (covariates) were also measured and can broadly be divided into two types. Site covariates did not vary over time and were consistent across all spatial replicates at a given site, whereas survey covariates could vary across spatial replicates at a given site. Survey covariates can be further divided into those measured only once per site visit (which therefore differ between plant types, but not between spatial replicates on the same plant type) and those for which a unique value was measured for every spatial replicate.
Data were analyzed in a multi-species occupancy modeling framework, as demonstrated in the included R scripts (see Code/Software section).
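To illustrate what an occupancy framework assumes about data like these, here is a small simulation sketch in base R. It is not taken from the included scripts: each taxon either occupies a site or not (a latent state), and a detection is possible only at occupied sites. The occupancy values 0.86 and 0.04 are the extremes reported in the abstract; the middle occupancy value and all detection probabilities are made up for illustration.

```r
set.seed(1)

nsite <- 50   # gardens
nrep  <- 5    # spatial replicates per garden
nspec <- 3    # illustrative number of taxa

psi <- c(0.86, 0.40, 0.04)  # occupancy probabilities (0.40 is hypothetical)
p   <- c(0.50, 0.30, 0.20)  # detection probabilities (all hypothetical)

# Latent occupancy state z[i, k]: 1 if taxon k occurs at site i.
z <- matrix(rbinom(nsite * nspec, 1, rep(psi, each = nsite)),
            nrow = nsite, ncol = nspec)

# Detection history y[i, j, k]: a taxon can be detected only where z = 1.
y <- array(NA_integer_, dim = c(nsite, nrep, nspec))
for (k in seq_len(nspec)) {
  for (j in seq_len(nrep)) {
    y[, j, k] <- rbinom(nsite, 1, z[, k] * p[k])
  }
}

# Naive occupancy: share of sites with at least one detection per taxon.
detected_any <- apply(y, c(1, 3), max)
naive <- colMeans(detected_any)
true_share <- colMeans(z)
```

Because detection is imperfect, `naive` can never exceed `true_share`, which is the bias that occupancy models are designed to correct.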
Description of the data and file structure
L.spicataDetection.xlsx and P.muticumDetection.xlsx
These are detection histories indicating the results of visual surveys on L. spicata and P. muticum, respectively. Each Excel file has 14 sheets, corresponding to the 14 taxa that were detected at least once. Within a given sheet (taxon), rows correspond to the 50 sites surveyed. The Site column contains arbitrary, unique, numerical identifiers for each garden, which are consistent across all sheets and files in the data set, and the remaining columns correspond to the 5 spatial replicates of that plant type. Blank cells indicate that the corresponding plant type was not surveyed at that site and should be read into R as NA. A 1 indicates that the taxon was observed at least once at that site on that spatial replicate, and a 0 indicates that the taxon was not observed at that site during that spatial replicate. Sheet names use the following notation:
| Notation | Scientific name |
|---|---|
| Bombus.spp. | Bombus spp. |
| XyVi | Xylocopa virginica |
| Apis | Apis mellifera |
| Halictidae | Halictidae family |
| CoOc | Coelioxys octodentatus |
| AnOb | Anthidium oblongatum |
| Wasps | Vespoidea superfamily |
| ErHo | Erynnis horatius |
| MaLi | Macrosiagon limbatum |
| HaHa | Halyomorpha halys |
| PiRa | Pieris rapae |
| AtCa | Atalopedes campestris |
| Megachile.spp. | Megachile spp. |
| EpBo | Epilachna borealis |
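The sketch below shows one way the per-taxon sheets could be combined into a site x replicate x taxon array in R. It is an illustration, not the authors' data_processing.R: the helper `read_detection()` assumes the readxl package (whose `read_excel()` reads blank cells as NA by default), the replicate column names `R1` and `R2` are hypothetical, and the tiny in-memory sheets stand in for the real files so the example runs without them.

```r
# Hypothetical helper: read every sheet (taxon) of a detection file into a
# named list of data frames. Requires readxl; it is defined but not called here.
read_detection <- function(path) {
  taxa <- readxl::excel_sheets(path)
  sheets <- lapply(taxa, function(s) {
    as.data.frame(readxl::read_excel(path, sheet = s))
  })
  names(sheets) <- taxa
  sheets
}

# Tiny stand-in for two sheets (made-up values). Site 2 was not surveyed on
# this plant type, so its cells are NA.
sheets <- list(
  Apis = data.frame(Site = 1:3, R1 = c(1, NA, 0), R2 = c(0, NA, 0)),
  XyVi = data.frame(Site = 1:3, R1 = c(0, NA, 1), R2 = c(0, NA, 1))
)

# Drop the Site column and stack the sheets into a 3-D array.
y <- simplify2array(lapply(sheets, function(s) as.matrix(s[, -1])))
dimnames(y)[[1]] <- as.character(sheets[[1]]$Site)

# Per site and taxon: 1 if detected on any replicate, NA if not surveyed.
detected <- apply(y, c(1, 3), max)
print(detected)
```

Keeping NA for unsurveyed sites, rather than recoding them as 0, preserves the distinction between "not detected" and "not surveyed".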
SiteCovariates.xlsx
Excel file with a single sheet. Because values of these covariates do not vary across spatial replicates, there is no need to distinguish between plant types. Rows correspond to sites. The Site column contains the unique identifiers described above, and the Longitutde and Latitude columns are geographic coordinates in decimal degrees with the WGS 84 datum (EPSG:4326). UrbanDistance is the distance in meters from the Virginia State Capitol building in Richmond, and GardenArea is the overall size of the garden in m².
L.spicataSurveyCovariates.xlsx and P.muticumSurveyCovariates.xlsx
These Excel files contain survey covariates, the values of which may vary between spatial replicates at the same site, corresponding to observations on L. spicata or P. muticum, respectively. Each file includes 3 sheets: ByPlantType, Illuminence, and StartTime. For a given plant type, all spatial replicates were surveyed during the same visit to that site. Accordingly, some covariates were measured only once per site visit; they do not vary between spatial replicates on the same plant type, but do vary between plant types (i.e. between the two files). These are recorded in the sheet ByPlantType. Rows again correspond to sites, and the Site column contains the same unique numeric identifiers described above. JulianDay records the day of the year in 2021 (e.g. January 01, 2021 would be 1, March 04, 2021 would be 63, etc.), Temperature is given in °C, and BloomRichness is a count of the number of nearby garden plants in bloom (excluding the study species). Only plants on the property of interest were included. As in the detection files above, empty cells indicate that the respective plant type was not surveyed at that site and should be read into R as NA.
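The JulianDay convention described above (day of the year, with January 1 as day 1) can be reproduced in base R as follows. The function name `julian_day` is illustrative and does not appear in the included scripts.

```r
# Day of the year (1 = January 1) for a date given as "YYYY-MM-DD".
julian_day <- function(date) {
  as.integer(format(as.Date(date), "%j"))
}

jan_first   <- julian_day("2021-01-01")
march_fourth <- julian_day("2021-03-04")

# First and last survey dates given in the data description.
survey_window <- julian_day(c("2021-06-28", "2021-07-27"))
print(survey_window)
```

Under this convention the survey period of June 28 to July 27, 2021 spans days 179 through 208.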
In addition to these, 2 covariates were measured separately at every spatial replicate, one on each of the remaining 2 sheets. The Illuminence sheet records light levels, in lux, measured immediately before observations were conducted, and StartTime gives the time of day, in 24-hour notation, at which observations commenced. On both sheets, the Site column contains the same unique identifiers as above, and the remaining 5 columns correspond to the spatial replicates on that plant type. Rows again correspond to sites.
Sharing/Access information
Data for 'distance from urban center' were derived from Google Maps (2021); https://maps.google.com
Code/Software
Workflow and software versions
Three R scripts are provided, which are intended to be run in the following sequence: data_processing.R, JAGS_code.R, and parallelization.R. Files data_processing.R and JAGS_code.R were developed and tested on R version 4.0.2 and JAGS version 4.3.0, with R packages readxl (v 1.3.1), abind (v 1.4.5), R2OpenBUGS (v 3.2.3.2.1), R2jags (v 0.6.1), and rjags (v 4.10). File parallelization.R was tested on a High Performance Computing (HPC) system running R version 4.0.3 and JAGS version 4.3.0, with R packages rjags (v 4.13), R2jags (v 0.7.1), snow (v 0.4.3), doSNOW (v 1.0.19), and foreach (v 1.5.1).
data_processing.R
The data_processing.R file reads in all of the above data, reformats them and combines them into a single R object, and standardizes all continuous covariates. The script will create and store the results in an intermediate file, prelim_bundle.RData. This serves as an input file for the next script.
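For readers unfamiliar with covariate standardization, a minimal sketch is given here. It assumes the common z-score form (subtract the mean, divide by the standard deviation, ignoring NA); the exact approach used in data_processing.R may differ, and the function name and temperature values are made up.

```r
# Center a continuous covariate to mean 0 and scale it to SD 1, ignoring NAs.
# Works on vectors and on site x replicate matrices alike.
standardize <- function(x) {
  (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
}

# Hypothetical temperatures (°C) for 3 sites x 2 replicates; site 3 was not
# surveyed on this plant type.
temp <- matrix(c(25, 30, NA,
                 28, 31, NA), nrow = 3)

temp_z <- standardize(temp)
print(round(temp_z, 2))
```

Standardized covariates put regression coefficients on a comparable scale and often improve MCMC mixing, while unsurveyed cells remain NA.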
JAGS_code.R
The file JAGS_code.R includes example R code to create JAGS input files (specified in the BUGS language within R), some final, model-specific data bundling, and an example of running a single model (with parallelization only across 3 MCMC chains) on an all-NA data set to examine the induced priors. The script produces text files of BUGS code and final data bundles as .RData files, which in turn serve as inputs for the final parallelization.R script. Our full analysis included fitting over 35 different models, so it would be prohibitive to include code for every model variation tested during this variable selection process. Instead, we provide a minimal set of examples sufficient to allow a user to recreate any of the intermediate models. The first example is our simplest, base/null model, with only a single detection covariate that was included in all models that we tested. The second example is our final model, with multiple detection covariates, and also includes the deviance calculations used for the Bayesian P-value and model assessment. Because this final model did not include any occupancy covariates, the third example illustrates the incorporation of an occupancy covariate.
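As a rough illustration of what "BUGS code written from within R" looks like, the sketch below writes a generic multi-species occupancy model with one detection covariate to a text file. This is not the authors' model: the priors, the parameter names, the covariate `x`, and the absence of data augmentation are all simplifying assumptions made for the example. See JAGS_code.R for the actual specifications.

```r
# Generic community occupancy model in the BUGS language (illustrative only).
model_text <- "
model {
  # Community-level hyperpriors on the logit scale (assumed, not the authors')
  mu.b0 ~ dnorm(0, 0.5)
  mu.a0 ~ dnorm(0, 0.5)
  mu.a1 ~ dnorm(0, 0.5)
  sd.b0 ~ dunif(0, 5)
  sd.a0 ~ dunif(0, 5)
  sd.a1 ~ dunif(0, 5)
  tau.b0 <- pow(sd.b0, -2)
  tau.a0 <- pow(sd.a0, -2)
  tau.a1 <- pow(sd.a1, -2)

  for (k in 1:nspec) {
    b0[k] ~ dnorm(mu.b0, tau.b0)   # taxon-specific occupancy intercept
    a0[k] ~ dnorm(mu.a0, tau.a0)   # taxon-specific detection intercept
    a1[k] ~ dnorm(mu.a1, tau.a1)   # taxon-specific effect of covariate x
    for (i in 1:nsite) {
      logit(psi[i, k]) <- b0[k]
      z[i, k] ~ dbern(psi[i, k])   # latent occupancy state
      for (j in 1:nrep) {
        logit(p[i, j, k]) <- a0[k] + a1[k] * x[i, j]
        y[i, j, k] ~ dbern(z[i, k] * p[i, j, k])
      }
    }
  }
}
"

# Write the model to a temporary text file that JAGS could read.
model_file <- tempfile(fileext = ".txt")
writeLines(model_text, model_file)
```

A file written this way can be passed as the model file to the R2jags or rjags fitting functions, together with a data list supplying `y`, `x`, `nsite`, `nrep`, and `nspec`. Missing values in `y` are treated by JAGS as unobserved nodes, whereas missing values in a covariate such as `x` generally need to be handled before fitting.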
parallelization.R
Due to the large number of models and the long run times of Bayesian models, parallelization was important for completing the analysis in a timely fashion. While not essential (models could be run sequentially), the file parallelization.R provides an example of how we reduced model run times. This code requires 9 cores and took about 15.5 hours to run on an HPC with AMD EPYC 7702 cores (base 2 GHz, boost up to 3.35 GHz), each with up to 8 GB of available memory. The output file from the final model will be about 1.5 GB, while the outputs from the other two models will be around 150 MB each. If these sizes are prohibitive for a particular computer system, the n.thin argument of the jags.parallel() function can be increased to make file sizes smaller.
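To see why increasing n.thin shrinks the output, the sketch below computes the approximate number of posterior draws retained for a given set of MCMC settings. The function name and the iteration counts are hypothetical (they are not the settings used in parallelization.R), and the formula is the usual approximation of keeping every n.thin-th post-burn-in iteration from each chain.

```r
# Approximate number of saved posterior draws across all chains.
saved_draws <- function(n.chains, n.iter, n.burnin, n.thin) {
  n.chains * floor((n.iter - n.burnin) / n.thin)
}

# Hypothetical settings, for illustration only.
draws_thin10 <- saved_draws(n.chains = 3, n.iter = 100000,
                            n.burnin = 50000, n.thin = 10)
draws_thin50 <- saved_draws(n.chains = 3, n.iter = 100000,
                            n.burnin = 50000, n.thin = 50)

# Ratio of saved draws, a rough proxy for the ratio of output sizes.
size_ratio <- draws_thin50 / draws_thin10
```

Since each saved draw stores one value per monitored parameter, output size scales roughly in proportion to the number of saved draws, at the cost of fewer samples for posterior summaries.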
Extracting results and visualizations followed standard processes for R2jags and rjags, thus we do not provide examples of such code.
