Skip to main content

Wild bumble bee colony abundance, scaled by field size, predicts pollination services

Cite this dataset

Fleischer, Shelby; Strange, James; Lopez-Uribe, Margarita; McGrady, Carley (2021). Wild bumble bee colony abundance, scaled by field size, predicts pollination services [Dataset]. Dryad.


Although bee visitation rate to flowers is often used to assess both pollination services and bee abundance, the abundance of social species needs to be assessed by quantifying the number of colonies instead of the number of foraging individuals. Because accurately quantifying the number of wild bee colonies can be difficult, the relationship of visitation rates provided by foragers and the abundance of colonies contributing those foragers from the surrounding landscape is poorly documented for social species.  Here, we use genetic methods and statistical inference to estimate the abundance of wild colonies of Bombus impatiens in the surrounding landscape providing foragers to 30 commercial pumpkin (Cucurbita pepo) fields over 4 years across a 13,000 km2 area in Pennsylvania (U.S.A.).  We show that the abundance of wild colonies in the surrounding landscape providing foragers per field (colony abundance per field) ranges from 291 to 829 colonies per field. Furthermore, colony abundance per field has been stable across years, counties, and field size, resulting in a dilution of foragers from the available colonies across larger fields.  Wild colony abundance when scaled by field size, by expressing colony abundance on a per hectare basis, is predictive of visitation rate. Thus, we document the relationship of wild colony abundance per hectare to visitation rate at a level that provides sufficient pollination to a highly pollinator-dependent crop. As expected, genetic differentiation among sampled populations is essentially non-existent across different fields or regions, suggesting a panmictic population.  Although many Bombus species are in decline, we document abundant and genetically resilient wild populations of B. impatiens associated with a mass-flowering crop under current agricultural practices and provide baseline information needed to monitor these wild populations at a time when they face similar stressors implicated in the decline of congenerics.



Over the course of 4 years across 3 geographical regions, we conducted our studies in 30 commercial pumpkin fields in Pennsylvania (USA), a state that consistently ranks among the top 5 in pumpkin production in the U.S. (USDA NASS 2019).  We worked in three counties (Columbia, Lancaster, Centre), primarily in fields destined for wholesale markets, which include the largest fields in the state.  All fields were mapped, and field area was calculated using the polygon feature in Google Maps.

From a subset of 21 fields, pollination activity was measured between 0630 and 1200 hours when weather conditions were favorable for bee activity (> 15.5oC with low wind speeds).  A full analysis of these visitation data at a community level for 37 species, and their relationship with yield, is in McGrady et al. 2020. However, we provide a summary of the relevant methods for this work.  Visitation rates were measured within each pumpkin field by walking along four 100 m transects spaced 0, 25 m, 50 m and 100 m from the field edge. Along each transect, the number of B. impatiens visits to all male and female pumpkin flowers within ~ 1.5 m2 was recorded in 45 second intervals. Approximately 60 observations were completed for each of the 4 transects per field, for a total of approximately 240 observations per field (sampling support details are provided in Appendix 1). For analysis, all observations were averaged to provide a mean visitation rate per field, which we use here to relate to colony abundances also at a field level.  Variation of visitation rates at a sub-field scale, broken out for flower gender and bee taxa, is addressed in detail in McGrady et al. 2020.

We used microsatellite markers to estimate the number of colonies supplying foragers to each of our 30 pumpkin fields. After completing visitation rate measures, we collected approximately 200 B. impatiens foragers per field for a total of ~6,000 individuals. Each collection was completed on a single morning (0800 – 1200 h) during peak pumpkin flower bloom (range: July 24th – Aug 29th) within 2 weeks of visitation measures. Along random walks throughout the entirety of each field, actively foraging bees were collected by placing a 20 ml scintillation vial over a worker interacting with the reproductive portion of the pumpkin flower. Collected specimens were pin mounted to verify species identification and the right mesothoracic leg was removed for DNA extraction.


Microsatellite genotyping

            Forager DNA was extracted by individually placing leg fragments into 150 µl Chelex® 100 (5%, in milli-q H20) with 5µl of 10 mg/mL Proteinase K. Each sample was heated in a Mastercycler® pro thermocycler for 60 minutes at 55°C, 15 minutes at 99°C, 1 minute at 37°C, and 15 minutes at 99°C.  Extracted DNA was stored at -20°C until PCR amplification. Genotypes were obtained for each worker using a multiplex PCR reaction with approximately 1.2 µl bee DNA, 2 µl PCR buffer (5X Colorless GoTaq® Flexi Buffer), 0.2 µl BSA (0.1 mg/ml), 0.08 µl of 5 u/µl Taq polymerase (GoTaq® Flexi DNA Polymerase), 0.6 µl of 10 mM dNTP’s, 0.56 µl of 25 mM MgCl2, 2.8 µl of molecular grade H2O and 11 primers. Forward primers were labeled with FAM (blue), VIC (green), PET (red) or NED (yellow) and used at a working solution of 10 μM. After optimizing our primer sets, the final reaction volume was 10.7 µl and included the following primers: BTMS0066, B124, Btern01, BT28, BTMS0062, BTMS0073, BT10, BL11, BT30, B96, and BTMS0081 detailed in Appendix 2 (Estoup et al. 1995, Estoup et al. 1996, Funk et al. 2006, Stolle et al. 2009). Polymerase chain reactions were performed using Mastercycler® pro thermocyclers with a 3 minute 30 second initial denaturation at 95°C, followed by 30 cycles of 30 seconds at 95°C, 1 minute 15 seconds at an annealing temperature of 55°C  and 45 seconds at 72°C, then a final 15 minute extension at 72°C  with a hold temperature at 15°C. Amplified PCR products were sized on an ABI3730XL® DNA Analyzer (Applied Biosystems) with GeneScan LIZ 500 internal size standard (Applied Biosystems) at the Penn State Genomics Core Facility – University Park, PA. We scored alleles manually using Geneious V10.0.7 (Biomatters Ltd). Foragers successfully genotyped at >7 loci were included in subsequent analysis (exceeds 6 loci minimum precedence set in Lepais et al. 2010).


Colony Abundance

              To estimate colony abundance per field, genotyped foragers were assigned to full-sibship families (FS families, commonly referred to as detected colony numbers, represent a single mother, single sire group) using the maximum likelihood method implemented in COLONY v. (Jones and Wang 2010) assuming monogamous mating.  It is logistically impossible and ethically irresponsible to exhaustively sample every bee at a given location, and therefore detected colony numbers are likely an underestimate of total colonies providing foragers to a site because foragers representing some colonies would not have been collected. Therefore, we used Capwire v. 1.0 (Miller et al. 2005, see Pennell et al. 2013 for use with R) to estimate total colony abundance by determining the number of unsampled colonies based on the probability distribution of detected colonies represented by 1, 2, …, k foragers per site. Capwire provides two estimates of total colony abundance, that vary based on assumptions of within-field distribution, detailed in Goulson et al, 2010. In keeping with previous studies and biological assumptions of non-random within-field distribution, we used estimates based on the Two Innate Rate Model (TIRM) method. In order to scale colony abundance by field size, we used these estimates of colony abundance per field to calculate the number of colonies providing foragers per hectare of pumpkin by dividing the number of total colonies per field by the field area, thus creating a metric of colony abundance per hectare. Due to field management practices, we do not expect B. impatiens to be nesting within pumpkin fields, and we never encountered nests within fields during our sampling.  Our metrics reflect the number of B. impatiens colonies from the surrounding landscape which had foragers visiting pumpkin flowers, on a per field and per hectares basis.

To explore the stability of estimated colony abundances per field across time and space, we used a two-way ANOVA on a subset of 28 fields to evaluate the effect of year, region and their interaction on colony abundance per field. Fields from 2012 (n = 2) were excluded because only one region (Columbia county) was sampled in 2012.  We also used one-way ANOVAs to test the effect of year (2012, 2013, 2014 and 2015) and region (Center, Columbia and Lancaster counties) on mean estimated colony abundances per field using all 30 fields.

            We used simple linear regression to examine the relationships between pumpkin field area and both colony abundance per field and colony abundance per hectare. To explore the effect of mass-flowering crops on pollination services, we used simple linear regression to examine the relationship between commercial pumpkin field area and B. impatiens visitation rates to pumpkin flowers. 

            To explore the relationship between wild bumble bee colony abundance and pollination services, we used simple linear regression to examine the effect of B. impatiens colony abundance per field and colony abundance per hectare independently on B. impatiens visitation rates to pumpkin flowers.

We used JMP® Pro, Version 13.0.0 (SAS Institute 2007) to complete all Analysis of Variances (ANOVA), mean comparisons and regressions. For all analyses, significance was set at alpha equals 0.05. Simple linear regressions were completed using “Fit Model” with model personality “Standard Least Squares”, and emphases “Effect Leverage.” For curvilinear relationships, quadratic terms were tested. Visitation rates and colony abundances per field were normally distributed and did not require transformations. After removing a single outlier, colony abundances per hectare were also normally distributed.


Population genetic patterns

We removed duplicate members of each FS family such that large colonies would not be overrepresented and bias genetic tests which were computed in R (Appendix 3).  To assess a single generation at a time, we analyzed foragers from each year separately. We estimated population structure by field and region using G-statistics and Analysis of Molecular Variance (AMOVA). We calculated expected heterozygosity (HE) and allelic richness (AR) across the entire population. Expected heterozygosity (HE) is based on Nei’s unbiased estimated of gene diversity and was calculated using R package and function “poppr” (Kamvar et al, 2014) with sample sizes standardized to the smallest of 293 genotypes per year. Values range from 0 – 1, with 1 the highest level of diversity. Allelic richness (AR) was calculated per loci using 100 alleles for rarefaction to correct for varying sample sizes between years with the function “allele.richness” in the R package “hierfstat” (Goudet, 2005). AR was averaged across all loci per year to provide a single value of AR per site per year.  Values range from 0 – infinity, with higher values indicating higher allelic diversity.  We also calculated inbreeding coefficients (FIS) using “boot.ppfis(x)” in the R package “heirfstat” (Goudet, 2005). When the 95% confidence interval includes 0, the FIS is not significantly different from 0 - which indicates no inbreeding (i.e., random mating for the population).

We tested for genetic differentiation among individuals using a comparison of three G-statistics (Nei’s Gst, Hedrick’s Gst, Jost’s Dest) with ‘diff_stats’ in the R package “mmod” (Winter 2012). All three determine the degree of genetic differentiation between populations by comparing genetic diversity within and between populations. Values close to 0 indicate lack of differentiation, suggesting a panmictic population, whereas values close to 1 indicate populations that are completely differentiated (Meirmans and Hedrick 2011). For Jost’s DEST, we report values based on average heterozygosity as opposed to harmonic means.  Within each year, we tested for genetic differentiation between fields and then regions, using global and pairwise comparisons.

Using AMOVA, we looked for sources of genetic variation by partitioning variance among individuals, fields and regions for each year using ‘poppr.amova’ from the R package ‘poppr’ (Kamvar et al. 2014). Significant and relatively high variance between fields and regions could suggest population structuring. The genetic, visitation rate, colony abundance, and field size data are archived on Dryad (doi:10.5061/dryad.83bk3j9nf).


National Institute of Food and Agriculture, U.S. Department of Agriculture, under award number 2012-51181-20105, Developing Sustainable Pollination Strategies for U.S. Specialty Crops., Award: 2012-51181-20105