Data from: Pollen diet, more than geographic distance, shapes provision microbiome composition in two species of cavity-nesting bees
Data files
Dec 24, 2025 version files 4.84 MB
- bacterialasv.csv (935.97 KB)
- bacterialtaxonomy.csv (474.95 KB)
- fungalasv.csv (1.12 MB)
- fungaltaxonomy.csv (1.49 MB)
- pollenasv.csv (322.66 KB)
- pollentaxonomy.csv (475.16 KB)
- README.md (3.36 KB)
- sample_data_export-mod.csv (15.83 KB)
Abstract
The microbial composition of stored food can influence its stability and determine the microbial species consumed by the organism feeding on it. Many bee species store nectar and pollen in provisions constructed to feed developing offspring. Yet whether microbial composition is determined by the pollen types within provisions, variation between bee species at the same nesting sites, or geographic distance was unclear. Here, we sampled two species of co-occurring cavity-nesting bees in the genus Osmia at 13 sites in California and examined the composition of pollen, fungi, and bacteria in provisions. Pollen composition explained 15% of variation in bacterial composition and ~30% of variation in fungal composition, whereas spatial distance among sites explained minimal additional variation. The symbiotic microbe genera Ascosphaera, Sodalis, and Wolbachia showed contrasting patterns of association with pollen composition, suggesting distinct acquisition and transmission routes for each. Comparing provisions from both bee species composed of the same pollens points to environmental acquisition, rather than bee species, as a key factor shaping the early stages of the bee microbiome in Osmia. The patterns we observed also contrast with the Apilactobacillus-dominated provision microbiome in other solitary bee species, suggesting variable mechanisms of microbial assembly in stored food among bee species.
Dataset DOI: 10.5061/dryad.crjdfn3h9
Description of the data and file structure
Files and variables
File: bacterialtaxonomy.csv
Description:
Variables
- ASV: Amplicon sequence variant
- X: Placeholder
- Kingdom: Assigned Kingdom
- Phylum: Assigned Phylum
- Class: Assigned Class
- Order: Assigned Order
- Family: Assigned Family
- Genus: Assigned Genus
- Species: Assigned Species
File: sample_data_export-mod.csv
Description:
Variables
- SampleLabel: Sample ID
- Site: Sampling site label
- Species: Bee species, L = Osmia lignaria, R = Osmia ribifloris
- Nest: Nest ID
- CellPos: Cell position within the nest, with 1 indicating the first cell provisioned in a given nest
- Type: Indicates if samples were taken in an agricultural or natural site
- Year: Year sampled
- FungiCode: Code for matching with the fungal dataset
- Kit: Which kit was used for DNA extraction
- SiteMatch: Site label used for matching
- CollectionDate: Date that the cell was collected
- Lat: Latitude of collection site
- Lon: Longitude of collection site
- County: County of collection site
- RibNests: Estimated total number of O. ribifloris nests at that site
- LigNests: Estimated total number of O. lignaria nests at that site
- Sample_or_Control: Indicates if this is a sample or control (blanks removed for this dataset)
- is.neg: Indicates if this is a negative control
File: pollenasv.csv
Description:
Variables
- SampleLabel: Sample label for matching with sample data
- The remaining columns are pollen ASVs that match the Sequence column of pollentaxonomy.csv; values are the raw sequence counts within a given sample
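Because the values in the ASV tables are raw sequence counts, a common first step is converting each sample's counts to proportions. The following is a minimal sketch of that conversion, not the authors' workflow (they used R and phyloseq). It assumes the layout described above (a `SampleLabel` column followed by one column per ASV) and integer counts; the sample labels, ASV names, and counts in `EXAMPLE` are invented for illustration.

```python
import csv
import io

# Hypothetical excerpt in the layout described for pollenasv.csv:
# one row per sample, a SampleLabel column, then one column per ASV.
# The labels, ASV names, and counts are invented for illustration.
EXAMPLE = """SampleLabel,ASV_1,ASV_2,ASV_3
S01,80,15,5
S02,0,50,50
"""


def relative_abundance(handle):
    """Return {sample label: {ASV: proportion of that sample's reads}}."""
    result = {}
    for row in csv.DictReader(handle):
        label = row.pop("SampleLabel")
        # Assumes counts are stored as whole numbers.
        counts = {asv: int(value) for asv, value in row.items()}
        total = sum(counts.values())
        # A sample with no reads gets zeros instead of a division error.
        result[label] = {
            asv: (n / total if total else 0.0) for asv, n in counts.items()
        }
    return result


proportions = relative_abundance(io.StringIO(EXAMPLE))
print(proportions["S01"])
```

To run this on the real file, an open file handle for `pollenasv.csv` could be passed in place of the `io.StringIO` object.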
File: pollentaxonomy.csv
Description:
Variables
- Sequence: Sequence (matches to pollen ASV)
- X: Placeholder
- Kingdom: Assigned Kingdom
- Phylum: Assigned Phylum
- Class: Assigned Class
- Order: Assigned Order
- Family: Assigned Family
- Genus: Assigned Genus
- Species: Assigned Species
File: fungalasv.csv
Description:
Variables
- SampleLabel: Sample label to match to sample metadata
- The remaining columns are counts of each ASV present within each sample
File: bacterialasv.csv
Description:
Variables
- SampleLabel: Sample label to match to sample metadata
- The remaining columns are counts of each ASV present within each sample
File: fungaltaxonomy.csv
Description:
Variables
- Sequence: Sequence (matches to fungal ASV)
- X: Placeholder
- Kingdom: Assigned Kingdom
- Phylum: Assigned Phylum
- Class: Assigned Class
- Order: Assigned Order
- Family: Assigned Family
- Genus: Assigned Genus
- Species: Assigned Species
Code/software
No specific software is needed to view these data, but we used R and RStudio, along with the R package "phyloseq," to combine these files into a single phyloseq object and perform data manipulations and statistical analyses.
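For readers who do not use R, the files can also be combined with general-purpose tools. The sketch below is an illustration under stated assumptions, not a reproduction of the phyloseq workflow: it joins an ASV count table to the sample metadata on `SampleLabel` and decodes the `Species` codes listed above. The inline data are invented stand-ins for `sample_data_export-mod.csv` and `bacterialasv.csv`, and the function names are hypothetical. The metadata also lists a `FungiCode` column for matching with the fungal dataset, so the join key may need adjusting for `fungalasv.csv`.

```python
import csv
import io

# Invented stand-ins for sample_data_export-mod.csv and bacterialasv.csv.
METADATA = """SampleLabel,Site,Species,Nest,CellPos
S01,A,L,1,1
S02,A,R,2,1
S03,B,L,3,2
"""
COUNTS = """SampleLabel,ASV_1,ASV_2
S01,120,30
S02,10,400
"""

# Species codes as documented for the Species column of the metadata file.
SPECIES_NAMES = {"L": "Osmia lignaria", "R": "Osmia ribifloris"}


def load_by_label(handle):
    """Index the rows of a CSV file by their SampleLabel value."""
    return {row["SampleLabel"]: row for row in csv.DictReader(handle)}


def join_counts_to_metadata(meta_handle, counts_handle):
    """Attach site, bee species, and total reads to each sequenced sample."""
    metadata = load_by_label(meta_handle)
    joined = []
    for label, row in load_by_label(counts_handle).items():
        if label not in metadata:
            continue  # skip count rows that have no metadata entry
        sample = metadata[label]
        counts = {k: int(v) for k, v in row.items() if k != "SampleLabel"}
        joined.append({
            "SampleLabel": label,
            "Site": sample["Site"],
            "Bee": SPECIES_NAMES.get(sample["Species"], sample["Species"]),
            "TotalReads": sum(counts.values()),
        })
    return joined


records = join_counts_to_metadata(io.StringIO(METADATA), io.StringIO(COUNTS))
for record in records:
    print(record)
```

Samples that appear in the metadata but not in the count table (here `S03`) simply do not appear in the result, which mirrors how samples dropped during filtering would be absent from an ASV table.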
Access information
Other publicly accessible locations of the data:
- Sequence data have been deposited to NCBI SRA under BioProject ID PRJNA1246220.
Data was derived from the following sources:
- NA
Study system
We sampled the stored nectar and pollen within brood cells of two species of cavity-nesting Osmia (mason bees, Megachilidae) that nest in early spring with partially overlapping phenology, and that often co-occur. It is common to see nests of the two species within the same piece of wood nesting substrate within a microsite. Osmia lignaria is managed as a pollinator for early-flowering orchard crops and is a diet generalist (Williams & Tepedino 2003). Osmia ribifloris is a diet specialist that forages mainly on host plants in the Ericaceae (Sampson et al. 2013). Both Osmia species are common across the western United States and restricted to North America as of 2025 (discoverlife.org). Nests of each species are distinguished by nesting material: O. lignaria constructs nest partitions using mud, while O. ribifloris uses masticated leaves (Fig. 1).
Sampling design
We sampled Osmia populations at 13 geographic locations, hereafter ‘sites’, in the foothills of the western slope of the Sierra Nevada in California (Fig. 2), where populations of each species were established or bees were newly released. At each site, nesting was encouraged by placing clean (washed and heat-treated) nesting blocks in early spring. In 2020, populations were sampled at 10 sites that hosted natural populations of either or both species (Table S1). In 2022, populations of O. ribifloris were released at 3 agricultural sites, and a subset of the natural source populations was also re-sampled (Table S1).
To collect provision material, we transferred whole provisions to sterile 1.7 mL centrifuge tubes using single-use plastic pipettes or sterile toothpicks. Provisions were sampled when brood cells contained an egg (most common) or a 1st-instar or, rarely, 2nd-instar larva, and when no frass was present in the brood cell. Larvae of these species have a blind gut and do not defecate until the 5th instar (Torchio 1989). Collected provisions were stored at -80°C until DNA extraction. At each site, we sampled provisions from 3 nests per species and 2 brood cells within each nest, or as many as were available.
Sample processing
DNA was extracted from approximately one third of each provision using the Qiagen PowerSoil kit (Germantown, MD, USA) following the manufacturer’s instructions. Extracted DNA was quantified using a NanoDrop spectrophotometer and submitted for amplicon sequencing to the Integrated Microbiome Resource sequencing facility (IMR, Dalhousie University, Nova Scotia, Canada). Plant sequences were targeted using the rbcL primers rbcLaF (5’-ATGTCACCACAAACAGAGACTAAAGC-3’) and rbcL506R (5’-AGGGGACGACCATACTTGTTCA-3’). Bacterial 16S rRNA sequences were amplified using 799F (5’-AACMGGATTAGATACCCKG-3’) and 1115R (5’-AGGGTTGCGCTCGTTG-3’) to reduce contamination with mitochondrial and chloroplast sequences. Fungal ITS sequences were amplified using ITS86 (5’-GTGAATCATCGAATCTTTGAA-3’) and ITS4 (5’-TCCTCCGCTTATTGATATGC-3’). Kit extraction blanks were also included in sequencing runs for all sample types (N = 2 per region). PCR was performed in duplicate using high-fidelity Phusion Plus polymerase, followed by amplicon cleanup and normalization using the Charm Just-a-Plate Normalization kit; MiSeq PE300 sequencing was performed by Dalhousie IMR.
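The 799F primer contains the IUPAC degenerate bases M (A or C) and K (G or T), so it represents a small pool of concrete oligonucleotides rather than a single sequence. As an illustrative sketch only (the function name is hypothetical and this is not part of the authors' pipeline), the pool can be enumerated using the standard IUPAC nucleotide codes:

```python
from itertools import product

# Standard IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """List every non-degenerate sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


# 799F as given in the text, written 5' to 3'.
PRIMER_799F = "AACMGGATTAGATACCCKG"

variants = expand_primer(PRIMER_799F)
print(len(variants))  # two degenerate positions with two options each -> 4
for variant in variants:
    print(variant)
```

The same function can be applied to the other primers listed above; those contain no degenerate bases, so each expands to a single sequence.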
Sequence data processing
Sequence data were quality filtered, trimmed, error-corrected, merged, and had chimeras removed using the DADA2 pipeline (Callahan et al. 2016); see the full code in the accompanying Dryad and Zenodo repository. This generated a table of amplicon sequence variants (ASVs) present in each sample. Sequence variants detected in the negative controls were removed from the dataset using Decontam (Davis et al. 2018), resulting in the removal of 8 fungal ASVs, 3 pollen ASVs, and 14 bacterial ASVs. Taxonomic annotation was performed using the UNITE database (v7.2) for ITS sequences (Nilsson et al. 2019), BEExact v2023.01.30 and SILVA v138.1 for 16S sequences, and a plant reference database for rbcL sequences (Bell & Brosi 2016). Manual BLAST searches were performed for the 100 most abundant ASVs in each dataset when they were unannotated at the family level or when the 16S databases disagreed, and annotations were updated when BLAST searches indicated high-confidence matches. Data wrangling was performed using phyloseq (McMurdie & Holmes 2013). All ASVs that were not assigned to an appropriate kingdom for the sequencing region (e.g. fungi for ITS) and to a phylum within that kingdom were filtered out; reads assigned as mitochondria or chloroplast were also filtered out of the ITS and 16S datasets. Following sequence curation, species accumulation curves were examined to ensure saturation (Figure S1), and samples below a minimum read count were dropped: 500 reads/sample for pollen (mean = 25,110; 2 samples dropped), 450 reads/sample for fungi (mean = 11,564; none dropped), and 500 reads/sample for bacteria (mean = 6,570; 20 samples dropped).
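The last two curation steps described above (taxonomic filtering, then the minimum read-depth cutoff) can be sketched as follows. This is a simplified illustration, not the authors' code, which is in R and available in the accompanying repository. It assumes taxonomy is held as one dictionary of ranks per ASV, that unassigned ranks are empty strings, and that organelle reads are labeled with the words "Mitochondria" or "Chloroplast" at some rank. The function names and example data are hypothetical, and the Decontam step is not covered.

```python
# Minimum reads per sample, as stated in the text above.
MIN_READS = {"pollen": 500, "fungi": 450, "bacteria": 500}


def keep_asv(taxon, expected_kingdom):
    """Keep an ASV only if it has the expected kingdom, an assigned phylum,
    and no rank labeled as mitochondria or chloroplast."""
    if taxon.get("Kingdom", "") != expected_kingdom:
        return False
    if not taxon.get("Phylum"):
        return False
    ranks = " ".join(str(value) for value in taxon.values()).lower()
    return "mitochondria" not in ranks and "chloroplast" not in ranks


def filter_dataset(counts, taxonomy, expected_kingdom, min_reads):
    """Drop unwanted ASVs, then drop samples left with too few reads."""
    kept_asvs = {
        asv for asv, taxon in taxonomy.items()
        if keep_asv(taxon, expected_kingdom)
    }
    filtered = {}
    for sample, asv_counts in counts.items():
        retained = {a: n for a, n in asv_counts.items() if a in kept_asvs}
        if sum(retained.values()) >= min_reads:
            filtered[sample] = retained
    return filtered


# Invented example data: ASV_2 is a chloroplast read, ASV_3 has no phylum.
taxonomy = {
    "ASV_1": {"Kingdom": "Bacteria", "Phylum": "Proteobacteria",
              "Order": "Enterobacterales"},
    "ASV_2": {"Kingdom": "Bacteria", "Phylum": "Cyanobacteria",
              "Order": "Chloroplast"},
    "ASV_3": {"Kingdom": "Bacteria", "Phylum": ""},
}
counts = {
    "S01": {"ASV_1": 600, "ASV_2": 900, "ASV_3": 10},
    "S02": {"ASV_1": 100, "ASV_2": 5000, "ASV_3": 0},
}

result = filter_dataset(counts, taxonomy, "Bacteria", MIN_READS["bacteria"])
print(result)
```

In this example, sample `S02` has many raw reads but is dropped because almost all of them are chloroplast reads. That is why the depth cutoff is applied after taxonomic filtering rather than before it.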
