
Comparing bacterial microbiome composition of Xylocopa species across populations using PacBio 16S rRNA gene sequencing

Cite this dataset

Vannette, Rachel; Sbardellati, Dino; Handy, Madeline (2022). Comparing bacterial microbiome composition of Xylocopa species across populations using PacBio 16S rRNA gene sequencing [Dataset]. Dryad.


The gut microbiota of bees affect nutrition, immunity, and host fitness, yet the role of diet, sociality, and geographic variation in determining microbiome structure, including variant-level diversity and relatedness, remains poorly understood. Here, we use full-length 16S rRNA amplicon sequencing to compare the crop and gut microbiomes of two incipiently social carpenter bee species, Xylocopa sonorina and Xylocopa tabaniformis, from multiple geographic sites within each species’ range. We found that Xylocopa species share a set of core taxa consisting of Bombilactobacillus, Bombiscardovia, and Lactobacillus, found in >95% of all individual bees sampled; Gilliamella and Apibacter were also detected in the gut of both species with high frequency. The crop bacterial community of X. sonorina consisted almost entirely of Apilactobacillus, with occasionally abundant nectar bacteria. Despite sharing core taxa, Xylocopa species’ microbiomes were distinguished by multiple bacterial lineages, including species-specific variants of core taxa. The use of long-read amplicons revealed otherwise cryptic species- and population-level differentiation in core microbiome members that was masked when a shorter fragment of the 16S rRNA gene (V4) was considered. Of the core taxa, Bombilactobacillus and Bombiscardovia exhibited differentiation in ASVs among bee populations, but this was lacking in Lactobacillus, suggesting that bacterial genera in the gut may be structured by different processes. We conclude that these Xylocopa species host a distinctive microbiome, similar to that of previously characterized social corbiculate apids, which suggests that further investigation to understand the evolution of the bee microbiome and its drivers is warranted.


Between 2019 and 2020, 33 X. sonorina and 22 X. tabaniformis adults were collected. Bees were captured in one of three ways: netted while foraging, caught using traps over the nest entrance (Supplementary Figure S1), or through the excavation of logs to sample entire nests. The type of capture was recorded, with bees caught foraging in flight denoted as ‘foraging’ and those captured within or exiting a log denoted as ‘nest-caught’. Samples were obtained from Davis, CA (21 X. sonorina; 14 X. tabaniformis), Anza-Borrego Desert State Park in Southern California (1 X. sonorina; 8 X. tabaniformis), and Tempe, Arizona (11 X. sonorina). Samples from Davis and Anza-Borrego were captured during late summer and early fall, while samples from Tempe were collected in early summer (see data file for collection date). Bee species can be distinguished morphologically, and identifications were verified using voucher specimens at the Bohart Museum of Entomology. Captured carpenter bees were photographed, then killed by placing them in a -20 °C freezer, where they were stored until dissection.

(B) Sample processing and DNA extraction: Before dissection, carpenter bee samples were rinsed in 70% ethanol for fifteen seconds. They were then air-dried and placed in a sterile petri plate for dissection. Dissecting tools were flame sterilized before dissection and before each organ removal. The crop and the rest of the gut (combined midgut and hindgut) were separated and stored at -20 °C until DNA extraction.

Microbial DNA was extracted separately from individual crop and gut samples, with kit reagents alone serving as a blank extraction control, using the Qiagen DNeasy PowerSoil Kit with slight modifications (Rubin et al. 2014). Modifications included adding 4 magnetic beads to each PowerBead Tube after tissue samples had been added and beating tubes in a BeadBlaster 24 Homogenizer for 3 cycles at speed 7 for 20 seconds per cycle. Then 60 μL of Solution C1 and 2 μL of Proteinase K solution (600 mAU/mL, from Qiagen Tissue and Blood) were added to each tube, and tubes were incubated overnight at 56 °C. The following day, tubes were beaten once more using the same cycle settings, and the remainder of the extraction followed the manufacturer’s protocol beginning at step 6. Extracted DNA was sent for library prep and sequencing at Dalhousie University Integrated Microbiome Resource Facility. Briefly, the full-length 16S rRNA gene was amplified in duplicate using primers 27F (AGRGTTYGATYMTGGCTCAG; Paliy et al. 2009) and 1492R (RGYTACCTTGTTACGACTT; Lane 1991) as full-length fusion primers (PacBio adapters + barcodes + specific regions). PCR products were visualized using a Hamilton Nimbus Select robot with Coastal Genomics Analytical Gels. PCR products were pooled within a sample, cleaned and normalized using the Charm Biotech Just-a-Plate 96 Well Normalization kit, and quantified fluorometrically before sequencing. PacBio Sequel 2 chemistry was used in sequencing, performed by the Dalhousie University Integrated Microbiome Resource facility (Halifax, Canada).

(C) Bioinformatics: Preliminary processing and filtering of raw full-length 16S rRNA reads into Amplicon Sequence Variants (ASVs) was performed in R v4.1.0 (R Core Team, 2021) using DADA2 v1.20.0 (Callahan et al. 2019). Primer sequences were removed, and reads were filtered by size and quality to yield sequences of 1000–1600 bp with no ambiguous bases, a maximum of 2 expected errors, and a minimum quality score of 3. Filtered reads were then dereplicated, and sequencing errors were inferred using the PacBioErrfun function and removed. Chimeras were inferred with a minFoldParentOverAbundance value of 3.5 and removed using the consensus method. Finally, taxonomy was assigned using the BEExact database (Daisley and Reid 2021) and SILVA v138.1 (Quast et al. 2012); the resulting assignments were nearly identical, so we present assignments from BEExact below.
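As a rough illustration of the read-filtering criteria described above (length 1000–1600 bp, no ambiguous bases, at most 2 expected errors, minimum quality score of 3), the following Python sketch applies them to a single read. It is not the DADA2 code used for this dataset, which was run in R. The function names and default arguments are hypothetical, and the expected-error calculation assumes the standard Phred relationship p = 10^(-Q/10).

```python
def expected_errors(quals):
    """Sum of per-base error probabilities, assuming Phred scores: p = 10^(-Q/10)."""
    return sum(10 ** (-q / 10) for q in quals)


def passes_filter(seq, quals, min_len=1000, max_len=1600,
                  max_n=0, max_ee=2.0, min_q=3):
    """Return True if a read meets the length and quality criteria.

    seq   : read sequence as a string
    quals : list of integer Phred quality scores, one per base
    Defaults mirror the thresholds stated in the text; the function itself
    is an illustrative stand-in, not part of DADA2.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lengths differ")
    if not (min_len <= len(seq) <= max_len):
        return False                      # outside the 1000-1600 bp window
    if seq.upper().count("N") > max_n:
        return False                      # contains ambiguous bases
    if min(quals) < min_q:
        return False                      # at least one base below the minimum quality
    return expected_errors(quals) <= max_ee
```

For example, a 1200 bp read with a uniform quality score of 40 has about 0.12 expected errors and passes, whereas the same read at a uniform quality of 20 has about 12 expected errors and is discarded.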

The ASV and taxonomy tables generated from the DADA2 pipeline outlined above were merged with metadata using phyloseq (McMurdie and Holmes 2013). ASVs classified as chloroplast or mitochondria were removed, any ASVs found in the control sample were removed, and only samples with more than 500 total sequences were retained (see sampling curves in Supplementary Figure S2).
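The filtering steps above can be sketched in plain Python as follows. This is a minimal illustration only: the original analysis used phyloseq in R, the data structures and function name here are hypothetical, and the sketch assumes that per-sample read totals are computed after the organelle and control ASVs have been removed, which the text does not state explicitly.

```python
def filter_asv_table(counts, taxonomy, control_id, min_reads=500):
    """Apply the three filters described in the text to a toy ASV table.

    counts     : {sample_id: {asv_id: read_count}}
    taxonomy   : {asv_id: lineage string}
    control_id : sample_id of the blank extraction control
    Returns a new {sample_id: {asv_id: read_count}} without the control.
    """
    # 1. ASVs classified as chloroplast or mitochondria
    organelle = {
        asv for asv, lineage in taxonomy.items()
        if "chloroplast" in lineage.lower() or "mitochondria" in lineage.lower()
    }
    # 2. ASVs detected at all in the blank extraction control
    in_control = {asv for asv, n in counts.get(control_id, {}).items() if n > 0}
    drop = organelle | in_control

    kept = {}
    for sample, asvs in counts.items():
        if sample == control_id:
            continue
        remaining = {a: n for a, n in asvs.items() if a not in drop and n > 0}
        # 3. Retain only samples with more than min_reads total sequences
        if sum(remaining.values()) > min_reads:
            kept[sample] = remaining
    return kept


# Invented toy data, for illustration only (not values from the dataset)
toy_counts = {
    "bee1_gut": {"asv1": 900, "asv2": 50, "asv3": 10},
    "bee2_crop": {"asv1": 300, "asv4": 400},
    "blank": {"asv4": 5},
}
toy_taxonomy = {
    "asv1": "Bacteria;Firmicutes;Lactobacillus",
    "asv2": "Bacteria;Cyanobacteria;Chloroplast",
    "asv3": "Bacteria;Actinobacteria;Bombiscardovia",
    "asv4": "Bacteria;Proteobacteria;Pseudomonas",
}
toy_filtered = filter_asv_table(toy_counts, toy_taxonomy, control_id="blank")
```

In this toy case, asv2 is dropped as a chloroplast sequence and asv4 because it appears in the blank, so bee2_crop falls to 300 reads and is excluded while bee1_gut is retained with 910 reads.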

Usage notes

R, DADA2


National Science Foundation of Sri Lanka, Award: 1929516

UC Davis Provost's Fellowship

University of California Natural Reserve System