16S rRNA data for an in vitro model of the human dental plaque bacterial community (3 hosts)

Cite this dataset

Zhou, Baoqing; Mobberley, Jennifer; Shi, Kelly; Chen, Irene (2022). 16S rRNA data for an in vitro model of the human dental plaque bacterial community (3 hosts) [Dataset]. Dryad.


The creation of oral microcosms with reproducible composition is important for developing model systems of the oral microbiome. Here, we report on the outcome of a methodologically simple but scientifically informative approach, in which we sample the dental plaque microbiome from 3 individuals and characterize the variability among the microbiomes after storage and subsequent propagation. We use 24-well culture plates with artificially generated pellicle under a defined anaerobic atmosphere and an undefined medium supplemented with nutrients for fastidious organisms to generate the cultures, including the initial, preserved, and propagated cultures. Harvested cultures are extracted with the Qiagen PowerSoil kit. Culture composition is determined by 16S rRNA sequencing on the Illumina MiSeq platform and processing with the mothur pipeline. Data analysis is performed in R with the phyloseq and vegan packages. Our results show that cultures from 2 out of 3 individuals cluster into an ‘attractor’ compositional type, and the samples from the remaining individual can adopt this compositional type after in vitro propagation, even though the original composition did not display this type. The results suggest that simple selective environments could help create reproducible microcosms from different individuals, in this case, reproducible microcosms composed of early colonizers of the dental plaque bacterial community. The attractor composition also has potential implications for synergistic interactions between members of the Streptococcus and Veillonella genera, and for antagonism between members of the Streptococcus and Prevotella genera. Together, these findings show that this dental microbiome model may be a promising starting point for a reproducible in vitro microbiome model that captures common "baseline" members of the human oral bacterial community.


Plaque Collection and Cultures: 

Sample collection from volunteer hosts took place under protocols 3-18-0189 and 3-19-0119, approved by the UCSB Human Subjects Committee. The collection process was as follows: supragingival plaque scrapes of molar teeth from three healthy adult hosts were collected with sterile metal curettes, after hosts had abstained from food, drink, and dental hygiene practices for 12 to 16 hours. For consistency, we limited the sites of the scrapes to the mucosal supragingival surfaces of 3 molar teeth, though with no restrictions on only upper or only lower molars. Plaque from each host was used to inoculate 6 mL of SHI media. Each 6 mL of inoculated SHI media was equally divided among 3 wells in a sterile 24-well plate. Prior to receiving the inoculum, all plates were preconditioned with a pellicle layer, formed by addition of 150 μL of filtered clarified saliva (BioIVT) incubated at 37°C for 1 hour, and then sterilized with UV radiation at 254 nm for 1 hour. A separate plate was used for samples from each host. Three negative control wells were prepared in a fourth plate by receiving the pellicle layer and 2.0 mL of SHI media without inoculum from host plaque.

All plates were incubated at 37°C, in an anaerobic vessel with an atmosphere of 85% nitrogen, 10% hydrogen, and 5% carbon dioxide. Every 24 hours, approximately 1.3 mL of spent media was pipetted from the top of each well and replaced with 1.5 mL of fresh SHI medium, followed by a supplement of 20 μL of 0.5% sucrose. Plates were then returned to the aforementioned anaerobic atmosphere, from which they were extracted and processed after a total of 72 hours. These 72-hour cultures were termed “initial cultures”. Two of the three inoculated wells were used in the preservation experiment described below. The third was pelleted at 16,000 x g for 5 minutes and flash-frozen in liquid nitrogen for later analysis.

We investigated several methods of preserving the microbial communities derived from the initial cultures. After the initial 72 hours, two control wells and two wells from each host were selected, and the entire 1.75 mL of culture in each well was divided into five aliquots of approximately 0.35 mL each for the preservation experiments. Sterile glycerol was added to one of the five aliquots to a final concentration of 20%, and this aliquot was flash-frozen for later analysis and comparison. The remaining four aliquots were subjected to the following four preservation conditions: 4°C for 1 day, 4°C for 3 days, cryopreservation with 40% glycerol at -80°C for 1.5 weeks, and the same cryopreservation conditions for 5.5 weeks. After preservation, approximately 170 μL of the 350 μL sample was divided into 2 equal aliquots, flash-frozen with 40% glycerol, and stored at -80°C for later analysis.
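The glycerol additions described above follow ordinary dilution arithmetic. The sketch below works through it for one 0.35 mL aliquot. The stock concentration (`STOCK_FRACTION`, 80% v/v) and the function name are assumptions made for illustration only; the source does not state what glycerol stock was used.

```python
def glycerol_volume_ul(sample_ul, final_fraction, stock_fraction):
    """Volume of glycerol stock (uL) to add so the mixture reaches final_fraction (v/v)."""
    if not 0 < final_fraction < stock_fraction <= 1:
        raise ValueError("need 0 < final_fraction < stock_fraction <= 1")
    # Solve stock_fraction * v = final_fraction * (sample_ul + v) for v.
    return final_fraction * sample_ul / (stock_fraction - final_fraction)


# One well of 1.75 mL split into five equal aliquots.
aliquot_ul = 1750 / 5

# ASSUMPTION: an 80% (v/v) glycerol stock; not stated in the source.
STOCK_FRACTION = 0.80

vol_for_20_percent = glycerol_volume_ul(aliquot_ul, 0.20, STOCK_FRACTION)
vol_for_40_percent = glycerol_volume_ul(aliquot_ul, 0.40, STOCK_FRACTION)

print(f"aliquot: {aliquot_ul:.0f} uL")
print(f"stock to add for 20% final: {vol_for_20_percent:.1f} uL")
print(f"stock to add for 40% final: {vol_for_40_percent:.1f} uL")
```

With a different stock concentration only `STOCK_FRACTION` changes; the formula itself is just the mass balance on glycerol.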

To assess microbial communities after revival from the preservation process, we used the remaining 180 μL of each 350 μL volume to inoculate approximately 6.5 mL of sterile SHI medium. The inoculated media was subsequently split into 3 wells in plates preconditioned as detailed above. These plates were incubated at 37°C for 48 hours, under the previously mentioned atmospheric conditions with replacement of non-sedimented liquid and supplementation of sucrose. From the resulting cultures, termed ‘propagated’ cultures, one well was pelleted at 16,000 x g for 5 minutes and flash-frozen as a backup. The other two wells were mixed with glycerol to a concentration of 20%, pelleted as above, flash-frozen in liquid nitrogen, and stored at -80°C until further processing.

DNA Extraction:

Initial cultures, preserved cultures, and propagated cultures, along with positive and negative controls, were processed using the DNeasy PowerSoil kit (Qiagen). Negative controls included culture controls mentioned above as well as extraction controls, which consisted of 200 μL of fresh 1X PBS. Positive controls were 200 μL of ZymoBiomics microbial community standard (D6300). Prior to extraction, samples were randomly divided into batches of 12 or fewer tubes (which could be processed in a single batch). Extraction was begun by incubating frozen pellets on ice and washing pellets with 1X PBS three times. Pellets were then resuspended in 200 μL of 1X PBS and DNA was extracted according to the manufacturer’s instructions.

The concentration of extracted DNA was measured using a fluorometric kit (Quant-iT PicoGreen; Invitrogen). Estimation of bacterial biomass post-extraction was performed using the SsoAdvanced Universal SYBR Green Supermix qPCR assay (Bio-Rad) with gene-specific primers designed to amplify the V4 region of the 16S rRNA gene. Quantitation by qPCR showed that experimental samples contained greater than 10,000 copies per μL (with the exception of two out of eighty-two samples) and all negative controls contained 17 to 2,885 copies per μL. These data confirmed that the experimental samples contained sufficient bacterial biomass for sequencing. The very low 16S rRNA copy numbers in the negative control samples were consistent with the expected low bacterial biomass. All samples were included for sequencing.

Amplicon Library Construction and Illumina MiSeq Sequencing:

16S rRNA gene amplicon libraries were constructed following the dual-index sequencing protocol developed by Kozich et al., using the gene-specific primers (515F-805R) designed by the Earth Microbiome Project. This primer set yields 250 bp amplicons, enabling full coverage of the region by the read 1 and read 2 sequencing primers. Amplicons were generated on 96-well plates using 1 μL of template DNA, 1 μL of each index primer, and 17 μL of Accuprime Pfx Supermix. Each plate contained a negative control well (1 μL of molecular grade water) and a positive control well (1 μL of ZymoBiomics community DNA standard). The PCR was performed under the following conditions: initial denaturation at 95°C for 2 minutes, followed by 30 cycles of denaturation at 95°C for 20 seconds, annealing at 55°C for 15 seconds, and extension at 72°C for 1 minute, followed by a final extension at 72°C for 10 minutes. Amplicons were purified using AMPure XP beads, then normalized and pooled in equimolar amounts. Each sample was sequenced in duplicate on different plates. Illumina MiSeq sequencing with PE300 V3 chemistry was performed in the genetics core of the Biological Nanostructures Laboratory within the California NanoSystems Institute at UCSB.

16S rRNA Bioinformatics Analysis:

The mothur software package (v. 1.40) was used to process the paired-end Illumina 16S rRNA gene reads. Briefly, read pairs were assembled into contigs, and contigs with ambiguous reads were excluded, to obtain 13,249,077 total contigs. The contigs were aligned to version 132 of the SILVA SSU reference non-redundant database, and those that did not align were removed. A denoising algorithm and chimera identification with UCHIME were performed for additional quality control, resulting in 11,662,068 high-quality sequences. The contigs were clustered at the 3% dissimilarity level to generate operational taxonomic units (OTUs). Sequences were classified against the SILVA SSU reference non-redundant database (version 132) using a naive Bayesian classifier (80% pseudobootstrap confidence score cutoff). The ZymoBiomics Community DNA served as our “mock community” to calculate sequencing error rates. When these mock communities were examined using the above pipeline, an overall error rate of 0.019277% was observed and 8 OTUs were detected (as expected from the 8 expected bacterial species in the mock community), indicating that the read processing pipeline has a low error rate and does not drastically overestimate diversity.
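To make the mock-community error rate concrete, the following is a minimal sketch of how such a rate can be computed: the number of mismatched positions divided by the number of compared positions, over reads aligned to their known reference sequences. The function name and the toy sequences are hypothetical, and this simplified count is not mothur's own error assessment, which is more detailed.

```python
def error_rate(aligned_pairs):
    """Fraction of compared alignment positions where a read differs from its reference."""
    mismatches = 0
    total = 0
    for read, reference in aligned_pairs:
        if len(read) != len(reference):
            raise ValueError("aligned sequences must have equal length")
        for read_base, ref_base in zip(read, reference):
            if read_base == "-" and ref_base == "-":
                continue  # column that is a gap in both sequences carries no information
            total += 1
            if read_base != ref_base:
                mismatches += 1
    return mismatches / total if total else 0.0


# Toy data (hypothetical): two 8-base reads, one with a single substitution.
pairs = [
    ("ACGTACGT", "ACGTACGT"),
    ("ACGTACGA", "ACGTACGT"),
]
rate = error_rate(pairs)
print(f"error rate: {rate:.4%}")
```

On this scale, the reported 0.019277% corresponds to roughly two erroneous positions per 10,000 compared.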

Statistical Analysis:

All statistical analyses were performed with R (v. 4.0.3) in RStudio (v. 1.3.959), with the phyloseq package (v. 1.34.0). Before analyzing the cultures, we examined both negative controls (culture controls at all stages of preservation and propagation and DNA extraction controls) for potential contamination by comparing the number of reads and OTUs between negative controls and cultures. We also examined positive controls (commercially available mock bacterial communities) by comparing the observed microbial DNA community (Zymo D6305) and the extracted microbial community (Zymo D6300) with the expected species distribution. After verifying that the negative and positive controls yielded expected results, we examined sequencing depth by host and by sample type. To account for differences in sequencing depth, we also examined the number of OTUs by host and sample type, and then rarefied samples to a uniform level (240 reads per sample, the lowest number of reads among cultures) and observed the number of OTUs detected after rarefaction. To investigate potential underestimation of OTUs due to a low number of reads, we constructed rarefaction curves for all samples, excluding one outlier with more than 700,000 reads and two samples with fewer than 1,000 reads (out of eighty-two samples). To determine whether any correlation existed between sequencing depth and the number of OTUs (taken to represent the number of species), we plotted the number of reads against three diversity indices while omitting controls, mock communities, samples with fewer than 1,000 reads, and an outlier with more than 700,000 reads. To determine the most common phyla across these same samples, we generated prevalence plots of sequencing depth vs. taxa prevalence expressed as a percentage. Rarefied samples were used for the following analyses, including OTUs with relative abundances greater than 0.1%.
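The rarefaction and abundance-filter steps above can be sketched as follows. This is an illustration in Python rather than the R code used for the analysis. It draws reads without replacement, which is an assumption, since the source does not state the sampling scheme. The function names, the seed, and the toy OTU counts are hypothetical.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from a dict of OTU read counts."""
    total = sum(counts.values())
    if depth > total:
        raise ValueError("sample has fewer reads than the requested depth")
    # Expand the counts into one label per read, then draw without replacement.
    reads = [otu for otu, n in counts.items() for _ in range(n)]
    rng = random.Random(seed)  # fixed seed so the subsample is reproducible
    rarefied = {}
    for otu in rng.sample(reads, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


def filter_by_relative_abundance(counts, min_fraction=0.001):
    """Keep OTUs whose relative abundance is strictly greater than min_fraction (0.1%)."""
    total = sum(counts.values())
    return {otu: n for otu, n in counts.items() if n / total > min_fraction}


# Toy sample (hypothetical counts).
sample = {"Otu1": 6000, "Otu2": 3000, "Otu3": 995, "Otu4": 5}

rarefied = rarefy(sample, depth=240)       # 240 reads per sample, as in the text
kept = filter_by_relative_abundance(sample)  # Otu4 (0.05%) falls below the cutoff

print(sum(rarefied.values()), sorted(kept))
```

Fixing the seed matters in practice: rarefaction is a random draw, so downstream diversity estimates are only reproducible if the same subsample can be regenerated.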

To examine the distribution of OTUs, we plotted relative abundances according to preservation conditions, for each host. Principal coordinate analysis (PCoA) was performed using relative abundances and the Bray-Curtis dissimilarity metric. Dot plots were used to examine the five most prominent OTUs, averaged by preservation conditions and categorized by host. Principal component analysis (PCA) was also performed on the dataset of initial, preserved, and propagated cultures. A scree plot was used to assess the contribution to variation from components 3 and beyond in the PCA. To assess the validity of performing PCA on relative abundance data, we used centered-log-ratio (CLR) and isometric log-ratio (ILR) transformations on the data and subjected the transformations to PCA. The mathematical procedure for CLR transformations is outlined in Chapter 2. The ILR transformation is defined as follows: 

$$\mathrm{ilr}(x) = \left[\langle x, e_1\rangle_a, \ldots, \langle x, e_{D-1}\rangle_a\right]$$

where $[e_1, \ldots, e_{D-1}]$ is an orthonormal basis in the simplex. Alternatively, ILR transformations can be done by using CLR basis sets in the following way:

$$\mathrm{ilr}(x) := V^{T}\,\mathrm{clr}(x)$$

where clr(x) is the centered log-ratio transform and V is a matrix whose columns form an orthonormal basis in the CLR plane. For this analysis, we used the default parameters from the ilr function of the compositions package in R to transform the relative abundances, and then subjected the transform to ordinary PCA procedures. We also performed significance tests using the ANOSIM function in the vegan package, with Bray-Curtis distance measurements and 999 permutations to test whether and how much preservation conditions and/or host differences influence the relative abundance differences across samples.
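The relationship between the two transforms can be illustrated with a short sketch: compute clr(x), build a matrix V whose columns are an orthonormal basis of the CLR plane, and project. The Helmert-type basis used here is one valid choice and is an assumption; it is not necessarily the basis the compositions package uses by default, so coordinates may differ by a rotation. The sketch also assumes strictly positive parts, and the function names are hypothetical.

```python
import math


def clr(x):
    """Centered log-ratio: log of each part minus the mean of the logs."""
    logs = [math.log(v) for v in x]  # requires strictly positive parts
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def helmert_basis(d):
    """Columns of V: D-1 orthonormal vectors of length D, each summing to zero."""
    columns = []
    for i in range(1, d):
        norm = math.sqrt(i * (i + 1))
        # i entries of 1/norm, one entry of -i/norm, then zeros.
        columns.append([1.0 / norm] * i + [-i / norm] + [0.0] * (d - i - 1))
    return columns


def ilr(x):
    """Isometric log-ratio coordinates, computed as V^T clr(x)."""
    c = clr(x)
    return [sum(v * cj for v, cj in zip(column, c)) for column in helmert_basis(len(x))]


# Toy composition (hypothetical relative abundances of three OTUs).
x = [0.5, 0.3, 0.2]
z = ilr(x)
clr_norm_sq = sum(v * v for v in clr(x))
ilr_norm_sq = sum(v * v for v in z)
print(z, clr_norm_sq, ilr_norm_sq)
```

The two squared norms agree, which is the "isometric" property: the D-1 ILR coordinates preserve distances from CLR space while removing its sum-to-zero redundancy, so ordinary PCA can be applied to them directly.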


Institute for Collaborative Biotechnologies, UCSB