Impact of smoking cessation, coffee and bread consumption on the intestinal microbial composition among Saudis: A cross-sectional study

Cite this dataset

Harakeh, Steve et al. (2020). Impact of smoking cessation, coffee and bread consumption on the intestinal microbial composition among Saudis: A cross-sectional study [Dataset]. Dryad.


The gut microbiota is often affected by the dietary and lifestyle habits of the host, resulting in greater efficiency in harvesting energy from consumed food. Our objective was to characterize the composition of the gut microbiota in adult Saudis and to investigate possible associations with lifestyle and dietary practices. Feces from 104 Saudi volunteers (48% males) were tested for microbiota by sequencing the V3-V4 region of the bacterial 16S ribosomal RNA (rRNA) gene. For all participants, data were collected on their lifestyle habits and dietary practices. The relative abundance (RA) of Fusobacteria was significantly higher in normal-weight Saudis (p = 0.005, FDR = 0.014). Individuals who consumed more coffee had a marginally significantly higher RA of Fusobacteria (p = 0.02, FDR = 0.20) in their gut microbiota than those reporting low or no coffee intake, whereas the RA of Fusobacteria was significantly higher in smokers than in non-smokers (p = 0.009, FDR = 0.027). The RA of Fusobacteria was also significantly higher in those reporting daily consumption of bread (p = 0.005, FDR = 0.015). At the species level, the gut microbiota of people who consumed coffee was dominated by Bacteroides thetaiotaomicron, followed by Phascolarctobacterium faecium and Eubacterium rectale. Similarly, the gut microbiota of smokers was enriched in B. thetaiotaomicron and Lactobacillus amylovorus. Smoking cessation, bread consumption and coffee consumption induce changes in the intestinal microbial composition of Saudis. This indicates the significance of diet and lifestyle practices in determining the composition of the gut microbiota, which could later lead to changes in metabolic profile and weight.


All participants were asked to sign a written informed consent form after being informed about the purpose of the study and assured of the confidentiality of the data. They were then asked to fill out a questionnaire covering their socio-demographic information, medical history and lifestyle practices. In addition, a structured food frequency questionnaire (FFQ) was administered to evaluate their dietary practices, and weight and height were measured using standardized techniques. The questionnaire has been described, partially or fully, and used in other manuscripts [12-15]. Weight and height were used to calculate body mass index (BMI, in kg m−2), and the WHO criteria [16] were used to classify participants as underweight, normal, overweight or obese. Weight categories were defined according to BMI as follows: underweight 18-20 kg m−2, normal 20-25 kg m−2, overweight 25-30 kg m−2, and obese >30 kg m−2. Stool samples were collected under aseptic conditions in clean, dry screw-top containers and immediately stored at -20 °C.
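The BMI calculation and the weight categories listed above can be illustrated with a minimal sketch. The function names are hypothetical and are not part of the dataset. The handling of boundary values is an assumption, because the stated ranges share their end points: here each lower bound is inclusive, a BMI of exactly 30 is counted as overweight (since obese is given as >30), and values below 18 are left unclassified because no category is listed for them.

```python
from typing import Optional


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg m^-2 (weight divided by height squared)."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


def weight_category(bmi_value: float) -> Optional[str]:
    """Map a BMI value to the categories listed in the text.

    Boundary handling is an assumption (see the paragraph above):
    lower bounds are inclusive, 30 counts as overweight, and
    values below 18 return None.
    """
    if bmi_value < 18:
        return None  # no category is stated for this range
    if bmi_value < 20:
        return "underweight"
    if bmi_value < 25:
        return "normal"
    if bmi_value <= 30:
        return "overweight"
    return "obese"


if __name__ == "__main__":
    value = bmi(70.0, 1.75)
    print(round(value, 1), weight_category(value))
```

For example, a participant weighing 70 kg with a height of 1.75 m has a BMI of about 22.9 kg m−2 and falls in the normal category.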

Extraction of DNA from stool samples and 16S rRNA sequencing using MiSeq technology

DNA was extracted from all participants' stool samples using a deglycosylation protocol as follows: 250 µL of each sample was placed in a 2 mL tube containing a mixture of acid-washed glass beads (Sigma-Aldrich) and two or three 0.5 mm glass beads. Mechanical lysis was performed by bead-beating the mixture using a FastPrep BIO 101 apparatus (Qbiogene, Strasbourg, France) at maximum speed (6.5) for 3×30 seconds. The supernatant was centrifuged at 12,000 rpm for 10 min and the pellet retained. A mixture containing 2 µL of 10× glycoprotein denaturing buffer EndoHf (New England Biolabs) and 17 µL of H2O was added and heated at 100 °C for 10 minutes. Deglycosylation was performed by adding a mixture of 2 µL of 10× G5 reaction buffer (ref B1702, New England Biolabs), 2 µL of EndoHf (New England Biolabs), 2 µL of cellulase (Sigma) and 16 µL of H2O. The preparation was then incubated overnight at 37 °C. Finally, DNA was extracted using the NucleoSpin® Tissue Mini Kit (Macherey Nagel, Hoerdt, France) according to a previously described protocol [17]. The quantity, purity, integrity and size of the DNA and its amenability to PCR amplification were assessed. The concentration of each DNA extract was measured by a Qubit assay with the high sensitivity kit (Life Technologies, Carlsbad, CA, USA) according to the Nextera XT DNA sample prep kit (Illumina), and each metagenome was diluted to 1 ng aliquots for paired-end sequencing analysis. DNA extracts were dispensed into 10- to 20-μL single-use aliquots and frozen at -20 °C to avoid repeated freeze-thaw cycles prior to downstream analyses. Samples were then sequenced targeting the V3–V4 regions of the 16S rRNA gene using MiSeq technology as previously described [18, 19].

Data processing: Filtering the reads, dereplication and clustering

Paired-end fastq files were assembled using FLASH [20]. A total of 7,518,258 joined reads were filtered and then analyzed in QIIME, using ChimeraSlayer to remove chimeras and UCLUST [16, 20] to extract operational taxonomic units (OTUs), as described previously [18, 19]. All reads were clustered with a threshold of 97% identity to obtain OTUs. Extracted OTUs were blasted against release 123 of the SILVA SSU database [21], and an OTU was assigned to a species if it matched one with at least 97% identity, as previously described [22, 23]. Briefly, for each OTU, representative sequences were extracted and searched against the reference database. For each unique representative sequence, we extracted the best matches from the reference database and sorted them by decreasing percentage of similarity rounded to the nearest integer. We used the reference sequences with >97% similarity (or the highest available) for taxonomic assignment to species. When multiple matches with the same percentage of similarity were present, the taxonomy at each rank was obtained by consensus [16, 24]. OTUs not assigned to any species were considered "unidentified". As several OTUs matched identical species, the total number of identified species plus the number of unidentified OTUs was expected to be smaller than the total number of OTUs.
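The assignment rule described above (round the similarity, keep the top-scoring matches, take a rank-by-rank consensus, and name a species only at 97% or more) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used for the dataset: the function name `assign_species` and the input layout are hypothetical, "consensus" is interpreted as unanimous agreement among tied matches (the lineage is cut at the first rank where they disagree), and halves are rounded upward.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical input layout: (lineage from domain down to species, percent identity).
Hit = Tuple[Sequence[str], float]


def assign_species(hits: List[Hit], min_identity: int = 97) -> Tuple[List[str], str]:
    """Return (consensus lineage, species name or "unidentified") for one OTU."""
    if not hits:
        return [], "unidentified"

    # Round similarity to the nearest integer (halves rounded up: an assumption).
    rounded = [(tuple(lineage), math.floor(identity + 0.5))
               for lineage, identity in hits]

    # Keep only the matches that share the highest rounded similarity.
    best = max(identity for _, identity in rounded)
    tied = [lineage for lineage, identity in rounded if identity == best]

    # Rank-by-rank consensus: stop at the first rank where tied matches disagree.
    consensus: List[str] = []
    for names in zip(*tied):
        if len(set(names)) > 1:
            break
        consensus.append(names[0])

    reaches_species = len(consensus) == len(tied[0])
    if reaches_species and best >= min_identity:
        return consensus, consensus[-1]
    if reaches_species:
        # Best match is below the threshold: keep higher ranks, drop the species.
        consensus = consensus[:-1]
    return consensus, "unidentified"


if __name__ == "__main__":
    lineage = ("Bacteria", "Bacteroidetes", "Bacteroides",
               "Bacteroides thetaiotaomicron")
    print(assign_species([(lineage, 98.6)]))
```

In this sketch, two matches tied at the same rounded similarity but naming different species of one genus yield a genus-level lineage and the label "unidentified".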


King Abdulaziz City for Science and Technology, Award: AR-34-191