Data from: Artificial intelligence model for analyzing colonic endoscopy images to detect changes associated with irritable bowel syndrome

Cite this dataset

Mihara, Hiroshi (2022). Data from: Artificial intelligence model for analyzing colonic endoscopy images to detect changes associated with irritable bowel syndrome [Dataset]. Dryad.



Irritable bowel syndrome (IBS) is not considered to be an organic disease and usually shows no abnormality on lower gastrointestinal endoscopy, although biofilm formation, dysbiosis, and histological microinflammation have recently been reported in patients with IBS. In this study, we investigated whether an artificial intelligence (AI) colorectal image model can identify minute endoscopic changes associated with IBS that human investigators typically cannot detect.

Study subjects were identified based on electronic medical records and categorized as IBS (Group I; n=11), IBS with predominant constipation (IBS-C; Group C; n=12), and IBS with predominant diarrhea (IBS-D; Group D; n=12). The study subjects had no other diseases. Colonoscopy images were obtained from these IBS patients and from asymptomatic healthy subjects (Group N; n=88). Google Cloud Platform AutoML Vision (single-label classification) was used to construct AI image models, for which sensitivity, specificity, predictive values, and AUC were calculated. A total of 2,479, 382, 538, and 484 images were randomly selected for Groups N, I, C, and D, respectively.


The AUC of the model discriminating between Group N and I was 0.95. Sensitivity, specificity, positive predictive value, and negative predictive value of Group I detection were 30.8%, 97.6%, 66.7%, and 90.2%, respectively. The overall AUC of the model discriminating between Groups N, C, and D was 0.83; sensitivity, specificity, and positive predictive value of Group N were 87.5%, 46.2%, and 79.9%, respectively.
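The four per-class figures reported above all derive from a 2x2 confusion matrix. The following is a minimal sketch of that arithmetic, not the authors' code. The counts passed in are hypothetical: they were chosen only because they round to the percentages reported for Group I, and they are not taken from the dataset.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return sensitivity, specificity, PPV, and NPV as fractions.

    tp/fn: positive-class images classified correctly / missed
    fp/tn: negative-class images misclassified / classified correctly
    """
    return {
        "sensitivity": tp / (tp + fn),  # share of positives that were detected
        "specificity": tn / (tn + fp),  # share of negatives that were cleared
        "ppv": tp / (tp + fp),          # share of positive calls that were right
        "npv": tn / (tn + fn),          # share of negative calls that were right
    }


# Hypothetical counts (NOT from the dataset), picked to round to the
# percentages reported for Group I detection.
example = diagnostic_metrics(tp=4, fn=9, fp=2, tn=83)
for name, value in example.items():
    print(f"{name}: {value:.1%}")
```

With counts of this shape, the combination of low sensitivity and high negative predictive value follows from class imbalance: negatives greatly outnumber positives, so most negative calls are correct even when many positives are missed.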


Using the image AI model, colonoscopy images from patients with IBS could be discriminated from those of healthy subjects with an AUC of 0.95. Prospective studies are needed to further validate whether this externally validated model has similar diagnostic capabilities at other facilities and whether it can be used to determine treatment efficacy.


To reflect real-world patients with IBS, patients were identified not by the Rome criteria but by the disease names recorded for insurance purposes between January 2010 and December 2020. These names were “irritable bowel syndrome” (Group I), “constipated irritable bowel syndrome” (Group C), and “diarrhea irritable bowel syndrome” (Group D). Other diseases such as colorectal cancer, inflammatory bowel disease, and eosinophilic gastroenteritis were excluded based on symptoms and the results of histopathological examinations. However, cases with nonspecific inflammatory cell infiltrates that did not meet the diagnostic criteria and were being followed up under the respective insurance disease names were included in the relevant group. Symptomatic patients underwent colonoscopy as part of a workup for changes in bowel habits (e.g., diarrhea); asymptomatic patients, who comprised Group N, had undergone colonoscopy for colorectal cancer screening.

Colonoscopy images were obtained from the endoscopy reporting system. Images were taken by more than 10 trainees or specialists at a single institution with an Olympus CF-HQ290Z or PCF-H290Z colonoscope. The accuracy of the model was improved by rebuilding it multiple times after excluding normal-light images of the terminal ileum, rectal inversion, and anus, as well as narrow-band and dye-spread images. No biofilm was detected.

Only images with Boston Bowel Preparation Scale (BBPS) scores of 2 (i.e., minor amount of residual staining, small fragments of stool and/or opaque liquid, with well-visualized mucosa of colon segments) or 3 (entire mucosa of the colon segment well-visualized, with no residual staining, small fragments of stool, or opaque liquid) were used. A total of 20 to 40 images were used per patient, with about 5 images from each segment (cecum; ascending, transverse, descending, and sigmoid colon; and rectum).

Groups N, I, C, and D comprised 88, 11, 12, and 12 patients, contributing 2,479, 382, 538, and 484 images, respectively. The accuracy of the model increases with the number of patients, but a minimum of 100 images afforded a certain degree of accuracy. Thus, the number of patients and images used was considered sufficient to construct this model.
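The per-patient image selection described above can be sketched as a filter followed by per-segment sampling. This is an illustration under stated assumptions, not the procedure actually used: the record keys (`path`, `segment`, `bbps`, `mode`), the segment labels, and the function name `select_images` are all hypothetical, and the real selection may have been done by hand.

```python
import random
from collections import defaultdict

# The six segments named in the text. Images labelled with anything else
# (e.g. terminal ileum, rectal inversion, anus) are dropped by the filter.
SEGMENTS = ("cecum", "ascending", "transverse", "descending", "sigmoid", "rectum")


def select_images(records, per_segment=5, seed=0):
    """Pick up to `per_segment` eligible images per segment for one patient.

    Each record is a dict with hypothetical keys: 'path', 'segment',
    'bbps' (0-3), and 'mode' ('white_light', 'nbi', or 'dye').
    """
    rng = random.Random(seed)
    by_segment = defaultdict(list)
    for rec in records:
        eligible = (
            rec["segment"] in SEGMENTS
            and rec["mode"] == "white_light"  # no narrow-band or dye-spread images
            and rec["bbps"] >= 2              # BBPS score of 2 or 3 only
        )
        if eligible:
            by_segment[rec["segment"]].append(rec)

    selected = []
    for segment in SEGMENTS:
        candidates = by_segment[segment]
        selected.extend(rng.sample(candidates, min(per_segment, len(candidates))))
    return selected


# Synthetic records for one patient: 7 eligible images per segment,
# plus three images that the filter should reject.
demo = [
    {"path": f"{seg}_{i}.jpg", "segment": seg, "bbps": 2 + i % 2, "mode": "white_light"}
    for seg in SEGMENTS
    for i in range(7)
]
demo.append({"path": "ileum_0.jpg", "segment": "terminal_ileum", "bbps": 3, "mode": "white_light"})
demo.append({"path": "sigmoid_nbi.jpg", "segment": "sigmoid", "bbps": 3, "mode": "nbi"})
demo.append({"path": "cecum_dirty.jpg", "segment": "cecum", "bbps": 1, "mode": "white_light"})

chosen = select_images(demo)
print(len(chosen))  # 6 segments x 5 images = 30
```

Six segments at about 5 images each gives roughly 30 images per patient, which falls within the stated range of 20 to 40.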

In this study, annotation and algorithm generation were performed with Google Cloud AutoML Vision on the Google Cloud Platform (GCP) (Google, Inc.). Four labels, corresponding to Groups N, I, C, and D, were defined in the training dataset (single-label classification). Three models were produced: one differentiating Groups I and N, one differentiating Groups N, C, and D, and one differentiating Groups C and D. The entire process was carried out by a single physician (HM).
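The three models differ only in which group labels are included in the training data. The sketch below shows one way the label subsets could be prepared, assuming the import file is a CSV that lists a Cloud Storage URI followed by its single label. The bucket name, file names, model keys, and helper functions are hypothetical, and the exact import format should be checked against the current Google Cloud documentation.

```python
import csv
import io

# Label subsets for the three models described in the text.
MODELS = {
    "I_vs_N": ("I", "N"),
    "N_vs_C_vs_D": ("N", "C", "D"),
    "C_vs_D": ("C", "D"),
}


def import_rows(images, groups):
    """Return (uri, label) rows for the images whose group is in `groups`."""
    return [(uri, group) for uri, group in images if group in groups]


def to_csv(rows):
    """Serialize rows as CSV text, one 'uri,label' line per image."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical bucket and file names, one image per group for illustration.
demo_images = [
    ("gs://example-bucket/ibs/n1.jpg", "N"),
    ("gs://example-bucket/ibs/i1.jpg", "I"),
    ("gs://example-bucket/ibs/c1.jpg", "C"),
    ("gs://example-bucket/ibs/d1.jpg", "D"),
]

for name, groups in MODELS.items():
    print(name)
    print(to_csv(import_rows(demo_images, groups)))
```

Keeping one master list of labelled images and filtering it per model means each image carries the same label in every model that uses it.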

Usage notes

The file can be opened in any imaging software that can open JPG files.