Data from: Computational pathology to discriminate benign from malignant intraductal proliferations of the breast
Dong, Fei et al. (2015), Data from: Computational pathology to discriminate benign from malignant intraductal proliferations of the breast, Dryad, Dataset, https://doi.org/10.5061/dryad.pv85m
The categorization of intraductal proliferative lesions of the breast based on routine light microscopic examination of histopathologic sections is in many cases challenging, even for experienced pathologists. The development of computational tools to aid pathologists in the characterization of these lesions would have great diagnostic and clinical value. As a first step to address this issue, we evaluated the ability of computational image analysis to accurately classify DCIS and UDH and to stratify nuclear grade within DCIS. Using 116 breast biopsies diagnosed as DCIS or UDH from the Massachusetts General Hospital (MGH), we developed a computational method to extract 392 features corresponding to the mean and standard deviation in nuclear size and shape, intensity, and texture across 8 color channels. We used L1-regularized logistic regression to build classification models to discriminate DCIS from UDH. The top-performing model contained 22 active features and achieved an AUC of 0.95 in cross-validation on the MGH data-set. We applied this model to an external validation set of 51 breast biopsies diagnosed as DCIS or UDH from the Beth Israel Deaconess Medical Center, and the model achieved an AUC of 0.86. The top-performing model contained active features from all color-spaces and from the three classes of features (morphology, intensity, and texture), suggesting the value of each for prediction. We built models to stratify grade within DCIS and obtained strong performance for stratifying low nuclear grade vs. high nuclear grade DCIS (AUC = 0.98 in cross-validation) with only moderate performance for discriminating low nuclear grade vs. intermediate nuclear grade and intermediate nuclear grade vs. high nuclear grade DCIS (AUC = 0.83 and 0.69, respectively). These data show that computational pathology models can robustly discriminate benign from malignant intraductal proliferative lesions of the breast and may aid pathologists in the diagnosis and classification of these lesions.