Dryad

Machine learning feature data from EHR, labels, and estimates for next generation sequencing-based assay

Data files

Nov 28, 2024 version files: 17.83 MB

Abstract

Next-generation sequencing-based tests have advanced the field of medical diagnostics, but their novelty and cost can lead to uncertainty in clinical deployment. The Heme-STAMP is one such assay that tracks mutations in genes implicated in hematolymphoid neoplasms. Rather than limiting its clinical usage or imposing rule-based criteria, we propose leveraging machine learning to guide clinical decision-making on whether this test should be ordered. We trained a machine learning model to predict the outcome of Heme-STAMP testing using 3,472 orders placed at an academic medical center between May 2018 and September 2021, and demonstrated how to integrate a custom machine learning model into a live clinical environment to obtain real-time model and physician estimates. The model predicted the results of a complex next-generation sequencing test with discriminatory power comparable to that of expert hematologists (AUC 0.77 [0.66, 0.87] for the model and 0.78 [0.68, 0.86] for hematologists) and with the capacity to improve the calibration of human estimates.
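
For readers who want to recompute a discrimination metric like the one reported above from binary labels and probability estimates, the following is a minimal sketch of an AUC with a resampling-based interval. It makes several assumptions: the percentile bootstrap is used purely for illustration (the abstract does not state how the bracketed intervals were derived), the function names `auc` and `bootstrap_auc_interval` are hypothetical, and the toy labels and scores are made up rather than taken from the data files.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_interval(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Return (point estimate, lower, upper) using a percentile bootstrap.

    Assumption: percentile bootstrap over orders; not necessarily the
    method used to produce the intervals quoted in the abstract.
    """
    point = auc(labels, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return point, lower, upper


# Toy data for illustration only (not from the dataset).
toy_labels = [0, 0, 1, 1, 0, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.55, 0.60]
point, lower, upper = bootstrap_auc_interval(toy_labels, toy_scores)
print(f"AUC {point:.2f} [{lower:.2f}, {upper:.2f}]")
```

The same function can be applied separately to model estimates and physician estimates over the same set of labels, which is one way to compare the two sources of predictions side by side.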