Dryad

RadCases evaluation results: Evaluating acute image ordering for real-world patient cases via language model alignment with radiological guidelines

Data files

Jul 23, 2025 version files: 21.16 MB

Abstract

Background: Diagnostic imaging studies are increasingly important in the management of acutely presenting patients. However, ordering appropriate imaging studies in the emergency department is a challenging task with a high degree of variability between healthcare providers. To address this issue, recent work has investigated whether generative AI and large language models can recommend diagnostic imaging studies in accordance with evidence-based medical guidelines. It remains difficult to ensure that the recommendations from these tools correctly align with medical guidelines, especially given the limited diagnostic information available in acute care settings.

Methods: In this study, we introduce a framework that uses language models to recommend imaging studies for patient cases in alignment with the American College of Radiology’s Appropriateness Criteria, a set of evidence-based guidelines. To power our experiments, we make available RadCases, a novel dataset of over 1500 annotated case summaries reflecting common patient presentations, and apply our framework to enable state-of-the-art language models to reason about appropriate imaging choices.

Results: Using our framework, state-of-the-art language models achieve an accuracy on par with clinicians in image ordering. Furthermore, we demonstrate that clinicians can use our language model-based pipeline as an intelligent assistant to support image ordering workflows and improve the accuracy of acute image ordering according to the American College of Radiology’s Appropriateness Criteria.

Conclusions: Our work demonstrates and validates a strategy to leverage AI-based software to improve trustworthy clinical decision making in alignment with expert evidence-based guidelines.