Dryad

The governance of health data in the AI era: A scoping review and computational topic model of the global research landscape

Data files

May 22, 2026 version files 16.84 KB


Abstract

Background: The integration of artificial intelligence (AI) into healthcare is critically dependent on vast quantities of patient data, igniting an urgent global debate on data ownership, privacy, and governance. While numerous perspectives exist, the empirical structure and evolution of this scholarly discourse remain uncharacterized. We aimed to systematically map the conceptual landscape of research on AI and health data governance to identify its core themes, temporal trends, and key focus areas.

Methods: We conducted a scoping review according to PRISMA guidelines, searching PubMed, Scopus, and Web of Science for peer-reviewed articles published between Jan 1, 2018, and May 31, 2025. We performed a descriptive analysis of publication trends. Using Latent Dirichlet Allocation (LDA), we applied computational topic modelling to the abstracts, which serve as concise summaries of each article's core contributions, to identify latent thematic structures. Topic trends were analyzed using linear regression.
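The trend step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each abstract has already been assigned a vector of LDA topic weights, that prevalence is the mean weight of a topic per publication year, and that the trend is an ordinary least-squares slope of prevalence on year. The function names (`yearly_prevalence`, `ols_trend`) and the toy values are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict
from math import sqrt


def yearly_prevalence(docs, topic):
    """Mean weight of one topic per publication year.

    docs: iterable of (year, topic_weights) pairs, one per abstract.
    topic: index of the topic of interest within topic_weights.
    """
    by_year = defaultdict(list)
    for year, weights in docs:
        by_year[year].append(weights[topic])
    return {year: sum(w) / len(w) for year, w in sorted(by_year.items())}


def ols_trend(xs, ys):
    """Ordinary least-squares fit of ys on xs.

    Returns (slope, intercept, t_statistic). The t statistic would be
    compared against a t distribution with n - 2 degrees of freedom to
    obtain a p-value; that lookup is omitted here.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = sqrt(residual_ss / (n - 2) / sxx)
    t_statistic = slope / std_err if std_err > 0 else float("inf")
    return slope, intercept, t_statistic


# Toy document-topic weights (illustrative only, two topics per abstract).
toy_docs = [
    (2018, [0.7, 0.3]),
    (2018, [0.5, 0.5]),
    (2019, [0.5, 0.5]),
    (2020, [0.3, 0.7]),
    (2020, [0.5, 0.5]),
    (2021, [0.2, 0.8]),
]

prevalence = yearly_prevalence(toy_docs, topic=1)
years = list(prevalence)
slope, intercept, t_stat = ols_trend(years, [prevalence[y] for y in years])
print(f"slope per year: {slope:.3f}, t statistic: {t_stat:.2f}")
```

A positive slope indicates that the topic accounts for a growing share of the literature over time. With only a handful of yearly points, as in a review of this size, the estimate is sensitive to individual years and should be read with that in mind.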

Findings: 43 articles met the inclusion criteria. The volume of publications has increased substantially since 2018. Our LDA analysis identified five distinct research topics: (1) AI Applications & Ownership, (2) AI Models & Data Privacy, (3) Data Sharing Platforms & Technology, (4) Ethical & Legal Concerns, and (5) AI Development & Implementation. Over the study period, research on "Ethical & Legal Concerns" showed a statistically significant increasing trend in prevalence (slope=0.023, p=0.008), becoming the dominant topic in recent years.

Interpretation: The scholarly discourse on AI and health data has matured, shifting from foundational questions of technical implementation towards a dominant focus on complex ethical and legal challenges. This data-driven evidence signals an urgent need for clinical leaders and policymakers to move beyond theoretical discussions and implement robust, practical governance frameworks. Failure to address this governance gap risks impeding trustworthy AI innovation and eroding public trust, thereby limiting the potential of AI to improve patient outcomes equitably.