The dataset "Quantifying representativeness in RCTs using ML fairness metrics - Data and codes" supports quantifying representativeness in randomized clinical trials (RCTs) and provides insights for improving clinical trial equity and health equity. We developed RCT representativeness metrics based on machine learning (ML) fairness research. Visualizations and statistical tests based on the proposed metrics enable researchers and physicians to rapidly inspect and assess subgroup representation in RCTs. The approach enables users to identify underrepresentation, absence, or other misrepresentation of subgroups, each of which indicates potential limitations of an RCT. The method could help support generalizability evaluation of existing RCT cohorts, enrollment target decisions for new RCTs (if eligibility criteria are included), and monitoring of RCT enrollment, ultimately contributing to more equitable public health outcomes. We apply the proposed RCT representativeness metrics to three landmark clinical trials released in the last decade: Action to Control Cardiovascular Risk in Diabetes (ACCORD), Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial (ALLHAT), and Systolic Blood Pressure Intervention Trial (SPRINT). This dataset contains the processed data, the experimental results, and the visualization code for the paper titled "Quantifying representativeness in randomized clinical trials using machine learning fairness metrics."
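To make the idea of a representativeness metric concrete, the following is a minimal sketch, not necessarily the metric defined in the paper. It assumes a log odds ratio that compares a subgroup's share of the trial sample with its share of the target population. The function name `log_odds_ratio` and all numbers are hypothetical and chosen only for illustration.

```python
import math


def log_odds_ratio(sample_rate: float, target_rate: float) -> float:
    """Log odds ratio of a subgroup's share in the trial vs. the target population.

    Illustrative assumption only: the paper's own metric definitions may differ.
    """
    if not 0.0 < target_rate < 1.0:
        raise ValueError("target_rate must be strictly between 0 and 1")
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    if sample_rate == 0.0:
        return -math.inf  # subgroup is absent from the trial
    if sample_rate == 1.0:
        return math.inf   # trial consists only of this subgroup
    sample_odds = sample_rate / (1.0 - sample_rate)
    target_odds = target_rate / (1.0 - target_rate)
    return math.log(sample_odds / target_odds)


# Hypothetical numbers: 300 of 1000 trial participants belong to a subgroup
# that makes up half of the target population.
trial_share = 300 / 1000
population_share = 0.5
score = log_odds_ratio(trial_share, population_share)
print(f"log odds ratio: {score:.3f}")
```

Under this sketch, a value near zero suggests the subgroup is represented in proportion to the target population, a negative value suggests underrepresentation, a positive value suggests overrepresentation, and negative infinity flags a subgroup that is absent from the trial.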

The raw NHANES and RCT datasets are downloaded directly from their respective websites.
The target population summary data are processed following the "NHANES Survey Methods and Analytic Guidelines", using the R "haven" and "survey" packages.
The RCT sample summary data are calculated with the R "count()" function.
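The two kinds of summary data described above can be sketched as follows. This is an illustration in Python rather than the R workflow actually used, and it only reproduces point estimates: unweighted subgroup counts for a trial sample and survey-weighted subgroup shares for a target population. The records, the field names `sex` and `weight`, and both helper functions are hypothetical. The sketch does not account for the complex survey design (strata and sampling units) that the R "survey" package handles when estimating variances.

```python
from collections import Counter, defaultdict


def sample_counts(records, key):
    """Count trial participants per subgroup (unweighted)."""
    return Counter(record[key] for record in records)


def weighted_shares(records, key, weight_key):
    """Estimate each subgroup's population share from survey weights."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record[weight_key]
    grand_total = sum(totals.values())
    if grand_total <= 0:
        raise ValueError("survey weights must sum to a positive number")
    return {group: weight / grand_total for group, weight in totals.items()}


# Hypothetical trial participants (one record per participant).
trial = [{"sex": "F"}, {"sex": "M"}, {"sex": "M"}, {"sex": "M"}]

# Hypothetical survey respondents, each carrying a sampling weight.
survey = [
    {"sex": "F", "weight": 1500.0},
    {"sex": "F", "weight": 500.0},
    {"sex": "M", "weight": 2000.0},
]

counts = sample_counts(trial, "sex")
shares = weighted_shares(survey, "sex", "weight")
print(dict(counts))
print(shares)
```

Dividing each count by the trial size gives the observed subgroup rate, which can then be compared with the corresponding weighted population share.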

All information is provided in the file MIAOQI_QuantifyingRepresentativenessInRCTsUsingMLFairnessMetricsDataAndCodes_Readme.txt.
