Effectiveness of screening and ultra-brief intervention for hazardous drinking in primary care: pragmatic cluster randomised controlled trial

So, Ryuhei; EASY study group

Published Sep 26, 2025 on Dryad. https://doi.org/10.5061/dryad.866t1g22m

Abstract

Context

Hazardous drinking affects about one in five primary‑care patients. The EASY study compared a ≤1‑minute ultra‑brief intervention (Ultra‑BI) with simplified assessment only (SAO) across 40 primary care clinics in Japan.

Objective

To provide open access to the datasets and SAS/R scripts needed to reproduce the published analyses and to support future studies, including individual participant data meta-analyses.

Datasets and scripts description

RawData.csv – unprocessed participant-level data at baseline, 12 weeks, and 24 weeks
Baseline.xlsx – unprocessed participant-level data at baseline for SAS
data_EASY_cluster_RCT_preprocessed.csv / .RData – processed versions of RawData.csv
Scr_Select.csv – flag indicating participants recruited at clinics that restricted screening to patients suspected of hazardous drinking
Six analysis scripts and one CONSORT-diagram script

Key variables

Participant ID, clinic ID, allocation group, AUDIT-C score (0–12), total alcohol consumption (grams per 4 weeks), WHO drinking risk level (DRL), readiness-to-change score, sex, five-year age band, comorbidities, smoking status, visit type, and participant-reported recognition of receiving advice.

Reuse potential

The dataset and scripts enable full replication of the published analyses and support pooled effect estimation in individual participant data meta-analyses.

Ethical consideration

Public data sharing was conducted under IRB-approved written informed consent and public disclosure, with participants given the opportunity to decline data sharing. All data were de-identified prior to release, including the removal of direct identifiers and the aggregation or masking of potentially re-identifiable information.

Dataset DOI: 10.5061/dryad.866t1g22m

Dataset abstract

This dataset contains de-identified, participant-level records from 1,136 adults (20–74 years) screened for hazardous drinking across 40 primary-care clinics in Okayama, Hyogo, Osaka, and Hiroshima prefectures, Japan, between 29 June and 7 August 2023. Data are provided as rectangular tables in CSV, XLSX, and RData formats capturing baseline demographics, medical histories, readiness-to-change scales (1–4 ordinal items), AUDIT-C sub-scores (0–4) and total scores (0–12), and ethanol consumption amounts expressed in grams per 4-week period at baseline, 12 weeks, and 24 weeks. The deposit also includes a preprocessed analysis-ready table with derived indicators (e.g., per-protocol flag, intervention receipt percentage, patient-reported advice) and R scripts/SAS programs used to reproduce the trial tables.

The files support reuse for evaluating alcohol screening and ultra-brief interventions, examining behavioural readiness trajectories, benchmarking cluster-randomised trial analyses, and developing replication or secondary analyses. All direct identifiers have been removed, participant and facility IDs are pseudonymised, and age is supplied as bands or integers with no dates, aligning with Dryad's human-subject data guidelines and the informed-consent provisions for public data sharing. Users should note that missing values appear as blank cells (import as NA in R) and that binary indicators consistently use 0 = No / 1 = Yes unless otherwise specified.

Description of the data and file structure

We collected these data during a two-arm cluster randomised controlled trial that evaluated an ultra-brief alcohol intervention (Ultra-BI) versus simplified assessment only (SAO) in routine primary-care settings. Forty outpatient clinics from urban, suburban, and rural areas of Okayama, Hyogo, Osaka, and Hiroshima prefectures were invited and randomly allocated (block design, computer-generated sequence) before recruitment began. Between 29 June and 7 August 2023, reception staff consecutively screened all attending patients aged 20–74 years for basic eligibility, obtained written informed consent, and administered a baseline questionnaire that included the AUDIT-C. Patients meeting hazardous-drinking thresholds received the Ultra-BI immediately or usual care, according to cluster assignment. Follow-up questionnaires capturing alcohol consumption and readiness to change lifestyle behaviours were distributed at 12 and 24 weeks by post (paper or QR-linked web form) with SMS reminders; non-responders were contacted by an independent survey company, which also double-entered and validated all screening and follow-up data.

Dataset descriptions

Unless noted otherwise, missing values in the CSV files are encoded as . (period). When importing to analysis software, treat . as NA and cast numeric fields accordingly. Blank cells in the XLSX workbook represent missing data. All participant and clinic identifiers are pseudonymised and consistent across files where the column exists.
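The import rule above can be sketched in Python using only the standard library. The sample rows are invented for illustration and are not taken from the deposit, and the helper names `parse_cell` and `read_rows` are hypothetical. Blank cells are also mapped to missing here, which is an assumption made so the same helper works for data exported from the XLSX workbook.

```python
import csv
import io

# Invented sample in the layout of RawData.csv (not real trial data).
SAMPLE = """id,w0_AUDIT2,w0_DrinkingAmountPer4weeks
1,2,560
2,.,.
"""

def parse_cell(value):
    """Return None for the missing marker '.' (or a blank cell), else a float."""
    value = value.strip()
    if value in (".", ""):
        return None
    return float(value)

def read_rows(text):
    """Parse CSV text into dicts, casting every field to a number or None."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: parse_cell(val) for key, val in row.items()} for row in reader]

rows = read_rows(SAMPLE)
print(rows)
```

This sketch casts every column to a number, so text columns such as w0_VisitHistory would need to be handled separately.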

RawData.csv (1136 rows × 22 columns)

Participant-level baseline and follow-up responses exported from the trial database in the original coding used for monitoring.

id — integer pseudonymous participant identifier.
w0_PerProtocol — string; '1' marks participants meeting the pre-specified per-protocol criteria (1130 rows), '.' indicates not assessed/not in the per-protocol set (3 rows).
w0_Sex — integer; 1 = Male, 2 = Female.
w0_VisitHistory — categorical text; First visit, Routine appointments, or Visit as needed; '.' = not recorded.
w0_ConsideringDietChange — categorical text with readiness options No Improvement Needed, No intention to improve, Intending to improve, Already working on improvement; '.' = missing.
w0_ConsideringSmokingChange — categorical text with options Never smoked, Smoked but quit, Intending to improve, No intention to improve; '.' = missing.
w0_AUDIT1 — integer 1–4; Alcohol Use Disorders Identification Test (AUDIT-C) item 1 (drinking frequency).
w0_AUDIT2 — string digits 0–4; AUDIT-C item 2 (usual quantity). Convert to integer after treating '.' as missing.
w0_AUDIT3 — string digits 0–4; AUDIT-C item 3 (frequency of heavy drinking). '.' = missing.
w0_ConsideringDrinkingChange — ordinal code 1–5: 1 = No Improvement Needed, 2 = No intention to improve, 3 = Interested but no intention to improve, 4 = Intending to improve, 5 = Already working on improvement; '.' = missing.
w0_Allocation — integer cluster assignment (0 = Simplified assessment only (SAO), 1 = Ultra-brief intervention (Ultra-BI)).
FacilityID — integer string 1–40; pseudonymised clinic identifier.
w0_DrinkingAmountPer4weeks — numeric stored as string; ethanol consumption in grams per 4-week period at baseline ('.' = missing).
w0_SmokingStatus — integer string; 1 = Never smoked, 2 = Smoked but quit, 3 = Current smoker, '.' = missing.
w12_ConsideringDietChange — same categories as w0_ConsideringDietChange; '.' = missing.
w12_ConsideringSmokingChange — same categories as w0_ConsideringSmokingChange; '.' = missing.
w12_ConsideringDrinkingChange — ordinal code 1 = No Improvement Needed, 2 = No intention to improve, 3 = Intending to improve, 4 = Already working on improvement; '.' = missing.
w12_DrinkingAmountPer4weeks — numeric stored as string; grams of ethanol per 4 weeks at 12-week follow-up ('.' = missing, '0' indicates abstinent).
w24_ConsideringDietChange — same categories as w0_ConsideringDietChange; '.' = missing.
w24_ConsideringSmokingChange — same categories as w0_ConsideringSmokingChange; '.' = missing.
w24_ConsideringDrinkingChange — ordinal code with the same mapping as at 12 weeks (1 = No Improvement Needed … 4 = Already working on improvement); '.' = missing.
w24_DrinkingAmountPer4weeks — numeric stored as string; grams of ethanol per 4 weeks at 24-week follow-up ('.' = missing).
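As a minimal sketch of how the raw codes above might be recoded, the snippet below sums the three AUDIT-C items into a total score (0–12) and maps the readiness codes to their labels. The labels are copied from the variable descriptions. The function names are hypothetical, and returning a missing total whenever any item is missing is an assumption, not necessarily the rule used in the authors' scripts. Note that code 3 has a different meaning at baseline (five categories) than at follow-up (four categories).

```python
# Labels copied from the variable descriptions in this README.
BASELINE_READINESS = {
    1: "No Improvement Needed",
    2: "No intention to improve",
    3: "Interested but no intention to improve",
    4: "Intending to improve",
    5: "Already working on improvement",
}
FOLLOWUP_READINESS = {
    1: "No Improvement Needed",
    2: "No intention to improve",
    3: "Intending to improve",
    4: "Already working on improvement",
}

def to_int(value):
    """Convert a raw cell to int; '.' or a blank becomes None."""
    text = str(value).strip()
    return None if text in (".", "") else int(text)

def audit_c_total(item1, item2, item3):
    """Sum the three AUDIT-C items; None if any item is missing (assumption)."""
    items = [to_int(item1), to_int(item2), to_int(item3)]
    if any(item is None for item in items):
        return None
    return sum(items)

def readiness_label(code, followup=False):
    """Map a drinking-change readiness code to its label for the given wave."""
    mapping = FOLLOWUP_READINESS if followup else BASELINE_READINESS
    code = to_int(code)
    return None if code is None else mapping[code]

print(audit_c_total(3, "2", "1"))
print(readiness_label("3"), "|", readiness_label("3", followup=True))
```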

data_EASY_cluster_RCT_preprocessed.csv (1133 rows × 24 columns)

Analysis-ready participant-level dataset restricted to the per-protocol set and enhanced with derived variables. Values are cleaned and typed for immediate use.

w0_PerProtocol — numeric; 1 for the 1130 participants in the per-protocol set, NA for the 3 participants excluded from that set.
w0_Sex — categorical text; Male or Female.
w0_Age — character string five-year band (20–29, 30–39, …, 70–74).
w0_VisitHistory — categorical text; First visit, Routine appointments, Visit as needed, or missing (NA).
w0_PH_Hypertension — boolean; history of hypertension (TRUE/FALSE).
w0_PH_Diabetes — boolean; history of diabetes.
w0_PH_Gout — boolean; history of gout.
w0_PH_Dyslipidemia — boolean; history of dyslipidaemia.
w0_PH_LiverDisease — boolean; history of liver disease.
w0_PH_DigestiveDisease — boolean; history of digestive disease.
w0_ConsideringDrinkingChange — categorical text; No Improvement Needed, No intention to improve, Interested but no intention to improve, Intending to improve, Already working on improvement.
w0_Allocation — numeric; 0 = SAO, 1 = Ultra-BI.
FacilityID — character string 1–40; pseudonymised clinic ID (matching RawData.csv).
w0_AgeCont — numeric; exact age in years (includes .5 where ages were rounded to half-years).
w0_AUDIT_c — numeric; AUDIT-C total score (0–12).
w0_DrinkingAmountPer4weeks — numeric; grams of ethanol per 4 weeks at baseline.
w0_SmokingStatus — ordered factor; Never smoked, Smoked but quit, Smoking.
invited — categorical text describing clinic invitation coverage; values are Invited <50% of eligible patients, Invited about 60% of eligible patients, Invited about 70% of eligible patients, Invited about 80% of eligible patients, Invited about 90% of eligible patients, Invited ~100% of eligible patients, or Selected only patients likely to drink heavily.
w12_ConsideringDrinkingChange — categorical text; No Improvement Needed, No intention to improve, Intending to improve, Already working on improvement.
w12_DrinkingAmountPer4weeks — numeric; grams of ethanol per 4 weeks at 12-week follow-up.
w24_ConsideringDrinkingChange — categorical text; same scale as at 12 weeks.
w24_DrinkingAmountPer4weeks — numeric; grams of ethanol per 4 weeks at 24-week follow-up.
Received_Percentage — numeric; within-clinic percentage (0–100) of participants who reported receiving the intervention (calculated for intervention clinics only; missing for control clinics).
Received_patient_report — boolean; TRUE if the participant reported receiving counselling/advice, otherwise FALSE or missing.
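The derived variable Received_Percentage could be computed along the following lines. This is a sketch under stated assumptions rather than the authors' preprocessing code: the function name is hypothetical, the sample rows are invented, and excluding participants with a missing report from the denominator is an assumption.

```python
from collections import defaultdict

def received_percentage(rows):
    """Map FacilityID to the % of participants reporting advice.

    Only intervention clinics (w0_Allocation == 1) are included; rows with a
    missing report are left out of the denominator (assumption).
    """
    counts = defaultdict(lambda: [0, 0])  # clinic -> [reported, answered]
    for row in rows:
        if row["w0_Allocation"] != 1 or row["Received_patient_report"] is None:
            continue
        tally = counts[row["FacilityID"]]
        tally[0] += 1 if row["Received_patient_report"] else 0
        tally[1] += 1
    return {clinic: 100.0 * yes / n for clinic, (yes, n) in counts.items()}

# Invented rows for illustration only.
rows = [
    {"FacilityID": "1", "w0_Allocation": 1, "Received_patient_report": True},
    {"FacilityID": "1", "w0_Allocation": 1, "Received_patient_report": False},
    {"FacilityID": "1", "w0_Allocation": 1, "Received_patient_report": None},
    {"FacilityID": "2", "w0_Allocation": 0, "Received_patient_report": False},
]
percentages = received_percentage(rows)
print(percentages)
```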

data_EASY_cluster_RCT_preprocessed.RData

R binary file containing a tibble named df with the same 1133 rows and 24 columns as data_EASY_cluster_RCT_preprocessed.csv. Variable types follow the descriptors above.

Baseline.xlsx (1136 rows × 18 columns)

Excel workbook with baseline questionnaire data in human-readable labels (all categorical responses are now supplied in English). String categories correspond to the coded variables in RawData.csv.

id — pseudonymous participant identifier (matches RawData.csv).
Sex — Male / Female.
Age — decade band (20's, 30's, …, 70's).
VisitHistory — clinic attendance description: First visit, Routine appointments, or Visit as needed.
PH_Hypertension — logical; history of hypertension (TRUE/FALSE).
PH_Diabetes — logical; history of diabetes.
PH_Gout — logical; history of gout.
PH_Dyslipidemia — logical; history of dyslipidaemia.
PH_LiverDisease — logical; history of liver disease.
PH_DigestiveDisease — logical; history of digestive disease.
ConsideringDietChange — readiness statements: No Improvement Needed, No intention to improve, Intending to improve, Already working on improvement.
ConsideringSmokingChange — readiness statements: Never smoked, Smoked but quit, No intention to improve, Intending to improve.
AUDIT1 — integer 0–4; AUDIT-C item 1.
AUDIT2 — integer 0–4; AUDIT-C item 2.
AUDIT3 — integer 0–4; AUDIT-C item 3.
ConsideringDrinkingChange — readiness statements aligned with the numeric codes in RawData.csv (No Improvement Needed, No intention to improve, Interested but no intention to improve, Intending to improve, Already working on improvement).
Allocation — integer (0 = SAO, 1 = Ultra-BI).
FacilityID — integer 1–40; pseudonymised clinic identifier.

Scr_Select.csv (1133 rows × 2 columns)

Link table flagging clinics that did not invite every eligible patient.

id — pseudonymous participant identifier (matches RawData.csv and Baseline.xlsx).
w0_biased_invitation — boolean; TRUE when the clinic reported selectively screening only patients suspected of hazardous drinking, FALSE for universal screening invitation.
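Because Scr_Select.csv is keyed by id, the flag can be attached to the participant-level data with a left join. The sketch below uses invented sample text and assumes the boolean is stored as the text TRUE/FALSE; the encoding in the actual file should be checked before use. Since Scr_Select.csv has 1133 rows and RawData.csv has 1136, some ids presumably have no flag, so unmatched ids receive None.

```python
import csv
import io

# Invented sample text (not real trial data).
RAW = "id,FacilityID\n1,7\n2,7\n3,12\n"
FLAGS = "id,w0_biased_invitation\n1,TRUE\n2,FALSE\n"

def load(text):
    """Read CSV text into a list of dicts (all values stay as strings)."""
    return list(csv.DictReader(io.StringIO(text)))

def attach_flag(raw_rows, flag_rows):
    """Left-join the screening flag onto the raw rows by id."""
    flags = {
        row["id"]: row["w0_biased_invitation"].strip().upper() == "TRUE"
        for row in flag_rows
    }
    # dict.get returns None for ids that are absent from the flag table.
    return [{**row, "w0_biased_invitation": flags.get(row["id"])} for row in raw_rows]

merged = attach_flag(load(RAW), load(FLAGS))
print(merged)
```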

Abbreviations

AUDIT-C: Alcohol Use Disorders Identification Test, consumption items
WHO DRL: WHO Drinking-Risk Level (Low / Medium / High / Very high)
Ultra-BI: ultra-brief intervention
SAO: Simplified assessment only
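For orientation, the sketch below assigns a WHO drinking risk level from the 4-week ethanol totals in this deposit. It relies on several assumptions that are not stated in this README: the commonly cited WHO cut-offs in grams of pure alcohol per day (men: up to 40 Low, up to 60 Medium, up to 100 High, above 100 Very high; women: up to 20, 40, and 60 respectively), a conversion to daily intake by dividing by 28, and a separate "Abstinent" label for zero consumption. The exact boundaries and abstinence handling in table3.r may differ.

```python
# Upper bounds in g/day for Low, Medium, High; anything above is Very high.
# Thresholds are the commonly cited WHO categories (assumption, see lead-in).
UPPER_BOUNDS = {"Male": (40, 60, 100), "Female": (20, 40, 60)}
LEVELS = ("Low", "Medium", "High", "Very high")

def who_drl(grams_per_4_weeks, sex):
    """Classify a 4-week ethanol total into a WHO drinking risk level."""
    if grams_per_4_weeks is None:
        return None
    per_day = grams_per_4_weeks / 28  # 4 weeks = 28 days
    if per_day <= 0:
        return "Abstinent"  # hypothetical label, not one of the four levels
    for bound, level in zip(UPPER_BOUNDS[sex], LEVELS):
        if per_day <= bound:
            return level
    return LEVELS[-1]

print(who_drl(1120, "Male"), who_drl(1120, "Female"))
```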

Code/software

table1.r

This script aggregates data for creating Table 1 (Baseline characteristics per participant by allocation group) of the paper.

stable1.r

This script aggregates data for creating sTable 1 (Baseline characteristics per cluster by allocation group) of the paper.

table2_and_4.sas

This script estimates the Local Average Treatment Effect (LATE) for creating Table 2 of the paper.

table2_sensitivity_analyses.r

This script estimates the Local Average Treatment Effect (LATE) in the sensitivity analyses for Table 2 of the paper.
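A LATE is often illustrated with the Wald ratio: the intention-to-treat difference in the outcome divided by the between-arm difference in the proportion who actually received the intervention. The Python sketch below shows that ratio only. It ignores clustering and covariates, uses invented numbers, and uses hypothetical field names (z for allocation, d for receipt, y for outcome), so it should not be read as the estimator implemented in the SAS or R scripts.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def wald_late(rows):
    """Wald ratio: (ITT effect on outcome) / (ITT effect on receipt)."""
    arm = {z: [row for row in rows if row["z"] == z] for z in (0, 1)}
    itt_outcome = mean([r["y"] for r in arm[1]]) - mean([r["y"] for r in arm[0]])
    itt_receipt = mean([r["d"] for r in arm[1]]) - mean([r["d"] for r in arm[0]])
    if itt_receipt == 0:
        raise ValueError("Receipt does not differ between arms; LATE is undefined.")
    return itt_outcome / itt_receipt

# Invented data: half of the intervention arm reports receiving advice.
rows = [
    {"z": 1, "d": 1, "y": 400}, {"z": 1, "d": 1, "y": 500},
    {"z": 1, "d": 0, "y": 600}, {"z": 1, "d": 0, "y": 500},
    {"z": 0, "d": 0, "y": 600}, {"z": 0, "d": 0, "y": 600},
    {"z": 0, "d": 0, "y": 600}, {"z": 0, "d": 0, "y": 600},
]
late = wald_late(rows)
print(late)
```

In this toy example the ITT difference is 500 - 600 = -100 g and receipt differs by 0.5, so the ratio is -200 g per 4 weeks among those who received the advice.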

calculate_effect_sizes.r

This script calculates Hedges’ g for the outcome variables presented in Tables 2 and 4.
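For readers unfamiliar with the measure, Hedges’ g is the standardised mean difference (difference in means divided by the pooled standard deviation) multiplied by a small-sample correction, commonly approximated as 1 - 3/(4·df - 1) with df = n1 + n2 - 2. The sketch below implements that textbook formula from summary statistics. The function name and inputs are hypothetical, and calculate_effect_sizes.r may use a different variant (for example, one that accounts for clustering).

```python
import math

def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Hedges' g from group means, standard deviations, and sample sizes."""
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    cohens_d = (mean1 - mean2) / pooled_sd
    correction = 1 - 3 / (4 * df - 1)  # common small-sample approximation
    return cohens_d * correction

# Invented summary statistics for illustration.
g = hedges_g(10, 2, 10, 8, 2, 10)
print(round(g, 4))
```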

table3.r

This script aggregates data for creating Table 3 (Proportion of WHO drinking risk level at follow-ups) of the paper.

table5.r

This script aggregates data for creating Table 5 (Proportion of readiness to change drinking habits by category) of the paper.
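A table of this kind boils down to category proportions within each allocation group. The sketch below shows one way to compute them in Python. The function name and sample rows are hypothetical, and dropping missing responses from the denominator is an assumption that may not match table5.r.

```python
from collections import Counter

def category_proportions(rows, group_key, category_key):
    """Return {(group, category): proportion within group}, skipping missing."""
    counts = Counter()
    totals = Counter()
    for row in rows:
        category = row[category_key]
        if category is None:
            continue  # missing responses are excluded (assumption)
        counts[(row[group_key], category)] += 1
        totals[row[group_key]] += 1
    return {key: n / totals[key[0]] for key, n in counts.items()}

# Invented rows for illustration only.
rows = [
    {"w0_Allocation": 0, "w12_ConsideringDrinkingChange": "No intention to improve"},
    {"w0_Allocation": 0, "w12_ConsideringDrinkingChange": "No intention to improve"},
    {"w0_Allocation": 0, "w12_ConsideringDrinkingChange": "Intending to improve"},
    {"w0_Allocation": 0, "w12_ConsideringDrinkingChange": None},
    {"w0_Allocation": 1, "w12_ConsideringDrinkingChange": "Intending to improve"},
]
props = category_proportions(rows, "w0_Allocation", "w12_ConsideringDrinkingChange")
print(props)
```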

CONSORT_flowchart.r

This script aggregates data for creating Figure 2 (CONSORT flow diagram of clusters and patients through the trial) of the paper.

Access information

Other publicly accessible locations of the data:

Data was derived from the following sources:
