Data from: The characteristics of patients’ medical care, living will, and signs of death by age and the place of death: A cross-sectional study using a questionnaire survey targeting physicians with expertise in end-of-life care

Yakabe, Mitsutaka¹; Hosoi, Tatsuya¹; Matsumoto, Shoya¹; Miura, Shoho¹; Hoshi, Kazuhiro¹; Iwao, Soichiro²; Kitamura, Yoshihiro²,³; Akishita, Masahiro¹,⁴; Ogawa, Sumito¹

Published Feb 25, 2026 on Dryad. https://doi.org/10.5061/dryad.5mkkwh7jp

Data files

Feb 25, 2026 version files 133.16 KB

• code.R (13 KB)
• data_dictionary.csv (10.37 KB)
• Data.csv (25.44 KB)
• questionnaire.pdf (76.42 KB)
• README.md (7.94 KB)

Abstract

This dataset contains the complete, de-identified, participant-level data (n = 440) used for the analyses in the associated article. Records were obtained via a structured questionnaire survey and include age at death (years), sex, place of death, medical treatments received near the end of life, presence of a living will, and end-of-life signs observed by respondents. Data are provided as a tidy CSV (one row per patient) with an accompanying data dictionary and README describing variable definitions. The dataset enables replication of the published analyses and supports secondary reuse such as prevalence estimation, development of prognostic models for dying trajectories, and cross-setting comparisons. Direct identifiers have been removed, and quasi-identifiers have been generalized to reduce re-identification risk; only the minimally necessary data are included. Ethical considerations and consent procedures are described in the associated article; the Dryad deposit contains de-identified data suitable for public sharing.

Dataset DOI: 10.5061/dryad.5mkkwh7jp

Description of the data and file structure

These data were collected to characterize how demographic factors, place of death, and medical care relate to observed signs of dying. We conducted a cross-sectional questionnaire survey of physicians who had recently provided end-of-life care, asking each respondent to report on one decedent they had cared for within the past five years. The instrument captured age at death, sex, care setting/place of death, treatments and supportive or palliative measures provided, presence of a living will, and specific end-of-life signs observed in the final days or hours. Responses were entered via a secure web form and compiled into a tidy, participant-level dataset suitable for replication and secondary analyses.

To reduce re-identification risk while preserving analytical value, the public dataset has been de-identified as follows: direct/pseudo-identifiers were removed, age was grouped into ranges, and selected categorical variables were numerically encoded. In addition, timing variables were collapsed into broader ordered categories and stored as ordinal codes. Consistent with de-identification practice, the code keys for selected sensitive variables are not provided in the public README or dataset documentation.

Files in this deposit

• Data.csv — De-identified participant-level dataset (UTF-8 CSV). The public version contains de-identification-safe encodings (e.g., grouped age ranges and numeric codes for selected categorical variables). Missing values are represented as empty cells.

• data_dictionary.csv — Machine-readable data dictionary aligned with Data.csv (variable names, labels, data types, and notes). For selected sensitive variables, code meanings are intentionally not disclosed in the public documentation to reduce re-identification risk.

• questionnaire.pdf — Full text of the self-developed questionnaire used in the study (item wording and response options; English translation if applicable).

• code.R — Annotated R script that reads Data.csv and reproduces the main analyses/tables described in the manuscript (e.g., logistic regression, multiplicity control via FDR and Bonferroni). Uses base R functions; no proprietary software is required.

Variables (Data.csv)

Record identifier
• ID is not included in the public dataset (removed during de-identification).

Demographic and care-related variables
• age: Age at death, grouped into ranges (e.g., 10-year bands, with a top-coded upper category). Type: categorical/ordinal string.
• sex: Biological sex. Public file stores de-identified numeric code(s). Type: categorical (integer code).
• place: Place of death (care setting). Public file stores de-identified numeric code(s). Type: categorical (integer code).
• hot: Home oxygen therapy (HOT) before death. Public file stores de-identified numeric code(s). Type: binary indicator (integer code).
• hot_when: If HOT was provided, time from initiation to death. Public file stores broader grouped timing categories as ordinal code(s). Type: ordinal (integer code).
• pain: Use of pain management. Public file stores de-identified numeric code(s). Type: binary indicator (integer code).
• pain_when: If pain management was provided, time from initiation to death. Public file stores broader grouped timing categories as ordinal code(s). Type: ordinal (integer code).
• palli: Receipt of palliative care. Public file stores de-identified numeric code(s). Type: binary indicator (integer code).
• palli_when: If palliative care was provided, time from initiation to death. Public file stores broader grouped timing categories as ordinal code(s). Type: ordinal (integer code).
• will: Presence of a living will. Public file stores de-identified numeric code(s). Type: binary indicator (integer code).

End-of-life signs (s1–s18)
All end-of-life sign variables are binary indicators and are stored in the public file as de-identified numeric code(s) rather than human-readable labels.

• s1: Lower level of consciousness and uttering incoherent statements.
• s2: Spending extended periods sleeping.
• s3: Experiencing difficulty consuming meals.
• s4: Struggling with drinking, with increased risk of aspiration.
• s5: Displaying agitation and heightened limb movement.
• s6: Decreased verbal communication, with diminished auditory clarity.
• s7: Appearing dazed and devoid of expression.
• s8: Experiencing urinary and fecal incontinence, with inability to reach the bathroom.
• s9: Diminished responsiveness to calls.
• s10: Impaired speech clarity.
• s11: Exhibiting distress due to shallow breathing.
• s12: Difficulty in detecting pulse or experiencing a drop in blood pressure.
• s13: Limbs becoming cold.
• s14: Development of cyanosis on the face or extremities.
• s15: Sensation of warmth and reluctance to wear clothing.
• s16: Experiencing agitation or delirium.
• s17: Slight limb movements despite being bedridden.
• s18: Tremors or shivering of coverings draped over the body.
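
Because the sign variables are stored as undisclosed numeric codes, a reasonable first step for prevalence-style summaries is to count how often each code occurs in each of s1 to s18. The following is a minimal base R sketch on a toy data frame; the codes 1 and 2, the toy values, and the helper name `tabulate_signs` are assumptions made for illustration and are not taken from Data.csv or code.R.

```r
# Toy stand-in for Data.csv; the codes 1/2 are invented for illustration only.
toy <- data.frame(
  s1 = c(1, 2, 2, NA),
  s2 = c(2, 2, 1, 1),
  s3 = c(1, 1, 1, 2)
)

# Hypothetical helper: count each observed code (and any missing values)
# for every column whose name looks like s1, s2, ..., s18.
tabulate_signs <- function(df,
                           sign_cols = grep("^s[0-9]+$", names(df), value = TRUE)) {
  lapply(df[sign_cols], function(x) table(x, useNA = "ifany"))
}

sign_counts <- tabulate_signs(toy)
print(sign_counts$s1)
```

With the real file, the same helper could be applied to the data frame returned by `read.csv("Data.csv", na.strings = "")`; interpreting which code means "sign present" still requires information outside the public documentation.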

Missing data

Missing values are represented as empty cells (no “NA” literal). For timing variables (hot_when, pain_when, palli_when), empty cells occur in two situations: (i) the corresponding care was not provided (not applicable), or (ii) care was provided but the timing is unknown. The two situations can be told apart using the corresponding parent indicator variable (hot, pain, or palli): situation (i) applies when the parent indicates that the care was not provided.
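
The distinction between "not applicable" and "timing unknown" can be sketched in base R as follows. This is an illustration only: the inline CSV text, the helper name `classify_timing`, and the value `provided_code = 1` are assumptions, since the public files do not disclose which code means that care was provided.

```r
# Toy CSV text standing in for Data.csv; all codes are invented for illustration.
csv_text <- "hot,hot_when
1,2
1,
0,
,
"

# Empty cells are read as NA, matching the missing-data convention described above.
dat <- read.csv(text = csv_text, na.strings = "", stringsAsFactors = FALSE)

# Hypothetical helper: label each row of a timing variable using its parent indicator.
# provided_code is an assumed value; the real code key is not public.
classify_timing <- function(parent, when, provided_code) {
  status <- rep("parent indicator missing", length(parent))
  known <- !is.na(parent)
  status[known & parent != provided_code] <- "not applicable"
  status[known & parent == provided_code & is.na(when)] <- "timing unknown"
  status[known & parent == provided_code & !is.na(when)] <- "timing recorded"
  status
}

dat$hot_when_status <- classify_timing(dat$hot, dat$hot_when, provided_code = 1)
print(dat)
```

The same helper could be reused for pain/pain_when and palli/palli_when. Rows where the parent indicator itself is empty are kept as a separate category rather than being assigned to either situation.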

Notes on de-identification and interpretation

The public dataset is designed to preserve analytical utility while reducing re-identification risk. Accordingly, some original response detail (e.g., exact age in years and fine-grained timing categories) has been coarsened in the public release. Numeric encodings are used for selected categorical variables, and the corresponding code keys are intentionally not disclosed in the public files.

Code/software

Software needed to view the data

The dataset is supplied as UTF-8–encoded CSV files that can be opened with any spreadsheet or text editor, or analyzed with the free, open-source R environment.

Analysis software

All statistical analyses reported in the associated article were performed using R version 3.3.3 (R Foundation for Statistical Computing, Vienna, Austria). Two-sided P < 0.05 was considered statistically significant. Only base R functionality (e.g., stats, utils) was used; no proprietary software is required.
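
For readers unfamiliar with the multiplicity control mentioned for code.R, the base R function `p.adjust` covers both Bonferroni and FDR corrections. The sketch below uses invented p-values and assumes that "FDR" refers to the Benjamini-Hochberg procedure (`method = "BH"`); the deposit's own script may differ in detail.

```r
# Invented p-values, e.g. one per end-of-life sign model; not results from the study.
p_raw <- c(0.001, 0.012, 0.030, 0.200)

# Bonferroni: multiply each p-value by the number of tests (capped at 1).
p_bonf <- p.adjust(p_raw, method = "bonferroni")

# Benjamini-Hochberg FDR (assumed here to be the FDR method intended).
p_fdr <- p.adjust(p_raw, method = "BH")

print(data.frame(p_raw, p_bonf, p_fdr))
```

Adjusted values can then be compared against the 0.05 threshold in the same way as unadjusted ones.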

Code availability

This deposit includes code.R, which reproduces the main analyses and tables from the manuscript using Data.csv. Researchers may also re-implement the models described in the Methods in R 3.3.3 or later, using the variable definitions provided in data_dictionary.csv and questionnaire.pdf.

Typical workflow (for reuse)

1. Open Data.csv.
2. Consult data_dictionary.csv and questionnaire.pdf for variable definitions and item mapping.
3. Run code.R to reproduce the main analyses and output tables/figures; alternatively, re-implement the models (e.g., logistic regression via glm(family = binomial)).
4. Export tables/figures as needed.
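
The re-implementation route in the workflow above might look like the following base R sketch. It runs on simulated data so that it is self-contained; the age labels, the 0/1 coding of `will` and `s1`, and the choice of predictors are assumptions for illustration and do not reproduce the models in code.R or the article.

```r
set.seed(1)
n <- 200

# Simulated stand-in for Data.csv; labels and codes are hypothetical.
toy <- data.frame(
  age  = factor(sample(c("70-79", "80-89", "90+"), n, replace = TRUE)),
  will = sample(c(0, 1), n, replace = TRUE)
)
# Simulate one binary sign whose odds depend on the living-will indicator.
toy$s1 <- rbinom(n, size = 1, prob = plogis(-0.5 + 0.8 * toy$will))

# Logistic regression of the sign on age group and living will.
fit <- glm(s1 ~ age + will, data = toy, family = binomial)

# Odds ratios with Wald 95% confidence intervals.
or_table <- exp(cbind(OR = coef(fit), confint.default(fit)))
print(round(or_table, 2))

# Export the table, as in the final workflow step.
out_file <- file.path(tempdir(), "or_table.csv")
write.csv(or_table, out_file)
```

With the real data, the outcome and predictors would need to be recoded according to data_dictionary.csv and questionnaire.pdf before fitting, and the output path would be replaced with a permanent location.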

Human subjects data

This dataset contains de-identified human subjects data on 440 decedent cases reported by physicians in a questionnaire survey. We obtained explicit consent from participants (via the Japan Society for Dying with Dignity, JSDD) to deposit the de-identified dataset in a public repository upon publication. All direct identifiers (e.g., names, addresses, contact details, medical record numbers) were removed, and free-text fields were deleted.