Data from: EEG sonification improves sleep staging performance in novice stagers
Data files
Aug 13, 2025 version (746.71 KB total)
- data_cleaning.R (1.25 KB)
- packages.R (348 B)
- qdf.csv (11.09 KB)
- rdf.csv (713.20 KB)
- README.md (4.40 KB)
- soundspindle.R (16.21 KB)
- SoundspindleExperiment.Rproj (205 B)
Abstract
Sleep staging is a critical tool used in research and clinical settings to evaluate and diagnose sleep conditions; however, sleep staging is labor intensive and may be challenging for inexperienced practitioners. We explored whether adding an auditory representation (sonification) of the EEG to a standard visual representation could improve sleep staging performance or reduce workload. This is the first study to investigate the effects of sonification on sleep staging performance. We performed a within-subjects study in which 40 participants completed an online sleep staging task with and without sonified EEG. EEG was sonified by minimal transformation in which the raw EEG signal was played as an audio signal. Contrary to our hypothesis, we found that adding sonification did not result in improvements in accuracy, speed, or workload for the entire subject group. However, when we stratified participants by sleep staging experience, we found sonification improved accuracy for the least experienced participants. These findings suggest EEG sonification may be useful as a tool to enable novice sleep stagers to reach acceptable performance levels faster.
This repository contains the responses given by participants and their accuracy, along with demographic and survey information and the R files needed to replicate our analysis.
Dataset DOI: 10.5061/dryad.3bk3j9kz0
Description of the data and file structure
Two files are used for analysis. The qdf.csv file contains data collected at the end of each experimental block, such as the participant's accuracy and their TLX scores for that block. The rdf.csv file contains the same block-level data and additionally includes a row for each individual response the participant gave. Missing values are represented as empty fields. In qdf.csv, each participant has three rows, which contain their information for the first (training), second, and third blocks of the experiment.
Some information is only present for specific blocks. Block 0 is a training block, and participants did not complete the NASA Task Load Index (TLX) after this block; we also did not compute kappa for this block because participants were presented with each question until they answered it correctly. Some questions regarding overall experience were asked only after block 2.
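As a quick sanity check, the two files can be loaded with base R. This snippet is a minimal sketch, not part of the analysis scripts, and assumes the CSV files are in the current working directory:

```r
# Minimal sketch: load the two data files from the working directory.
# Passing na.strings makes empty fields read in as NA.
qdf <- read.csv("qdf.csv", na.strings = c("NA", ""))
rdf <- read.csv("rdf.csv", na.strings = c("NA", ""))

str(qdf)               # one row per participant per block
table(table(qdf$pid))  # each participant ID should appear three times
```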
Columns in the qdf.csv file
- pid--unique ID identifying the participant across blocks
- TLX* columns--represent scores on each axis of the NASA Task Load Index
- heard_sounds--whether the participant reported hearing sonification sounds after the conclusion of the block (asked only after block 2)
- sounds_useful--participant rating of whether the sounds were useful for sleep staging. Options include "Yes, a lot", "Yes, a little", "Unsure", and "No" (asked only after block 2)
- Future_interest--whether the participant would be interested in trying software that sonifies sleep for staging in the future (asked only after block 2)
- soundOrder--the block in which the participant received sonification sounds
- block--the block this row corresponds to
- rts--median time (seconds) taken to score each epoch (not logged for some of the earliest responses)
- age, gender--age in years; gender could be male, female, or non-binary
- Experience--self-described sleep staging experience. Options are beginner, intermediate, or experienced
- num_staged--number of polysomnograms staged in the past year. Options are none, 1-10, 11-20, 21-40, 41-80, or more than 80
- condition--the experimental condition this row corresponds to. "Sound training" is the initial practice block; the other two values are the assessment blocks in the "sound" or "visual only" condition
- kappa--Cohen's kappa for sleep staging performance in this block
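For example, block-level performance can be summarised directly from these columns. This is a minimal illustrative sketch, not code from the analysis scripts:

```r
# Illustrative sketch using the block-level file; column names follow the
# description above.
qdf <- read.csv("qdf.csv", na.strings = c("NA", ""))

# Drop the training block (block 0), for which kappa was not computed
assess <- subset(qdf, block != 0)

# Mean Cohen's kappa by self-reported experience and condition
aggregate(kappa ~ Experience + condition, data = assess, FUN = mean)
```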
The rdf.csv file contains these columns, but since each row represents one sleep stage given by the user, it also contains additional columns describing each staging response:
- user_stage--the sleep stage selected by the user: 0 = wake, 1-3 = NREM stages 1-3, 4 = REM
- correctStage--the true sleep stage
- timeSpent--seconds taken from when the stage was shown to when the user gave a stage
- totalClicks--number of times the user clicked back or forward in time when staging
- question--ID number for this specific item within the block
- isCorrect--whether the user's response was correct (1=yes, 0=no)
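These per-response rows allow error patterns to be tabulated by stage. Again, this is an illustrative sketch rather than code from the analysis scripts:

```r
# Illustrative sketch using the per-response file.
rdf <- read.csv("rdf.csv", na.strings = c("NA", ""))

# Confusion table of selected vs. true stage
# (0 = wake, 1-3 = NREM stages 1-3, 4 = REM)
table(selected = rdf$user_stage, true = rdf$correctStage)

# Proportion of correct responses per participant and block
aggregate(isCorrect ~ pid + block, data = rdf, FUN = mean)
```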
Code/software
No special software is needed to view the files, but the included R scripts can be used to recreate the analyses in the paper (run with R 4.4.1).
The R scripts are:
- data_cleaning.R--reads the qdf.csv and rdf.csv files located in the same directory as the script and prepares data frames for analysis
- soundspindle.R--generates the graphs and analyses shown in the paper using the data frames created by data_cleaning.R
- packages.R--required dependencies to run the analysis and data cleaning
- SoundspindleExperiment.Rproj--RStudio project file
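A typical way to reproduce the analysis, assuming the working directory is the folder containing the scripts and the CSV files (opening SoundspindleExperiment.Rproj in RStudio sets this automatically), is to source the scripts in order:

```r
# Sketch of the intended workflow; run from the directory containing the
# scripts and the two CSV files.
source("packages.R")       # loads the required package dependencies
source("data_cleaning.R")  # reads qdf.csv and rdf.csv, builds analysis data frames
source("soundspindle.R")   # reproduces the graphs and analyses from the paper
```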
Human subjects data
We have inspected the data to ensure that no information such as names, emails, IP addresses, or other identifiers is present.
All participants consented to the following provision in the consent form:
Your non-identifiable data such as responses to questions collected as part of the research will be stored, used for future research studies, and may be shared with other researchers for future research studies without additional informed consent from you or your legally authorized representative. Your data might be shared with academic research institutions, non-profit entities, and/or for-profit entities. Your data may also be made publicly available in research data repositories such as the Open Science Framework.
