Inner speech in motor cortex and implications for speech neuroprostheses
Data files
Jul 30, 2025 version, 10.80 GB total:
chittychittybangbang.zip (1.11 GB)
conjunctiveCounting.zip (322.39 MB)
interleavedVerbalBehaviors.zip (777.42 MB)
isolatedVerbalBehaviors.zip (3.53 GB)
README.md (7.04 KB)
sentenceData.zip (3.93 GB)
seqRecallSpeech.zip (79.44 MB)
seqRecallUninstructed3ElementArrow.zip (364.98 MB)
seqRecallUninstructed3ElementLines.zip (371.09 MB)
seqRecallUninstructedSingleArrow.zip (38.77 MB)
seqRecallVerbalMemory.zip (175.49 MB)
seqRecallVisualMemory.zip (105.18 MB)
Abstract
Speech brain-computer interfaces (BCIs) show promise in restoring communication to people with paralysis, but have also prompted discussions regarding their potential to decode private inner speech. Separately, inner speech may be a way to bypass the current approach of requiring speech BCI users to physically attempt speech, which is fatiguing and can slow communication. Using multi-unit recordings from four participants, we found that inner speech is robustly represented in motor cortex, and that imagined sentences can be decoded in real-time. The representation of inner speech was highly correlated with attempted speech, though we also identified a neural “motor-intent” dimension that differentiates the two. We investigated the possibility of decoding private inner speech and found that some aspects of free-form inner speech could be decoded during sequence recall and counting tasks. Finally, we demonstrate high-fidelity strategies that prevent speech BCIs from unintentionally decoding private inner speech.
Dataset DOI: 10.5061/dryad.gf1vhhn1j
Description of the data and file structure
Brain-computer interfaces (BCIs) can restore communication to people who have lost the ability to move or speak. In this study, we demonstrated an intracortical BCI that decodes inner speech from neural activity in the motor cortex and translates it to text in real-time, using a recurrent neural network decoding approach. We also demonstrated instances in which inner speech can be decoded during cognitive tasks that naturally elicit its use. Further, we demonstrated robust strategies to ensure that only inner speech intended to be said aloud is decoded.
This dataset contains all of the neural activity recorded during these experiments, spanning sentence-production tasks as well as instructed delay tasks designed to investigate the neural representation of inner speech in the motor cortex and its relationship to that of attempted speech.
Files and variables
Each folder contains its own README file.
File: chittychittybangbang.zip
Description: contains data from the instructed delay inner speech production task in which the participant imagined speaking sentences with or without a keyword preceding the utterance in order to demonstrate an online "keyword" system to lock and unlock decoding of inner speech with a BCI. This includes both training data and real-time BCI evaluation data. Data depicted in Figure 7.
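As a rough illustration of the gating concept only (not the study's implementation; the keyword string and the one-transcript-per-utterance decoder interface below are assumptions), a keyword check on decoded utterances might look like this minimal Python sketch:

```python
# Hypothetical sketch of keyword gating; the decoder interface and the
# keyword below are assumptions, not the study's actual code.

def keyword_gate(transcripts, keyword="chitty chitty bang bang"):
    """Yield only transcripts that begin with the unlock keyword."""
    for text in transcripts:
        if text.lower().startswith(keyword):
            # Keyword detected: release the rest of the utterance.
            yield text[len(keyword):].strip()
        # Utterances without the keyword are treated as private inner
        # speech and are never displayed.

# Only the second utterance is released as output.
for out in keyword_gate(["what time is it",
                         "chitty chitty bang bang what time is it"]):
    print(out)  # -> "what time is it"
```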
File: conjunctiveCounting.zip
Description: contains data from the conjunctive counting task performed by two participants in the study. On each trial, the participant saw a grid containing two types of shapes in two different colors. Participants were cued with a particular shape-color combination at the top of the screen and instructed to silently count the instances of that combination in the grid. There were between 10 and 20 instances, depending on the trial. When done counting, the participant pushed a button and attempted to vocalize the number they counted, then received feedback on whether they were correct. This data is depicted in Figure 5.
File: interleavedVerbalBehaviors.zip
Description: contains data from blocks in which the participant performed the same cue set of seven words plus a "do nothing" cue as in the isolated verbal behaviors task; however, within each block, attempted, imagined, and listening cue types were intermixed.
These data can be used to assess neural tuning relationships between attempted, imagined, and perceived speech behaviors. These were instructed delay experiments with text-based cues (or audio cues for listening). This data is depicted in Figure 6.
File: seqRecallUninstructedSingleArrow.zip
Description: contains data from an instructed delay task designed NOT to elicit inner speech during arrow-cued single joystick movements. It serves as an arrow-cue and motor control for the seqRecallUninstructed3ElementArrow task. (Figure 4AB, Figure S3B)
File: isolatedVerbalBehaviors.zip
Description: contains data from isolated blocks in which the participant performed seven verbal behaviors (attempted vocalized speech, attempted mimed speech, imagined motoric speech, imagined auditory speech, imagined listening, passive listening, and silent reading). These data can be used to assess neural tuning to attempted, imagined, and perceived speech behaviors in different subregions of the motor cortex. These were instructed delay experiments with text-based cues (or audio cues for passive listening). This data is depicted in Figures 1 and 2.
File: seqRecallSpeech.zip
Description: contains data from an instructed delay task in which participants attempted to speak words associated with the directions of arrows, to compare attempted speech behavior to the seqRecallVerbalMemory task (Figures S2, S3EF).
File: seqRecallUninstructed3ElementArrow.zip
Description: contains data from an instructed delay task designed to elicit inner speech for the cognitive task of sequence recall without explicit instruction to engage inner speech. (Figure 4AB, Figure S3AB)
File: seqRecallVisualMemory.zip
Description: contains data from a task that probes the representation of instructed inner speech for sequence recall. When compared to the seqRecallVerbalMemory task, it probes the volitional nature of inner speech for cognitive tasks. (Figure 4CD, Figure S3CD)
File: seqRecallUninstructed3ElementLines.zip
Description: contains data from an instructed delay task designed NOT to elicit inner speech during sequence recall when no explicit mental strategy instruction was given. It serves as a motor sequencing control for the seqRecallUninstructed3ElementArrow task. (Figure 4AB, Figure S3AB)
File: sentenceData.zip
Description: contains data from instructed delay sentence-production experiments. On each trial, the participant first saw a red square with a sentence above it (see the "sentenceText" variable). When the square turned green, the participant either attempted to speak the sentence normally, attempted to speak it without vocalization, or imagined speaking it with their preferred inner speech strategy. When finished, the participant pushed a button (T12), indicated to the researcher to push the button by looking at them (T16), or used an eye tracker to push a button on the screen (T15). Data were collected in a series of 'blocks' (20-50 sentences per block), between which the participant rested; neural data between blocks was not recorded. During some of the blocks, the sentences were decoded in real-time and the decoder output was displayed on the screen. Some sets of sentences were drawn from a small vocabulary of 50 words, and some from an open vocabulary of approximately 130,000 words. Files are named to indicate participant session, behavior performed, vocabulary size, and whether the block was open-loop or closed-loop (closed-loop meaning sentences were decoded in real-time). (Figures 3, 5, and 7)
File: seqRecallVerbalMemory.zip
Description: contains data from a task that probes the representation of instructed inner speech for sequence recall. When compared to the seqRecallVisualMemory task, it probes the volitional nature of inner speech for cognitive tasks. When compared to seqRecallSpeech, it probes the relationship between inner speech for sequence recall and attempted speaking. (Figure 4CD, Figure S3C-F)
Code/software
The data consist of .mat files intended to be loaded with MATLAB or Python (scipy.io.loadmat). See the individual readme.txt files within each folder for a detailed description of the data contents and format.
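As a minimal sketch of loading a file in Python (the file path below is hypothetical; consult each folder's readme.txt for the actual file names and variables):

```python
import scipy.io

# Load one data file. The path is illustrative only; see each folder's
# readme.txt for the actual file names and variable descriptions.
data = scipy.io.loadmat("sentenceData/exampleBlock.mat")

# Print each stored variable, skipping MATLAB's "__header__"-style keys.
# In the sentence-production files, for example, "sentenceText" holds
# the cued sentence for each trial.
for name, value in data.items():
    if not name.startswith("__"):
        print(name, type(value), getattr(value, "shape", None))
```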
Human subjects data
Data are anonymized and referenced by subject code alone (participants T12, T15, T16, and T17). Informed consent for data sharing was obtained.