Data from: Encoding of speech modes and loudness in ventral precentral gyrus
Data files
Feb 23, 2026 version (2.47 GB total)
- README.md (3.40 KB)
- t15_sentence-loudness_eval.zip (50.82 MB)
- t15_word-loudness.zip (935.96 MB)
- t16_word-loudness.zip (1.48 GB)
Abstract
The ability to vary the mode and loudness of speech is an important part of the expressive range of human vocal communication. However, the encoding of these behaviors in the ventral precentral gyrus (vPCG) has not been studied at the resolution of neuronal firing rates. We investigated this in two participants who had intracortical microelectrode arrays implanted in their vPCG as part of a speech neuroprosthesis clinical trial. Neuronal firing rates modulated strongly in vPCG as a function of attempted mimed, whispered, normal, or loud speech. At the neural ensemble level, mode/loudness and phonemic content were encoded in distinct neural subspaces. Attempted mode/loudness could be decoded from vPCG with 94% and 89% accuracy for the two participants, and corresponding neural preparatory activity at 640 ms and 270 ms before speech onset enabled 80% decoding accuracy, respectively. We then developed a closed-loop loudness decoder that achieved 94% online accuracy in modulating a brain-to-text speech neuroprosthesis output based on attempted loudness. These findings demonstrate the feasibility of decoding mode and loudness from vPCG, paving the way for speech neuroprostheses capable of synthesizing more expressive speech.
Overview
This repository contains the data necessary to reproduce the results of the manuscript "Encoding of speech modes and loudness in ventral precentral gyrus".
The code is written in Python and is hosted on GitHub.
The data can be downloaded from this Dryad repository. Please download the data and place it in the data directory of the GitHub code, as detailed in the README. All included data have been anonymized and do not include any identifiable information.
Neural data
This dataset includes neural data from two participants, 'T15' and 'T16'. Each participant has four 64-electrode Utah arrays (256 electrodes total) implanted in their left precentral gyrus. Threshold crossings (at a −4.5 × RMS threshold) and spike band power were extracted as neural features for each electrode. These features were then binned every 10 ms, normalized, and smoothed.
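The binning and smoothing steps above can be sketched as follows. This is an illustrative implementation, not the manuscript's code: the kernel shape (`gaussian_filter1d`) and its width (`sigma_bins`) are assumptions, since the README specifies only the 10 ms bin size.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d


def bin_and_smooth(spike_times_s, duration_s, bin_ms=10, sigma_bins=3):
    """Bin per-electrode threshold-crossing times into 10 ms counts,
    then smooth each electrode's count series over time.

    spike_times_s: list (one entry per electrode) of 1-D arrays of
    crossing times in seconds. sigma_bins is an illustrative kernel
    width, not a value taken from the manuscript.
    Returns (counts, smoothed), each shaped (electrodes, time bins).
    """
    n_bins = int(np.ceil(duration_s * 1000 / bin_ms))
    edges = np.arange(n_bins + 1) * bin_ms / 1000.0
    counts = np.stack(
        [np.histogram(t, bins=edges)[0] for t in spike_times_s]
    ).astype(float)
    smoothed = gaussian_filter1d(counts, sigma=sigma_bins, axis=1)
    return counts, smoothed
```

For a 1-second trial across 256 electrodes this yields a (256, 100) feature matrix; normalization (e.g. z-scoring per electrode) would follow the same axis convention.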
Files
t15_word-loudness.zip: Participant T15's data from the word-loudness task used for offline analyses.
t16_word-loudness.zip: Participant T16's data from the word-loudness task used for offline analyses.
t15_sentence-loudness_eval.zip: Participant T15's data from the sentence-loudness task during the evaluation blocks of closed-loop loudness decoding (Figure 4).
Each task was performed during a session on a scheduled day. Within each session, participants completed multiple blocks during which they performed the task. These blocks of data are made available as .mat files.
Each word-loudness .mat file contains the following variables per trial:
- cue: Cue presented to the participant, e.g. "LOUD: have". The participant attempts to speak the word at the given loudness level.
- threshcross: Binned and smoothed threshold crossings.
- spikepow: Binned and smoothed spike band power.
- raw_threshcross: Binned threshold crossings (unsmoothed).
- delay_duration_ms: Duration of the delay period (ms).
- speech_onsets and speech_offsets: Speech onset and offset times at 30 kHz.
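Loading a trial and splitting the cue into its loudness label and word might look like the sketch below. The internal .mat layout is an assumption (variable names come from this README; shapes are illustrative), so the example round-trips a tiny synthetic file rather than a real one.

```python
import numpy as np
from scipy.io import savemat, loadmat

# Hypothetical trial file mimicking the documented variable names.
# The (time bins, electrodes) shape for threshcross is an assumption.
savemat("demo_trial.mat", {
    "cue": "LOUD: have",                 # loudness label + cued word
    "threshcross": np.zeros((50, 256)),  # binned, smoothed crossings
    "delay_duration_ms": 1000,
})

trial = loadmat("demo_trial.mat")

# loadmat returns char data as a string array and scalars as 2-D arrays.
cue = str(trial["cue"][0])
loudness, word = (s.strip() for s in cue.split(":", 1))
delay_ms = int(trial["delay_duration_ms"].item())
```

The same pattern (index `[0]` for strings, `.item()` for scalars) applies to any of the per-trial variables listed above.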
Each sentence-loudness .mat file contains the following variables per trial:
- cue: Cue presented to the participant, e.g. "i CERTAINLY HOPE so". The participant attempts to speak the sentence with loudness modulation.
- threshcross: Binned and smoothed threshold crossings.
- spikepow: Binned and smoothed spike band power.
- logits: Phoneme logit predictions from a brain-to-text RNN decoder.
- decoder_partial_output and decoder_final_output: Closed-loop predictions from the brain-to-text BCI.
- predicted_amplitude: Closed-loop loudness level predictions from a loudness decoder.
- updated_decoder_partial_output and updated_decoder_final_output: Closed-loop brain-to-text predictions formatted according to decoded loudness.
- neural_trackingIDs: Metadata used to track the neural data and their corresponding decoded outputs.
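One way to "format predictions according to decoded loudness" is sketched below, mirroring the capitalization convention visible in the sentence cues (uppercase for loud words). Both the per-word pairing of amplitudes and the threshold value are illustrative assumptions, not the manuscript's method.

```python
def format_by_loudness(words, amplitudes, loud_threshold=0.5):
    """Hypothetical sketch: render each decoded word in uppercase when
    its decoded loudness amplitude exceeds a threshold, matching the
    cue convention (e.g. "i CERTAINLY HOPE so"). The one-amplitude-per-
    word pairing and the 0.5 threshold are illustrative assumptions.
    """
    return " ".join(
        w.upper() if a > loud_threshold else w.lower()
        for w, a in zip(words, amplitudes)
    )
```

For example, `format_by_loudness(["i", "certainly", "hope", "so"], [0.2, 0.9, 0.8, 0.1])` reproduces the cue-style string "i CERTAINLY HOPE so".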
Human subjects data
These data recorded from human participants have been anonymized and de-identified, and do not contain any personally identifiable information. The participants are referred to by their coded clinical trial identifiers. Raw neural data are not shared; only de-identified, processed neural features are included. Behavioral data do not contain identifiable information. The participants have consented to the publication of these de-identified data in the public domain.
