Complementary cortical and thalamic contributions to cell-type-specific striatal activity dynamics during movement

Gjoni, Enida 1; Sristi, Ram Dyuthi 1; Liu, Haixin 1; Dror, Shahar 1; Lin, Xinlei 1; O'Neil, Keelin 1; Arroyo, Oscar M. 1; Hong, Sun Woo 1; Kim, Hannah 1; Liu, Jeffrey 1; Blumenstock, Sonja 1; Lim, Byungkook 1; Mishne, Gal 1; Komiyama, Takaki 1

Published Dec 22, 2025 on Dryad. https://doi.org/10.5061/dryad.np5hqc07j

Data files

Dec 22, 2025 version files 5.19 GB

Abstract

Coordinated motor behavior emerges from information flow across brain regions. How long-range inputs influence cell-type-specific activity within motor circuits remains unclear. The dorsolateral striatum (DLS) contains direct- and indirect-pathway medium spiny neurons (dMSNs and iMSNs) that exhibit distinct roles in movement control, and receives converging cortical and thalamic inputs. We performed 2-photon imaging from dMSNs, iMSNs, and their cortical and thalamic inputs identified by monosynaptic rabies tracing, as mice executed a skilled locomotion task. We used recurrent neural network (RNN) classifiers and hierarchical clustering analyses to reveal functionally heterogeneous subpopulations in each population. We found that dMSNs were preferentially active at movement onset and offset, and iMSNs during execution. Cortical and thalamic inputs were preferentially active during onset/offset and execution, respectively. dMSN- and iMSN-projecting neurons in each region showed similar trial-averaged activity patterns, although single-trial features might contribute to cell-type-specific differences. Furthermore, a subset of thalamic neurons projecting to dMSNs encoded rhythmic limb movements in a locomotion phase-specific manner, a pattern also found in a small subset of dMSNs. Inactivation of either cortex or thalamus substantially reduced MSN activity. These results suggest that corticostriatal and thalamostriatal inputs contribute complementary motor-related information via shared and cell-type-specific pathways.

Dryad DOI: https://doi.org/10.5061/dryad.np5hqc07j

This repository contains the full analysis pipeline for the paper:

Gjoni E., Sristi R. D., Liu H., Dror S., Lin X., O’Neil K., Arroyo O. M.,
Hong S. W., Kim H., Liu J., Blumenstock S., Lim B., Mishne G., & Komiyama T. (2025).
Complementary cortical and thalamic contributions to cell-type-specific striatal activity dynamics during movement.

The code reproduces Figures 1–4 from the manuscript.

Data Requirements

Before running any notebooks, download and unzip the following datasets:

Neural activity data
Unzip neural_activity_data.zip into data/
Paw-position (DeepLabCut) data
Unzip paw_position_data.zip into data/
Trained TEA-Net models
Unzip TrainedModels.zip into data/
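
The unzip step can also be scripted. The sketch below uses Python's standard `zipfile` module and assumes the three archives sit in the current working directory; the helper name `unzip_into` is illustrative and not part of this repository.

```python
import zipfile
from pathlib import Path

# Archive names as listed above; their location on disk is an assumption.
ARCHIVES = ["neural_activity_data.zip", "paw_position_data.zip", "TrainedModels.zip"]


def unzip_into(archive, dest="data"):
    """Extract one zip archive into `dest`, creating the folder if needed."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest_path)
    return dest_path


for name in ARCHIVES:
    if Path(name).exists():
        unzip_into(name)
    else:
        print(f"{name} not found; download it from Dryad first")
```

After extraction, check that the folders under data/ match the layout shown in Repository Structure, since the internal layout of each archive is not described here.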

Detailed descriptions of each dataset are provided below.

Repository Structure

data/
- neural_activity_data/
- paw_position_data/
- TrainedModels/
figure1_raw_data.ipynb
figure2_classifier_TEAnet.ipynb
figure3_clustering.ipynb
figure4_rhythmicity.ipynb
data_loader.py
utils/
models/

Neural Activity Data Format

Region and Cell-Type Naming Conventions

This repository uses two naming conventions.

Brain regions:

Paper terminology: DLS, M1, M2, PF
Code / filenames: Str, m1, Ctx, Tha

Mapping:

DLS → Str
M1 → m1
M2 → Ctx
PF → Tha

Cell types:

dMSNs (direct pathway neurons) → D1R
iMSNs (indirect pathway neurons) → A2A

All data files use the code naming convention.
The manuscript uses the paper naming convention.
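
When moving between the manuscript and the data files, the mapping above can be kept in a small lookup table. The following is a minimal sketch; the dictionary names and the `data_filename` helper are hypothetical, and the `{prefix}_{nType}_{region}.npy` pattern is taken from the file descriptions in this README.

```python
# Paper-to-code name mapping, copied from the lists above.
REGION_PAPER_TO_CODE = {"DLS": "Str", "M1": "m1", "M2": "Ctx", "PF": "Tha"}
CELLTYPE_PAPER_TO_CODE = {"dMSN": "D1R", "iMSN": "A2A"}


def data_filename(prefix, cell_type, region):
    """Build a file name such as 'avgFreq_D1R_Str.npy' from paper terminology."""
    n_type = CELLTYPE_PAPER_TO_CODE[cell_type]
    code_region = REGION_PAPER_TO_CODE[region]
    return f"{prefix}_{n_type}_{code_region}.npy"


print(data_filename("avgFreq", "dMSN", "DLS"))
```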

Reproducing Figures

Figure 1: figure1_raw_data.ipynb
Figure 2: figure2_classifier_TEAnet.ipynb
Figure 3: figure3_clustering.ipynb
Figure 4: figure4_rhythmicity.ipynb

Detailed Data Description

Contents of `neural_activity_data/`

For each region (Str, m1, Ctx, Tha) and each cell type (D1R, A2A), the folder contains four .npy files describing sampling frequency, behavioral alignment, neuron metadata, and neural activity matrices.

1. `avgFreq_{nType}_{region}.npy`

Example: avgFreq_D1R_Str.npy

Contains a single scalar value.
Represents the average sampling frequency (in Hz) at which neural activity was collected for that region and cell type.
Used to convert frame indices to time in seconds.

2. `GoCueOffset_{nType}_{region}.npy`

Example: GoCueOffset_A2A_m1.npy

Contains an array of integers.
Each value indicates the frame index of the Go Cue for that neuron within the extracted time window.
Provides a consistent behavioral alignment reference across neurons and trials.
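
As an illustration of how the sampling frequency and the Go Cue frame index might be combined, the sketch below builds a time axis in seconds with the Go Cue at 0. The numeric values are made up for the example (a 30 Hz rate and a 570-frame window are assumptions, not values read from the dataset), and the function name is hypothetical.

```python
import numpy as np


def time_axis_from_go_cue(n_frames, go_cue_frame, avg_freq_hz):
    """Return the time of each frame in seconds, with the Go Cue at t = 0."""
    return (np.arange(n_frames) - go_cue_frame) / float(avg_freq_hz)


# Stand-ins for the real files, which would be read with np.load, e.g.
# np.load("data/neural_activity_data/avgFreq_A2A_m1.npy") and
# np.load("data/neural_activity_data/GoCueOffset_A2A_m1.npy").
avg_freq = float(np.squeeze(np.array(30.0)))  # hypothetical scalar
go_cue_offsets = np.array([150, 150, 151])    # hypothetical, one per neuron

t = time_axis_from_go_cue(570, go_cue_offsets[0], avg_freq)
print(t[0], t[go_cue_offsets[0]])
```

Because the Go Cue frame is stored per neuron, the time axis is best computed per neuron rather than once for a whole region.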

3. `Stacked/allTrials_index_{nType}_{region}.npy`

Example: Stacked/allTrials_index_D1R_Ctx.npy

Contains a matrix where each row describes one neuron.
Columns correspond to:
- Animal_ID
- Session_ID
- Neuron_ID
The nth row corresponds exactly to the nth row in allTrials_stack for the same region and cell type.

4. `Stacked/allTrials_stack_{nType}_{region}.npy`

Example: Stacked/allTrials_stack_A2A_Tha.npy

Contains the neural activity matrix for all neurons across all trials for a given region and cell type.
Shape:
- (number_of_neurons × number_of_trials, number_of_time_frames)
Each row corresponds to the neuron described in the same row of allTrials_index.
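
Because the index and stack files share row order, a boolean mask built from the index can be applied directly to the activity matrix. The sketch below shows this on synthetic arrays; it assumes the IDs are stored as numbers and that column 0 is Animal_ID as listed above. The helper name is hypothetical.

```python
import numpy as np


def rows_for_animal(index, stack, animal_id):
    """Select the index rows and matching activity rows for one animal."""
    if index.shape[0] != stack.shape[0]:
        raise ValueError("index and stack must have the same number of rows")
    mask = index[:, 0] == animal_id  # column 0 = Animal_ID
    return index[mask], stack[mask]


# Synthetic stand-ins for allTrials_index_* and allTrials_stack_*.
index = np.array([[1, 1, 0], [1, 1, 1], [2, 1, 0]])  # Animal, Session, Neuron
stack = np.arange(12, dtype=float).reshape(3, 4)      # 3 rows x 4 time frames

sub_index, sub_stack = rows_for_animal(index, stack, animal_id=1)
print(sub_index.shape, sub_stack.shape)
```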

The neural activity window spans approximately 19 seconds, consisting of:

- 5 seconds pre-ITI
- 1 second between Go Cue and ladder movement onset
- 8 seconds of ladder traversal
- 5 seconds post-ITI
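
The epoch durations can be turned into nominal frame ranges once the sampling frequency is known. The sketch below is only an approximation: the window is described as roughly 19 seconds, so exact boundaries should be derived from the Go Cue frame index of each neuron. The epoch labels and the 30 Hz rate are assumptions made for the example.

```python
# Epoch durations in seconds, in the order listed above (labels are illustrative).
EPOCHS = [
    ("pre_ITI", 5.0),
    ("go_cue_to_ladder_onset", 1.0),
    ("ladder_traversal", 8.0),
    ("post_ITI", 5.0),
]


def epoch_frame_ranges(avg_freq_hz):
    """Map each epoch to a nominal (start_frame, stop_frame) pair."""
    ranges = {}
    start = 0.0
    for name, duration in EPOCHS:
        stop = start + duration
        ranges[name] = (round(start * avg_freq_hz), round(stop * avg_freq_hz))
        start = stop
    return ranges


ranges = epoch_frame_ranges(30.0)  # 30 Hz is a hypothetical sampling rate
for name, (lo, hi) in ranges.items():
    print(name, lo, hi)
```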

Summary

Together, these four files provide a complete description of:

Sampling frequency
Behavioral alignment
Neuron metadata
Neural activity time series

Paw Position and Neural Activity Aligned Data for Figure 4 Rhythmicity Analysis

This repository includes paw position–aligned neural activity data used for rhythmicity and behavior–neural coupling analyses.

File Overview

df_neural_paw_{region}pca_direction_aligned_centroid{forepaw}.pickle
Contains paw position and corresponding neural activity data for valid trials.
df_movement_start_end_{region}.pickle
Contains trial-wise movement onset and offset information.

Movement File (df_movement_start_end_*)

Each row corresponds to a single trial.
Columns include:
- 0: Animal ID
- 1: Session ID
- 2: Recording date (DDMMYY format)
- 3: nType
- 4: Trial ID
- 5: Animal movement start time (relative to GoCue, where GoCue = 0)
- 6: Animal movement end time (relative to GoCue, where GoCue = 0)
This file specifies the start and end of animal movement for each trial.
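
For example, the movement duration of each trial is the difference between columns 6 and 5. The sketch below assumes the file loads as a pandas DataFrame (for instance with `pd.read_pickle`) and uses positional indexing so it does not depend on column labels. The rows are synthetic, and the time unit is whatever the file uses, which is not stated here.

```python
import pandas as pd


def add_movement_duration(df_movement):
    """Return a copy with a 'duration' column (movement end minus movement start)."""
    out = df_movement.copy()
    out["duration"] = out.iloc[:, 6] - out.iloc[:, 5]
    return out


# Synthetic rows: Animal, Session, Date (DDMMYY), nType, Trial, start, end.
df_movement = pd.DataFrame(
    [
        [1, 1, "010125", "D1R", 0, 0.8, 7.5],
        [1, 1, "010125", "D1R", 1, 1.1, 6.0],
    ]
)

with_duration = add_movement_duration(df_movement)
print(with_duration["duration"].tolist())
```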

Neural–Paw Aligned Data (df_neural_paw_*)

The neural–paw file is a dictionary with the following entries.

Neural Activity Data (df_neural)

Stored as a pandas DataFrame.
Columns represent:
- 0: Animal
- 1: SessionId
- 2: TrialId
- 3: NeuronId
- 4: Date and session number (DDMMYY_F{sess_num})
- 5: Neuron count in session
- 6: nType (0 = iMSN, 1 = dMSN)
- 7 onward: Neural activity time series

Paw Position Data (df_paw_left and df_paw_right)

Stored as pandas DataFrames.
df_paw_left and df_paw_right are identical in structure; only the paw-position values differ (left vs right forepaw).
Columns represent:
- 0: Animal
- 1: SessionId
- 2: Date (DDMMYY)
- 3: D1R/A2A (string)
- 4: TrialID
- 5: boolean cell type (0 = iMSN, 1 = dMSN)
- 6 onward: Paw position time series
- session: Date and session number (DDMMYY_F{sess_num})

PCA Alignment

Paw position data consists of 2D trajectories (x and y coordinates) for each forepaw.
Principal Component Analysis (PCA) is applied to the paw trajectory to extract the dominant direction of movement.
The first principal component defines the movement direction used for alignment.
If forepaw = left, the PCA direction is computed from the left forepaw position.
If forepaw = right, the PCA direction is computed from the right forepaw position.
The paw position is then projected onto this PCA-derived movement axis, so that all trials are aligned to the dominant direction of limb movement.
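
The projection step can be sketched as follows. This is not the code used to produce the files; it is a minimal NumPy illustration that assumes the trajectory is mean-centered before PCA and treated as a single (n_samples, 2) array. Whether the authors fit PCA per trial, per session, or per animal is not stated here. Note that the sign of a principal component is arbitrary.

```python
import numpy as np


def project_on_first_pc(xy):
    """Project a 2D trajectory onto its first principal component.

    Returns the 1D projected position and the unit direction vector.
    """
    xy = np.asarray(xy, dtype=float)
    centered = xy - xy.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]
    return centered @ direction, direction


# Synthetic trajectory lying on the line y = 2x.
steps = np.linspace(0.0, 1.0, 50)
xy = np.column_stack([steps, 2.0 * steps])
proj, direction = project_on_first_pc(xy)
print(proj.shape, direction)
```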

Additional Metadata

start_data_idx_paw
Column index from which paw position data begins in the paw DataFrames.
start_data_idx
Column index from which neural activity data begins in the neural DataFrame.
time
Time vector relative to GoCue; length matches the neural and paw time series.
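
A possible way to use these entries is to split the neural DataFrame into metadata columns and an activity matrix. The sketch below runs on a synthetic dictionary that mimics the documented keys (`df_neural`, `start_data_idx`, `time`); with the real data, the dictionary would presumably come from `pd.read_pickle` on one of the df_neural_paw_* files. The helper name and all values are hypothetical.

```python
import numpy as np
import pandas as pd


def split_neural_frame(bundle):
    """Split df_neural into metadata columns and a (rows, time) activity array."""
    df = bundle["df_neural"]
    start = bundle["start_data_idx"]
    meta = df.iloc[:, :start]
    activity = df.iloc[:, start:].to_numpy(dtype=float)
    if activity.shape[1] != len(bundle["time"]):
        raise ValueError("time vector length does not match the activity columns")
    return meta, activity


# Synthetic bundle: 7 metadata columns followed by 4 time points.
bundle = {
    "df_neural": pd.DataFrame(
        [
            [1, 1, 0, 0, "010125_F1", 10, 1, 0.1, 0.2, 0.3, 0.4],
            [1, 1, 0, 1, "010125_F1", 10, 1, 0.0, 0.5, 0.1, 0.2],
        ]
    ),
    "start_data_idx": 7,
    "time": np.linspace(0.0, 1.0, 4),
}

meta, activity = split_neural_frame(bundle)
print(meta.shape, activity.shape)
```

The paw DataFrames can be handled similarly with `start_data_idx_paw`, but they also carry a named `session` column, so a plain positional slice may need adjusting.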

Temporal Alignment and Trial Selection

Neural and paw position data are aligned to the ladder movement epoch.
Trials with missing/unavailable paw positions are removed.

TrainedModels Directory

The TrainedModels/ directory contains all trained TEA-Net models used for neuron type classification across brain regions and control conditions.

Folder Structure

Each subfolder follows the naming pattern:

{region}_shuffle_{boolean}

Where:

region ∈ {Str, m1, Ctx, Tha}
boolean ∈ {True, False}

The meaning of boolean is:

shuffle_False: Models trained on true labels (main results reported in the paper).
shuffle_True: Models trained under shuffle control, where neuron labels are randomly permuted to estimate chance-level performance.
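
To make the shuffle control concrete, the sketch below permutes a label vector with NumPy. It only illustrates the general idea of a label-permutation control; how labels were actually shuffled during TEA-Net training (for example, per fold or per initialization) is not described here, and the function name and seed are assumptions.

```python
import numpy as np


def shuffle_labels(labels, seed=0):
    """Return a randomly permuted copy of the labels (class counts are preserved)."""
    rng = np.random.default_rng(seed)
    return rng.permutation(np.asarray(labels))


labels = np.array([0, 0, 0, 1, 1, 1, 1, 0])  # e.g. 0 = A2A, 1 = D1R
shuffled = shuffle_labels(labels, seed=42)
print(shuffled)
```

A classifier trained on such permuted labels has no consistent mapping from activity to cell type to learn, so its held-out accuracy gives an estimate of chance-level performance.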

Example folders:

Str_shuffle_False
Ctx_shuffle_True
Tha_shuffle_False

Model Files Inside Each Folder

Within each {region}_shuffle_{boolean} folder are multiple trained TEA-Net models saved from different random initializations and cross-validation folds.

Each model file follows the naming convention:

{ML_model_name}_{region}_model_lr_{lr}_numsample_{num_samples}_{rand_init_iter}_{cv}_{batch_size}.model
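
File names that follow this convention can be parsed with a regular expression. The sketch below is an assumption-laden illustration: the example file name is invented, and the pattern assumes that `num_samples`, `rand_init_iter`, `cv`, and `batch_size` are integers and that `lr` contains no underscore.

```python
import re

MODEL_NAME_RE = re.compile(
    r"(?P<ml_model_name>.+)_(?P<region>Str|m1|Ctx|Tha)_model"
    r"_lr_(?P<lr>[^_]+)_numsample_(?P<num_samples>\d+)"
    r"_(?P<rand_init_iter>\d+)_(?P<cv>\d+)_(?P<batch_size>\d+)\.model"
)


def parse_model_filename(filename):
    """Split a model file name into its fields; raise ValueError if it does not match."""
    match = MODEL_NAME_RE.fullmatch(filename)
    if match is None:
        raise ValueError(f"unexpected model file name: {filename}")
    fields = match.groupdict()
    fields["lr"] = float(fields["lr"])
    for key in ("num_samples", "rand_init_iter", "cv", "batch_size"):
        fields[key] = int(fields[key])
    return fields


# Hypothetical file name, used only to exercise the parser.
info = parse_model_filename("TEAnet_Str_model_lr_0.001_numsample_100_0_1_32.model")
print(info)
```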