Data from: Slow cortical dynamics generate context processing and novelty detection
Data files
Jun 25, 2025 version files 167.29 GB
- AC_data_list.csv (22.20 KB)
- M1_im1-32_missmatch.zip (28.67 GB)
- M10_im1-19_missmatch.zip (1.32 GB)
- M2_im1-8_missmatch.zip (6.25 GB)
- M226_im1-9_echo.zip (12.78 GB)
- M3_im1-24_missmatch.zip (19.19 GB)
- M4_im1-8_missmatch.zip (6.73 GB)
- M4264_im1-8_echo.zip (3.99 GB)
- M4265_im1-4_echo.zip (6.23 GB)
- M4266_im1-5_echo.zip (7.10 GB)
- M4371_im1-4_echo.zip (5.79 GB)
- M4372_im1-4_echo.zip (6.86 GB)
- M5_im1-10_missmatch.zip (11.03 GB)
- M6_im1-15_missmatch.zip (5.60 GB)
- M7_im1-9_missmatch.zip (9.71 GB)
- M8_im1-19_missmatch.zip (15.04 GB)
- M9_im1-22_missmatch.zip (20.98 GB)
- README.md (6.31 KB)
Apr 15, 2026 version files 199.39 GB
- AC_data_list.csv (22.20 KB)
- M1_im1-32_missmatch.zip (26.73 GB)
- M2_im1-8_missmatch.zip (6.25 GB)
- M226_im1-9_echo.zip (12.78 GB)
- M3_im1-24_missmatch.zip (19.19 GB)
- M4_im1-8_missmatch.zip (3.42 GB)
- M4264_im1-8_echo.zip (3.99 GB)
- M4265_im1-4_echo.zip (6.23 GB)
- M4266_im1-5_echo.zip (7.10 GB)
- M4371_im1-4_echo.zip (5.79 GB)
- M4372_im1-4_echo.zip (6.86 GB)
- M5_im1-10_missmatch.zip (10.21 GB)
- M6_im1-15_missmatch.zip (8.75 GB)
- M7_im1-9_missmatch.zip (8.13 GB)
- M8_im1-19_missmatch.zip (14.71 GB)
- M9_im1-22_missmatch.zip (19.74 GB)
- README.md (9.10 KB)
- RNN_test_data_2024_5_24_9h_42m2_cont_data.zip (4.50 GB)
- RNN_test_data_2024_5_24_9h_42m2_ob_data.zip (34.91 GB)
- RNN_test_data_2024_5_24_9h_42m2_params.npy (97.55 MB)
Abstract
The cortex amplifies responses to novel stimuli, compared to those elicited by redundant stimuli—a function key to efficiently processing sensory information and building predictive models of the environment. Novelty detection is measured by the “Mismatch Negativity” (MMN) signal, the reduction of which represents the best functional biomarker of schizophrenia. To better understand the circuit mechanisms of novelty detection, we used an auditory “oddball” paradigm and two-photon calcium imaging to measure responses to simple and complex stimuli in neuronal populations across the mouse auditory cortex. Stimulus statistics and complexity generated differences in neural response profiles across contexts and auditory cortical subregions. At the population level, neuronal ensembles separately and reliably encoded basic auditory features, as well as temporal context. Interestingly, stimuli-evoked responses were particularly long-lasting, persisting after the stimuli ended and affecting responses to future stimuli. These slow network dynamics encoded stimulus history and temporal context, generating novelty detection. Recurrent neural network models trained on the oddball task exhibited slow network dynamics and recapitulated the biological data, including context selectivity, MMN, and stimulus-specific adaptation. We conclude that the slow dynamics of recurrent cortical networks underlies temporal processing of stimuli, a canonical computation that gives rise to context-specific encoding and novelty detection.
https://doi.org/10.5061/dryad.xsj3tx9q6
Description of the data and file structure
Calcium imaging data:
Each .zip file contains all datasets from a single mouse. Files for mice M1-M9 ("missmatch") contain the oddball data used for Figures 1, 2, and 3A-3F. Files for mice M226, M4264, M4265, M4266, M4371, and M4372 ("echo") contain the variable-ISI experiments used for Figures 3H-3J.
Each dataset contains processed calcium imaging data in files ending with "_results_cnmf_sort.mat", along with stimulus and imaging information in files ending with "_processed_data.mat". Finally, the "AC_data_list.csv" file contains additional information about datasets and mice.
Extended details about the experimental methods and analysis are described in the attached publication.
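The pairing of results and stimulus files described above can be sketched in Python as follows. This is a minimal illustration that assumes the two files of a dataset share everything before the suffix; the example file names are hypothetical and are not taken from the archives.

```python
from collections import defaultdict

# Suffixes as given in the description above.
RESULTS_SUFFIX = "_results_cnmf_sort.mat"
STIM_SUFFIX = "_processed_data.mat"


def pair_dataset_files(filenames):
    """Group file names by dataset prefix and keep only complete pairs.

    Assumption: both files of one dataset share the same prefix.
    """
    found = defaultdict(dict)
    for name in filenames:
        for kind, suffix in (("results", RESULTS_SUFFIX), ("stim", STIM_SUFFIX)):
            if name.endswith(suffix):
                found[name[: -len(suffix)]][kind] = name
    # Keep only prefixes for which both file types were found.
    return {prefix: files for prefix, files in sorted(found.items()) if len(files) == 2}


# Hypothetical file names, for illustration only.
example = [
    "M1_im1_results_cnmf_sort.mat",
    "M1_im1_processed_data.mat",
    "M1_im2_processed_data.mat",  # no matching results file, so it is skipped
]
pairs = pair_dataset_files(example)
```

In practice the list of names would come from the contents of an extracted .zip archive, for example via `os.listdir()`.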
"**_results_cnmfsort.mat*": files containing processed calcium imaging data. Data was motion corrected with modified Suite2P motion correction, and single neuron traces were demixed using the CaImAn pipeline. After demixing, individual cells were selected using signal-to-noise (SNR) metrics with a custom code (caiman_sorter). Below listed are key variables needed for data analysis
- est: raw CaImAn outputs (documentation https://caiman.readthedocs.io/en/latest/)
- A: neuronal spatial components corresponding to regions of interest (ROIs) in calcium imaging data
- C: neuronal temporal components
- YrA: temporal residual components. Raw calcium imaging traces are reconstructed with C + YrA
- b: spatial background components
- f: temporal background components
- proc: additional parameters after sorting caiman outputs from est via the caiman_sorter GUI
- idx_components: list of indexes containing the selected components from est.A, est.C, est.YrA, etc.
- idx_components_bad: list of discarded components
- SNR2_vals: SNR values computed with a more stable algorithm than the one in CaImAn, and used for selection
- deconv: contains deconvolved traces for all CaImAn-extracted components using one of the three algorithms (smooth_dfdt (default), foopsi, MCMC)
- smooth_dfdt: simple algorithm that uses the smoothed, rectified first derivative to infer the neuronal firing rate from the raw calcium imaging traces
- S: trace of deconvolved firing rate proxy data
- ops: contains parameters used for CaImAn and postprocessing
- init_params_caiman: contains CaImAn parameters
- eval_params2: parameters used for cell selection in the caiman_sorter GUI
- deconv: contains deconvolution parameters
- smooth_dfdt.params.gauss_kernel_sigma: standard deviation of gaussian smoothing kernel during deconvolution
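As a rough illustration of two of the items above, the sketch below rebuilds raw traces as C + YrA for the selected components and then applies a smoothed, rectified first derivative. It is not the caiman_sorter implementation. It assumes arrays shaped components × frames, a Gaussian width given in frames, and it leaves the index base of idx_components as an explicit option because the README does not state it.

```python
import numpy as np


def raw_traces(C, YrA, idx_components=None, one_based=False):
    """Reconstruct raw traces as C + YrA, optionally keeping selected components.

    Assumptions: arrays are (components x frames); set one_based=True if the
    stored indexes follow MATLAB's 1-based convention.
    """
    raw = np.asarray(C, dtype=float) + np.asarray(YrA, dtype=float)
    if idx_components is None:
        return raw
    idx = np.asarray(idx_components, dtype=int).ravel()
    if one_based:
        idx = idx - 1
    return raw[idx]


def smooth_dfdt(trace, sigma_frames):
    """Gaussian-smooth a trace, differentiate it, and rectify the result."""
    trace = np.asarray(trace, dtype=float)
    half = max(1, int(np.ceil(4 * sigma_frames)))  # kernel half-width in frames
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_frames) ** 2)
    kernel /= kernel.sum()
    # Pad with edge values so the output has the same length as the input.
    padded = np.pad(trace, half, mode="edge")
    smoothed = np.convolve(padded, kernel, mode="valid")
    dfdt = np.diff(smoothed, prepend=smoothed[0])
    return np.maximum(dfdt, 0.0)  # keep only rising phases


# Synthetic example: component 2 (1-based) carries a step increase.
C = np.vstack([np.zeros(40), np.r_[np.zeros(20), np.ones(20)]])
YrA = np.zeros_like(C)
selected = raw_traces(C, YrA, idx_components=[2], one_based=True)
rate_proxy = smooth_dfdt(selected[0], sigma_frames=2.0)
```

When the .mat files are read with h5py, array axes typically appear transposed relative to MATLAB, so the orientation should be checked before applying these functions.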
"**_processed_data.mat*": files containing input stimulus information, imaging parameters, and recorded behavior (locomotion).
- data: contains stimulus information data
- frame_data: imaging parameters
- volume_period: volume imaging period for each dataset (number of planes × frame period)
- frame_period: frame imaging period
- frame_times_mpl: times of each frame, for both single-plane and multiplane data
- stim_params: contains data on oddball parameters
- MMN_freq: index of frequencies used for oddball
- stim_duration: stimulus duration
- isi: interstimulus interval duration
- trials: number of trials in oddball
- start_freq: in Hz, the first and lowest frequency
- increase_factor: octave difference between frequencies
- num_freqs: number of total frequencies
- stim_times_frame: the time of each stimulus, as a frame index
- stim_times_volt: the time of each stimulus, in ms
- volt_data_binned: voltage data recorded with microscope containing raw stimulus times, locomotion, etc.
- frame_data: imaging parameters
- ops: parameters used for processing stimulus information data
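The relationships among the stimulus and imaging parameters listed above can be sketched as follows. This is an illustration under stated assumptions: that increase_factor is the spacing in octaves between neighbouring frequencies (so each step multiplies the frequency by 2 raised to increase_factor), and that frame_period is expressed in the same time unit as the stimulus times. Neither the units nor the example values are taken from the data files.

```python
def tone_frequencies(start_freq, increase_factor, num_freqs):
    """Frequencies in Hz, spaced by `increase_factor` octaves (assumed meaning)."""
    return [start_freq * 2.0 ** (k * increase_factor) for k in range(num_freqs)]


def volume_period(frame_period, num_planes):
    """Volume period = number of planes x frame period, as described above."""
    return num_planes * frame_period


def ms_to_volume_index(stim_times_ms, volume_period_ms):
    """Map stimulus onset times (ms) to the index of the volume being acquired."""
    return [int(t // volume_period_ms) for t in stim_times_ms]


# Illustrative values only.
freqs = tone_frequencies(start_freq=2000.0, increase_factor=0.5, num_freqs=3)
vol_period_ms = volume_period(frame_period=30.0, num_planes=5)
onset_volumes = ms_to_volume_index([0.0, 149.9, 150.0, 451.0], vol_period_ms)
```

The stored stim_times_frame variable already provides frame indexes, so the conversion function is only needed when starting from the times in ms.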
"**_registration_cmnf.mat*": files containing spatial registration information to map the same cells recorded across datasets from the same field of view (FOV). These were not used in any of the analysis published, but are included here nevertheless. The FOV number for each dataset is listed in the "AC_data_list.csv" file, in the "FOV" column.
- A_list: cell containing all input spatial footprints "A" from corresponding datasets.
- fname_list: list of dataset names for the corresponding FOV
- reg_out: cell containing the outputs of registration
- idx1: combined "A" file across datasets where all unique spatial footprints have unique indexes.
- idx2: array containing registration information, where each row is a unique footprint, with its contents containing the indexes of those same footprints from the input datasets.
"AC_data_list.csv": file contains additional information about datasets and mice. "n/a" values in the "AC_data_list.csv" file correspond to "data not available" because it was not recorded.
All calcium imaging files are in ".mat" format and can be loaded in either MATLAB or Python.
- In MATLAB, use the "load()" function.
- In Python, use the "h5py" library with the "h5py.File()" function.
- In Python, the calcium imaging datasets can also be loaded using the custom "f_load_caim_data_mat()" function from the "slow_dynamics_protocol" library linked below.
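The Python loading route can be sketched as below. The recommendation to use h5py suggests the files are stored in the HDF5-based MATLAB v7.3 format; the version check and the scipy fallback for older files are assumptions added for illustration, and the helper names are hypothetical. The demo only writes a fake header to show the check and does not read real data.

```python
import os
import tempfile


def mat_file_version(path):
    """Return 'v7.3' for HDF5-based .mat files, 'v5' for older ones, else 'unknown'."""
    with open(path, "rb") as fh:
        header = fh.read(128)  # descriptive text at the start of a .mat file
    if header.startswith(b"MATLAB 7.3 MAT-file"):
        return "v7.3"
    if header.startswith(b"MATLAB 5.0 MAT-file"):
        return "v5"
    return "unknown"


def load_mat(path):
    """Open a .mat file with h5py (v7.3) or scipy (older formats)."""
    version = mat_file_version(path)
    if version == "v7.3":
        import h5py  # third-party; MATLAB structs appear as HDF5 groups
        return h5py.File(path, "r")
    if version == "v5":
        from scipy.io import loadmat  # third-party fallback, an assumption
        return loadmat(path, squeeze_me=True, struct_as_record=False)
    raise ValueError(f"{path} does not look like a MATLAB .mat file")


# Demo of the version check on a fake header (no real data involved).
with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "fake_v73.mat")
    with open(fake, "wb") as fh:
        fh.write(b"MATLAB 7.3 MAT-file, Platform: demo".ljust(128, b" "))
    detected = mat_file_version(fake)
```

With h5py, top-level variables such as "est", "proc", and "ops" would be accessed by key on the returned file object, and numeric arrays usually come back with their axes reversed relative to MATLAB.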
RNN data:
Two types of RNN test data are provided in the repository: RNNs tested with oddball inputs ("RNN_test_data_2024_5_24_9h_42m_ob_data.npy") and RNNs tested with control frequency inputs ("RNN_test_data_2024_5_24_9h_42m_cont_data.npy"). Each dataset contains outputs from three types of RNNs: those trained on the oddball task, those trained on the control frequency discrimination task, and untrained networks. The training parameters and test input data are provided in the "RNN_test_data_2024_5_24_9h_42m_params.npy" dataset. The RNN test data consist of lists of dictionaries, each corresponding to a separately trained network. Each dictionary contains the rates of the RNN units under the "rates" key.
- "_cont_data.npy" contains a list of RNN outputs tested with control inputs.
- list of dictionaries each corresponding to outputs from a separate RNN
- "rates": contains unit activity during test that corresponds to neuron firing
- "input": the inputs used for testing of RNN
- "target": the target used to estimate RNN performance
- "loss": used to estimate test performance
- "_ob_data.npy" contains a list of RNN outputs tested with oddball inputs.
- same format as for "cont_data.npy" above.
- "*_params.npy": dictionary file containing training parameters and information.
- "params_all": parameters used for training of RNN.
- "rnn_leg": legend containing the labels of RNNs ("oddball trained", "control trained", "untrained").
- "params_test": parameters used during testing of RNN.
- "ob_data": Inputs used during testing of oddball tested networks.
- "cont_data": inputs used during testing of control tested networks.
All RNN datasets are saved in ".npy" format and contain the test outputs of RNN unit activity.
- In Python, RNN files can be opened using the numpy library, specifically the "numpy.load()" function.
- In Python, RNN files can also be opened using the custom "f_load_rnn_test()" function from the "slow_dynamics_protocol" library linked below.
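A minimal sketch of the numpy route follows. Because the files hold lists of dictionaries rather than plain numeric arrays, `numpy.load()` needs `allow_pickle=True`, which should only be used on trusted files. The helper name is hypothetical, and the demo file, its array shapes, and its values are synthetic stand-ins, not the deposited data. The file listing shows the oddball and control test data as .zip archives, so they presumably need to be extracted first.

```python
import os
import tempfile

import numpy as np


def load_npy_object(path):
    """Load a .npy file that may hold pickled Python objects.

    Returns a dict or list for object arrays, or the array itself otherwise.
    """
    arr = np.load(path, allow_pickle=True)  # only for trusted files
    if arr.dtype == object:
        return arr.item() if arr.ndim == 0 else arr.tolist()
    return arr


# Synthetic demo: two "networks", each with a "rates" array and a "loss".
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_ob_data.npy")
    demo = [
        {"rates": np.zeros((10, 4)), "loss": 0.5},
        {"rates": np.ones((10, 4)), "loss": 0.25},
    ]
    np.save(demo_path, demo)
    networks = load_npy_object(demo_path)

rate_shapes = [net["rates"].shape for net in networks]
```

Each element of the returned list then exposes the keys described above, such as "rates", "input", "target", and "loss".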
"**.npy": The (recurrent neural network) RNN test outputs are saved as python .npy format. These contain the RNN unit activity
Other:
Extended details on experiments and analysis are provided in the attached manuscript, and accompanying code in the GitHub libraries linked below.
Code/software
Slow dynamics analysis pipeline (a pipeline for loading the calcium imaging and recurrent neural network (RNN) test data to reproduce the figures in the STAR Protocols paper):
https://github.com/shymkivy/slow_dynamics_protocol
Widefield mapping analysis:
https://github.com/shymkivy/AC_mapping_analysis
Caiman_sorter cell selection GUI:
https://github.com/shymkivy/caiman_sorter
Motion correction:
https://github.com/shymkivy/motion_corr_YS
Two photon data analysis:
https://github.com/shymkivy/AC_2p_analysis
Spatial registration of 2p datasets to widefield:
https://github.com/shymkivy/register_2p_to_wf
RNN model:
https://github.com/shymkivy/RNN
External pipelines:
Methods of data collection, preprocessing, and analysis are described in detail in the associated article.
Data were collected with two-photon calcium imaging in the auditory cortex of awake, head-fixed mice moving on a circular treadmill. Mice were presented with auditory oddball stimuli and the many-standards control variation. Raw data were motion corrected, and regions of interest (ROIs) corresponding to cells were demixed with CaImAn. The demixed data are provided here.
Changes after Jun 25, 2025:
Added data generated by recurrent neural networks (RNN) that is included in this article (Figure 4).
The added recurrent neural network (RNN) data are part of the published study, in which we used RNNs to model the activity seen in the mouse data. The RNNs were trained on the same tasks as the mice, and we then tested their performance. We analyzed RNN unit activity in the same way as the mouse data and reported the results in Figure 4 of the original study. We are adding these data now because we have submitted a "STAR protocol" paper for publication, which extends the original study and uses the RNN data extensively. The DOI for the second paper is not available yet, but we will add it in the future.
Removed the M10 dataset because its data quality was low and the deposit needed to fit within the 200 GB size limit.
