Pronouns reactivate conceptual representations in human hippocampal neurons

Data files

Oct 01, 2024 version files 115.01 MB

Dijksterhuis-etal-2024-Science.zip
115.01 MB
DisciplineSpecificMetadata.json
70 B
README.md
5.98 KB

Abstract

During discourse comprehension, every new word adds to an evolving representation of meaning that accumulates over consecutive sentences and constrains the next words. To minimize repetition and utterance length, languages use pronouns, like the word ‘she’, to refer to nouns and phrases that were previously introduced. It has been suggested that language comprehension requires that pronouns activate the same neuronal representations as the nouns themselves. We recorded individual neurons in the human hippocampus during a reading task. Brain-imaging studies have gained insight into the brain regions that activate during sentence and discourse comprehension. However, the resolution of these methods does not suffice to track the neuronal assemblies that encode individual concepts in the human brain during reading. It has become possible to directly record the activity of single neurons in patients who are implanted with electrodes to locate the source of their epilepsy. These studies demonstrated the existence of ‘concept cells’ in the medial temporal lobe. Concept cells have an invariant and multimodal selective response to a concept. They contribute to the representation of meaning because they not only activate when the participant sees a picture of a specific individual for example, but also when the participant hears or reads the name of this person, or recalls this individual from memory. We hypothesized that monitoring the activity of concept cells during reading could provide insight into the dynamics of semantic representations during language comprehension. We found that cells that were selective to a particular noun were later reactivated by pronouns that refer to the cells’ preferred noun. These results imply that concept cells contribute to a rapid and dynamic semantic memory network that is recruited during language comprehension.

README: Pronouns reactivate conceptual representations in human hippocampal neurons

https://doi.org/10.5061/dryad.0zpc86768

This README introduces the folder contents and the main script, explains how to run it, and describes the fields of the data matrices that the script loads.

File names: Dijksterhuis-etal-2024-Science.zip and DisciplineSpecificMetadata.json

Code/Software

The scripts are written in MATLAB and were tested in versions R2020a, R2023a, and R2024a.

The main folder contains:

Reproduce_Figures.m
1. This is the script that loads all relevant data and reproduces the main figures of the manuscript.
Dependencies
1. NeuralynxMatlabImportExport_v411 (see https://www.urut.ch/new/serendipity/index.php?/pages/nlxtomatlab.html)
2. Robust_Statistical_Toolbox-master. This is a toolbox needed for statistical tests (see https://github.com/CPernet/Robust_Statistical_Toolbox.git)
3. The other scripts in this folder were written by the authors.

Data

ExampleCells
In the paper, we show data from a few example neurons. In this folder, we include the data from these neurons, so that the same figures as in the manuscript can be reproduced.
The meaning of the three data matrices is explained below.

SentenceTaskData.mat
- Population
- Every entry in this struct represents a multi-unit or single unit recorded during the reading task. All units come from the hippocampus, and each has a maximum response that exceeds its baseline by more than 0.5 Hz.
- Info = [patient number, session number, current electrode number, current channel number, cluster number, single (2) or multi-unit (1), index number into an all-channels vector]
- Spike4zeta = [n x 1] where n is the total number of spikes. Timestamps for all spikes of this unit, in milliseconds relative to the start of the recording session.
- wordPSTH = [bins x words], PSTH per word over time. PSTHs were constructed using a bin width of 0.01 s, from -0.5 to 2 s relative to word onset.
- SpontRate = spontaneous spike rate.
- Nounraster = [nouns x bins] number of spikes per time bin (0.01 s bins) for each noun.
- Noundets = [nouns x 1] which noun was shown, 1, 2, or 3? Number 1 and 2 were of the same sex and number 3 was the opposite sex.
- Pronounraster = [pronouns x bins] number of spikes per time bin for each pronoun.
- Pronounbeh = [pronouns x 1], did the pronoun come from an error (0) or success (1) trial?
- NounStats = this is the p-value from a Poisson-based permutation test (see materials and methods from the corresponding article).
- tuningM = the mean evoked response in Hertz in the window that gave max discriminability between the nouns, driven by the noun that gave the max response in that window
- TunedNoun = which noun was this unit tuned to?
- Noun_PostHoc = we used a GLM to test if the maximum response to the preferred Noun (coming from the time window that gave the maximum response) was higher than those of the other nouns. Noun_PostHoc is the p-value from glm.coefTest (tests that the coefficient is not zero).
- NoundetsTuned = [nouns x 1] here we recode the nouns into 1 (this is the preferred noun) or 0 (this is a non-preferred noun).
- Noun_TunedNotTuned_MeanPSTH = this is the difference in the mean PSTH between all preferred nouns vs. all non-preferred nouns.
- Noun_MeanPSTH_Tuned_Odd = we split all noun responses by whether they came from odd- or even-numbered trials. This variable contains the mean PSTH over all preferred nouns vs. all non-preferred nouns, using only the nouns from odd-numbered trials.
- pronoundetsShift = [pronouns x 1] after defining which noun is the preferred noun of the unit, we classify the pronouns as follows: 1 = the pronoun refers to the preferred noun; 2 = the preferred noun is not present and the pronoun refers to a non-preferred noun; 3 = the preferred noun is present, but not referred to by the pronoun; 4 = both nouns are of the same sex and the pronoun can refer to either or both of them (ambiguous condition).
- pronoundetsSex = [pronouns x 1] here we recode the pronouns by the gender of their referent: 1 = the pronoun refers to the preferred noun; 2 = the pronoun refers to the non-preferred noun with the same gender as the preferred noun; 3 = the pronoun refers to the noun with the opposite gender; 8 = ambiguous.
- pronoundetsTuned = [pronouns x 1] here we recode the pronouns by whether they referred to the preferred noun or not: 1 = the pronoun refers to the preferred noun; 0 = the pronoun refers to a non-preferred noun; 8 = ambiguous.
- Pronoun_Shift_MeanPSTH = [4 x bins] these are the mean PSTH per the condition of the pronoundetsShift variable.
- SentenceNoundets = [trial x 2] this variable gives the identity of each noun (the first and second one) per sentence/trial.
- SentencePronoundets = [trial x 2] the first column indicates which noun identity (1, 2, or 3) the pronoun refers to, and the second column states whether this was the first or second noun in the sentence/trial.
- SentenceLength = gives the number of words per sentence.
- TuningPresent = [trial x 1] states whether the preferred noun was present per sentence.
- SentenceBeh = [trial x 1] logical, indicates whether the patient was incorrect (0) or correct (1) for each trial.
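
The binning described for wordPSTH (0.01 s bins from -0.5 to 2 s around word onset, with spike timestamps in milliseconds as in Spike4zeta) can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the authors' MATLAB code. The function name `word_psth`, the `word_onsets_ms` argument (word onset times are not described as part of this dataset), and the conversion of counts to Hz are assumptions.

```python
def word_psth(spike_times_ms, word_onsets_ms,
              bin_s=0.01, t_start_s=-0.5, t_stop_s=2.0):
    """Return a [bins x words] nested list of firing rates.

    Assumptions (not stated in the README): word onsets are given in the
    same millisecond clock as the spikes, and counts are converted to Hz.
    """
    n_bins = round((t_stop_s - t_start_s) / bin_s)   # 250 bins by default
    bin_ms = bin_s * 1000.0
    start_ms = t_start_s * 1000.0
    psth = [[0.0] * len(word_onsets_ms) for _ in range(n_bins)]
    for w, onset in enumerate(word_onsets_ms):
        for t in spike_times_ms:
            # Bin index of this spike relative to the start of the window
            idx = int((t - onset - start_ms) // bin_ms)
            if 0 <= idx < n_bins:
                psth[idx][w] += 1.0 / bin_s          # one spike per bin -> Hz
    return psth


# Hypothetical toy data: four spikes and two word onsets (all in ms)
spikes = [1000.0, 1005.0, 1490.0, 5000.0]
onsets = [1000.0, 1500.0]
psth = word_psth(spikes, onsets)
```

Row 50 of the result corresponds to the bin that starts at word onset, because the window begins 0.5 s (50 bins) before onset.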
SentenceTrialInfo.mat
- perccorrect
- The rows represent the patient number
- The vector gives the mean percentage of correct trials per session that this patient did.
ConCell.mat
- concell
- this variable indicates per selected cell whether it’s a concept cell (1) or not (determined elsewhere, see methods)
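
The four pronoundetsShift codes listed under SentenceTaskData.mat can be read as a small decision rule. The Python sketch below is only one reading of that description, not the authors' implementation. The function name `classify_pronoun`, its arguments, and the choice to check the ambiguous condition first are assumptions; how ambiguity is flagged in the actual data is not specified here, so it is passed in explicitly.

```python
def classify_pronoun(preferred, sentence_nouns, referent, ambiguous):
    """Return a pronoundetsShift-style code (1 to 4) for one pronoun.

    preferred      : identity (1, 2, or 3) of the unit's preferred noun
    sentence_nouns : identities of the two nouns in the sentence
    referent       : identity of the noun the pronoun refers to
    ambiguous      : True if the pronoun could refer to either noun
                     (hypothetical flag; checked first by assumption)
    """
    if ambiguous:
        return 4          # both nouns share a sex, referent is unclear
    if referent == preferred:
        return 1          # pronoun refers to the preferred noun
    if preferred in sentence_nouns:
        return 3          # preferred noun present but not referred to
    return 2              # preferred noun absent from the sentence


# Hypothetical examples for a unit whose preferred noun is noun 1
codes = [
    classify_pronoun(1, (1, 3), 1, False),
    classify_pronoun(1, (2, 3), 3, False),
    classify_pronoun(1, (1, 3), 3, False),
    classify_pronoun(1, (1, 2), 1, True),
]
```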

Methods

We recorded electrophysiological data from micro-wires implanted chronically in the hippocampus of epilepsy patients during a reading task. In total, we recorded from 392 micro-wire electrodes located in the hippocampus during 49 sessions. The signal from the micro-wires was amplified using impedance-converting head-stages placed on the head of the patient (Neuralynx ‘HS-9’/Blackrock ‘Cabrio’) and it was either recorded with a 64-channel Neuralynx ATLAS system (32 kHz sampling rate) or with a 128-channel Blackrock NeuroPort Biopotential Signal Processing System (30 kHz sampling rate). Digital filters (Neuralynx: high-pass 0.1-1 Hz, low-pass 9000 Hz; Blackrock: high-pass 0.3 Hz, low-pass 7500 Hz) were applied after sampling. We used a semi-automated algorithm to determine whether to digitally re-reference the raw signal to a micro-wire from the same bundle. The algorithm attempted to minimise RMS noise levels while also avoiding re-referencing to a micro-wire that exhibited spiking. The choice of reference was made by the experimenter (D.D.). After re-referencing, the data from the screening and corresponding reading session were concatenated and spikes were detected and sorted using semi-automatic methods, as described previously. In short, the raw data was band-pass filtered between 300 and 1500 Hz, and an automatic amplitude threshold was applied to detect threshold crossings (usually ~6 times the median absolute deviation of the filtered data time-series). The raw data was then re-filtered between 500 and 3000 Hz and spike waveforms were extracted around the threshold crossings. The spike waveforms were clustered using a wavelet transform and a previously described algorithm using WaveClus 3. Clusters were visually inspected by D.D., who merged, split, or excluded clusters depending on their waveform, signal-to-noise ratio, and inter-spike intervals.
We evaluated the quality of isolation of single units by computing the percentage of inter-spike intervals (ISIs) smaller than 3 ms (ISI violations), which was 0.34% (compared to 1.7% for multi-units), and the SNR, which was 16.6 on average (7.7 for multi-units). In total, 307 hippocampal units showed a maximum response higher than their baseline and were included in the analysis and in this dataset.
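
The amplitude threshold described above (roughly 6 times the median absolute deviation of the filtered signal) can be sketched as follows. This is a simplified Python illustration, not the WaveClus implementation. The function names, the use of the plain median absolute deviation without any scaling constant, and the choice to detect only upward crossings are assumptions.

```python
from statistics import median


def mad_threshold(filtered, k=6.0):
    """Return k times the median absolute deviation of the signal."""
    centre = median(filtered)
    mad = median([abs(x - centre) for x in filtered])
    return k * mad


def threshold_crossings(filtered, thr):
    """Return sample indices where the signal rises to or above thr.

    Only upward (positive-going) crossings are detected here; the spike
    polarity used in the actual pipeline is an assumption.
    """
    return [i for i in range(1, len(filtered))
            if filtered[i - 1] < thr <= filtered[i]]


# Hypothetical toy signal: low-amplitude noise with one large deflection
signal = [0, 1, -1, 0, 1, -1, 0, 10, 0, -1, 1, 0]
thr = mad_threshold(signal)
crossings = threshold_crossings(signal, thr)
```

A median-based threshold is less inflated by the spikes themselves than one based on the standard deviation, which is a common reason for choosing it.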
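
As an illustration of the ISI-violation measure above, the following Python sketch computes the fraction of inter-spike intervals shorter than 3 ms from a list of spike timestamps in milliseconds. The function name and the handling of units with fewer than two spikes are assumptions.

```python
def isi_violation_fraction(spike_times_ms, refractory_ms=3.0):
    """Return the fraction of inter-spike intervals below refractory_ms."""
    times = sorted(spike_times_ms)
    isis = [b - a for a, b in zip(times, times[1:])]
    if not isis:
        return 0.0        # assumption: no intervals means no violations
    violations = sum(1 for d in isis if d < refractory_ms)
    return violations / len(isis)


# Hypothetical spike train (ms): intervals are 2, 8, 10, 1.5, and 18.5 ms
fraction = isi_violation_fraction([0.0, 2.0, 10.0, 20.0, 21.5, 40.0])
```

Multiplying the result by 100 gives a percentage comparable in form to the values reported above.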

Funding

Human Brain Project : 650003
KNAW NWA : 240-846401, StartImpuls 2017
Dutch Research Council : 17619, Crossover 'INTENSE'
Ministry of Education Culture and Science : DBI2, Gravitation program
ERC : 101052963, Grant 'NUMEROUS'
ERC : 647954, Grant 'Code4Memory'
H2020 Research and Innovation : 899287, Programme 'NeuraViper'