Data from: The music of silence. Part I: Responses to musical imagery encode melodic expectations and acoustics

Cite this dataset

Marion, Guilhem; Di Liberto, Giovanni; Shamma, Shihab (2021). Data from: The music of silence. Part I: Responses to musical imagery encode melodic expectations and acoustics [Dataset]. Dryad.


Musical imagery is the voluntary internal hearing of music in the mind without the need for physical action or external stimulation. Numerous studies have already revealed brain areas activated during imagery. However, it remains unclear to what extent imagined music responses preserve the detailed temporal dynamics of the acoustic stimulus envelope and, crucially, whether melodic expectations play any role in modulating responses to imagined music, as they prominently do during listening. These modulations are important as they reflect aspects of the human musical experience, such as its acquisition, engagement, and enjoyment. This study explored the nature of these modulations in imagined music based on EEG recordings from 21 professional musicians (6 females and 15 males). Regression analyses were conducted to demonstrate that imagined neural signals can be predicted accurately, similarly to the listening task, and were sufficiently robust to allow for accurate identification of the imagined musical piece from the EEG. In doing so, our results indicate that imagery and listening tasks elicited an overlapping but distinctive topography of neural responses to sound acoustics, which is in line with previous fMRI literature. Melodic expectation, however, evoked very similar frontal spatial activation in both conditions, suggesting that they are supported by the same underlying mechanisms. Finally, neural responses induced by imagery exhibited a specific transformation from the listening condition, which primarily included a relative delay and a polarity inversion of the response. This transformation demonstrates the top-down predictive nature of the expectation mechanisms arising during both listening and imagery.


  All participants were highly trained musicians: professionals or students at the Conservatoire National Supérieur de Musique (CNSM) in Paris. They were given the four stimuli as a one-page musical score and could practice them on the piano for about 35 minutes. The experimenter monitored this practice and verified that the pieces were executed without mistakes. After practice, participants were asked to sing the four pieces in the booth with the tactile metronome; the singing was recorded so that their accuracy could be checked offline.

  The experiment consisted of a single session with 88 trials: for each condition (listening and imagery), each of the four melodies was repeated 11 times. The trial order was shuffled across both musical pieces and conditions. In the listening condition, participants passively listened to the stimuli while reading the musical score. In the imagery condition, they imagined the melody in sync with the tactile metronome as precisely as they could.
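The trial structure above can be sketched as a short script. This is purely illustrative: the variable names and the shuffling procedure are assumptions, not taken from the experiment code, but the counts (2 conditions x 4 melodies x 11 repetitions = 88 trials) match the description.

```python
import random

# Illustrative sketch of the 88-trial session; names are hypothetical.
conditions = ["listening", "imagery"]
melodies = ["BWV 349", "BWV 291", "BWV 354", "BWV 271"]
n_repetitions = 11

# Build all condition x melody x repetition combinations, then shuffle
# across both pieces and conditions, as described above.
trials = [(cond, mel)
          for cond in conditions
          for mel in melodies
          for _ in range(n_repetitions)]
random.shuffle(trials)
```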

    Four melodies from the corpus of Bach chorales were selected for this study (BWV 349, BWV 291, BWV 354, BWV 271). All chorales follow similar compositional principles: the composer takes a well-known melody from a Lutheran hymn (the cantus firmus, usually written during the Renaissance era) and harmonizes three lower parts (alto, tenor, and bass) accompanying that melody in the soprano. Because our analysis uses only monophonic melodies, we used only these cantus firmi as stimuli, keeping the original keys. The chosen melodies follow the same grammatical structures and show very similar melodic and rhythmic patterns. Participants were asked to listen to and imagine these stimuli at 100 bpm (about 30 seconds each). The audio versions were synthesized using a Fender Rhodes simulation software (Neo-Soul Keys). The onset times and pitch values of the notes were extracted from MIDI files that were precisely aligned with the audio versions presented during the experiment.

Pre-processing consisted of band-pass filtering between 0.1 Hz and 30 Hz using Butterworth zero-phase filters, followed by down-sampling to 64 Hz. Data were re-referenced to the average of all 64 channels; bad channels were removed and interpolated using spherical spline interpolation. Stimulus onsets closer than 500 ms to a metronome beat were removed.
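The filtering and down-sampling steps can be sketched with SciPy. This is a minimal sketch on synthetic data, not the authors' pipeline: the original acquisition rate is not stated in this description, so 512 Hz is assumed purely for illustration, and `filtfilt` stands in for the zero-phase Butterworth filtering.

```python
import numpy as np
from scipy.signal import butter, filtfilt, resample

fs_orig = 512      # assumed acquisition rate (Hz); not stated in the dataset description
fs_down = 64       # target rate from the description (Hz)
n_channels = 64

# Synthetic stand-in for 10 s of raw EEG (channels x time).
rng = np.random.default_rng(0)
raw = rng.standard_normal((n_channels, 10 * fs_orig))

# Band-pass Butterworth filter, 0.1-30 Hz; filtfilt runs the filter
# forward and backward, giving a zero-phase response.
b, a = butter(2, [0.1, 30.0], btype="bandpass", fs=fs_orig)
filtered = filtfilt(b, a, raw, axis=1)

# Down-sample to 64 Hz.
n_out = filtered.shape[1] * fs_down // fs_orig
eeg = resample(filtered, n_out, axis=1)
```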

Usage notes

The single data file contains three variables:
    - downFs: the sampling rate (64 Hz)
    - eeg: the EEG data as a cell array (21*2*44*(1803*64)) -> (participants*conditions*trials*(time*electrodes)), with conditions ordered as (listening, imagery)
    - stim: the time representation of the expectation signal (note onsets can be extracted from it), with exactly the same structure as the EEG (21*2*44*(1803*1))
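The layout above can be illustrated with a synthetic stand-in built in NumPy. In practice the variables would come from loading the .mat file (e.g. with `scipy.io.loadmat`; the file name is not specified here), so the construction below only mimics the documented shapes for indexing purposes.

```python
import numpy as np

# Documented dimensions: 21 participants, 2 conditions, 44 trials,
# 1803 time samples, 64 electrodes.
downFs = 64
n_subj, n_cond, n_trials = 21, 2, 44
n_samples, n_chan = 1803, 64

# Cell-array-like containers: one (time x electrodes) matrix per
# participant x condition x trial; stim has the same layout with one column.
eeg = np.empty((n_subj, n_cond, n_trials), dtype=object)
stim = np.empty((n_subj, n_cond, n_trials), dtype=object)
for idx in np.ndindex(eeg.shape):
    eeg[idx] = np.zeros((n_samples, n_chan))
    stim[idx] = np.zeros((n_samples, 1))

# Condition axis: index 0 = listening, index 1 = imagery.
first_imagery_trial = eeg[0, 1, 0]   # participant 1, imagery, trial 1
```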

We also include a folder original_stim containing the audio presented during the experiment (the left channel is the metronome), the MIDI files, and the PDF scores.


European Research Council, Award: 787836

Agence Nationale de la Recherche, Award: ANR-17-EURE-0017

Agence Nationale de la Recherche, Award: ANR-10-IDEX-0001-02