Data from: The perception of caricatured emotion in voice

Cite this dataset

Whiting, Caroline M. et al. (2020). Data from: The perception of caricatured emotion in voice [Dataset]. Dryad.


Data for Whiting et al. (2020, Cognition): "The perception of caricatured emotion in voice". Raw behavioural data and sound stimuli.


Participants performed three behavioural tasks across three sessions held on separate days: paired dissimilarity ratings (sessions 1 and 2), followed by emotion ratings, arousal/valence ratings, and speeded emotion categorisation (session 3). In the dissimilarity ratings task, participants rated the perceived similarity of all within-speaker pairs of stimuli, without being told which stimulus features should drive their ratings. In each of the two sessions, participants rated the similarity between all stimuli from a single speaker (speaker order counterbalanced across participants). On each trial, they were presented with one of the 741 possible pairs of sounds and asked to rate how similar the two sounds were on a scale from “very similar” to “very dissimilar.” They could listen to the pair of stimuli as many times as necessary before giving a response. Participants were given 10 practice trials using a set of 10 vocalisations that were not included in the main experiment. The total duration of each session was approximately 120 minutes.
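The figure of 741 pairs is consistent with the 39 stimuli per speaker reported for each block: 39 choose 2 gives 741 unordered within-speaker pairs. A minimal sketch of that count follows; the stimulus labels are hypothetical placeholders, not the file names used in the dataset.

```python
from itertools import combinations
from math import comb

N_STIMULI = 39  # stimuli per speaker, as stated in the description

# Hypothetical labels for illustration only; the actual stimulus names
# are documented in Stimuli_description.txt.
stimuli = [f"stim_{i:02d}" for i in range(1, N_STIMULI + 1)]

# All unordered within-speaker pairs, excluding self-pairs.
pairs = list(combinations(stimuli, 2))

n_pairs = len(pairs)
assert n_pairs == comb(N_STIMULI, 2)
print(n_pairs)  # 741
```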

Participants performed the emotion ratings and speeded categorisation tasks in alternating blocks in the same (final) session. In the ratings task, participants rated each stimulus on arousal (low to high), valence (negative to positive), and emotional intensity for four emotions (anger/disgust/fear/pleasure, low to high). In all ratings tasks, data were coded on a scale of 0 (low) to 1 (high). In the categorisation task, participants were instructed to identify as quickly as possible the emotion expressed by each stimulus as anger, disgust, fear, or pleasure. The association between a particular emotion and a particular response button was randomised for each block. Before the session began, participants were given 10 practice trials for each of the rating and categorisation tasks. Participants were familiarised with the entire stimulus set before the first block for each speaker.

The emotion ratings and categorisation session consisted of 12 blocks: 6 blocks of each task (rating and categorisation). Each task was repeated three times for each of the two speakers (1 male, 1 female), and data were averaged across the three repetitions. One block contained all 39 stimuli for one speaker, and stimulus order was randomised for each participant. Speaker gender order and task order were pseudo-randomised for each participant, such that each gender and each task could appear at most twice in a row. The total duration of the session was approximately 120 minutes.
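One way to picture the block-order constraint is the sketch below. It builds the 12 blocks (2 tasks × 2 speakers × 3 repetitions) and reshuffles until no task and no speaker gender appears more than twice in a row. The use of rejection sampling, and all names in the code, are assumptions made for illustration; the description does not state how the original orders were generated.

```python
import random
from itertools import product

TASKS = ("rating", "categorisation")
SPEAKERS = ("male", "female")
N_REPETITIONS = 3  # each task repeated three times per speaker
MAX_RUN = 2        # a task or gender may appear at most twice in a row


def max_run_length(values):
    """Return the length of the longest run of identical consecutive items."""
    longest = current = 0
    previous = object()  # sentinel that equals no real value
    for value in values:
        current = current + 1 if value == previous else 1
        previous = value
        longest = max(longest, current)
    return longest


def make_block_order(rng):
    """Shuffle the 12 blocks until both run-length constraints hold.

    Rejection sampling is an assumed method, not the authors' documented one.
    """
    blocks = list(product(TASKS, SPEAKERS)) * N_REPETITIONS
    while True:
        rng.shuffle(blocks)
        tasks = [task for task, _ in blocks]
        speakers = [speaker for _, speaker in blocks]
        if max_run_length(tasks) <= MAX_RUN and max_run_length(speakers) <= MAX_RUN:
            return list(blocks)


order = make_block_order(random.Random(0))
for task, speaker in order:
    print(task, speaker)
```

Passing a seeded `random.Random` makes a given participant's order reproducible; a different seed per participant would give each a different valid order.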

Usage notes

All information is provided in RawData_description.txt and Stimuli_description.txt.


Biotechnology and Biological Sciences Research Council, Award: BB/M009742/1