
Data from: Intermediate acoustic-to-semantic representations link behavioural and neural responses to natural sounds

Data files

Feb 13, 2023 version, 171.90 GB

Abstract

Recognizing sounds requires the cerebral transformation of input waveforms into semantic representations. Although past research identified the superior temporal gyrus (STG) as a crucial cortical region, the computational fingerprint of these cerebral transformations remains poorly characterized. Here, we used a model-comparison framework to contrast the ability of acoustic, semantic (continuous and categorical), and sound-to-event deep neural network (DNN) representation models to predict perceived sound dissimilarity and 7 Tesla human auditory cortex fMRI responses. We confirm that spectrotemporal modulations predict early auditory cortex (Heschl’s gyrus; HG) responses, and that auditory dimensions (e.g., loudness, periodicity) predict STG responses and perceived dissimilarity. Sound-to-event DNNs predict HG responses similarly to acoustic models but, notably, outperform all competing models at predicting both STG responses and perceived dissimilarity. Our findings indicate that the STG supports intermediate acoustic-to-semantic sound representations that neither acoustic nor semantic models can account for. These representations are compositional in nature and relevant to behaviour.
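
The model-comparison logic described above can be illustrated with a minimal Python sketch (not the authors' code): each candidate model (acoustic, semantic, DNN-based) yields a representational dissimilarity matrix (RDM) over the sound set, which is then correlated with a measured behavioural or neural RDM. All variable names, array shapes, and the random stand-in data below are illustrative assumptions, not properties of this dataset.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_sounds, n_features = 20, 50  # hypothetical sizes, for illustration only

# Hypothetical feature spaces, one per candidate representation model.
model_features = {
    "acoustic": rng.normal(size=(n_sounds, n_features)),
    "semantic": rng.normal(size=(n_sounds, n_features)),
    "dnn": rng.normal(size=(n_sounds, n_features)),
}

# Stand-in for measured pairwise perceived dissimilarity
# (condensed upper-triangle form, as returned by pdist).
behavioural_rdm = rng.uniform(size=n_sounds * (n_sounds - 1) // 2)

# Compare each model's predicted dissimilarities against behaviour.
for name, feats in model_features.items():
    model_rdm = pdist(feats, metric="correlation")  # pairwise 1 - r
    rho, _ = spearmanr(model_rdm, behavioural_rdm)
    print(f"{name}: Spearman rho = {rho:.3f}")
```

With real data, the behavioural RDM would come from the dissimilarity judgements in this dataset and the neural RDMs from the 7 T fMRI responses; the model whose RDM correlates most strongly with the measured RDM is the best-supported account of that representation.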