Presentation
Using deep neural networks and adaptive stimulus presentation for automated investigation of natural sound representations in auditory cortex
Poster E117 in Poster Session E, Thursday, October 26, 10:15 am - 12:00 pm CEST, Espace Vieux-Port
Kyle Rupp¹, Pedro Da Costa²,³, Fred Dick⁴, Robert Leech³, Taylor Abel¹; ¹University of Pittsburgh, ²Birkbeck, University of London, ³King's College London, ⁴University College London
Humans can efficiently and rapidly digest auditory scenes, segregating a single stream of superimposed sounds and sorting it into separate categories. However, the neural processes underlying this transformation from acoustic signals to semantic categories remain poorly understood. To investigate this, researchers often record neural responses while subjects listen to a stimulus set of natural sounds and then build encoding models that relate neural responses to stimulus features, such as spectrotemporal measures and category-level representations. One intriguing approach uses deep neural networks (DNNs) trained to categorize sounds: if an encoding model that takes DNN hidden-layer activations as input features (i.e., a DNN-derived encoding model) predicts neural responses accurately, this suggests that the machine learning model and human auditory cortex share similar representations. A cortical site that is most accurately predicted by a shallow DNN layer would suggest low-level acoustic representations, while a site best predicted by a deeper layer would suggest abstract, category-level representations. Despite the success of these approaches, one major limitation is the risk of undersampling the stimulus space when using a fixed set of stimuli. To address this, we propose a two-stage approach. In stage one (S1), we collect neural responses to a standard natural sounds stimulus set using stereoelectroencephalography (sEEG) and build DNN-derived encoding models across all DNN layers. For a given channel, the most accurate model is then applied to a very large stimulus library (hundreds of thousands of sounds), producing a stimulus space of predicted neural responses across this library. During stage two (S2), Bayesian optimization is used to adaptively select and present stimuli from this library, with neural responses analyzed in real time.
By prioritizing exploration over exploitation, we can force the model to sample across the stimulus space and investigate regions of high uncertainty, e.g., where observed neural responses differ from those predicted by the S1 encoding model. Here we show a proof of concept for this approach in one subject. After completing S1, three auditory cortex channels were selected for S2. The first channel, located in left posteromedial Heschl’s gyrus, was found to encode features of the modulation power spectrum (MPS), namely fast temporal and low spectral modulations. An MPS encoding model built from S2 data performed better than one built from S1 data, suggesting the S2 stimuli were better tuned to explore this feature encoding. The second channel, located in left superior temporal sulcus (STS), was speech-selective, with S2 providing additional corroborating stimuli. Lastly, a site in right STS provides the most compelling evidence for the utility of our approach. The S1 encoding model suggested speech selectivity, with large neural responses predicted for both singing and speech. However, S2 data revealed song selectivity, with observed responses to singing exceeding those to both speech and instrumental music. Without our having to design and run separate experiments, the method automatically picked stimuli that elucidated MPS encoding, speech selectivity, and song selectivity in three separate channels. These preliminary results suggest that this is a powerful method for interrogating a channel’s feature encoding in an automated way.
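As a rough illustration of the exploration-weighted selection step in S2, the sketch below fits a small Gaussian process to the responses observed so far and picks the next stimulus by an upper-confidence-bound score. It is not the implementation used in this work, and every specific in it is an assumption made for illustration: the one-dimensional stimulus coordinate, the RBF kernel, the zero-mean prior, the noise level, and the hypothetical names `gp_posterior`, `select_next`, and `kappa`. A larger `kappa` weights uncertainty more heavily, which is one simple way to prioritize exploration over exploitation.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel between two scalar stimulus coordinates."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def gp_posterior(x_obs, y_obs, x_new, noise=1e-2, length_scale=1.0):
    """Posterior mean and standard deviation of a zero-mean GP at x_new."""
    K = [[rbf(a, b, length_scale) + (noise if i == j else 0.0)
          for j, b in enumerate(x_obs)]
         for i, a in enumerate(x_obs)]
    k = [rbf(a, x_new, length_scale) for a in x_obs]
    alpha = solve(K, y_obs)  # K^-1 y
    v = solve(K, k)          # K^-1 k
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    var = max(1.0 - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
    return mean, math.sqrt(var)


def select_next(library_coords, presented, responses, kappa=3.0):
    """Return the index of the unpresented stimulus with the highest
    upper-confidence-bound score (mean + kappa * std).

    library_coords: assumed 1-D coordinate per library stimulus
    presented:      indices of stimuli already played
    responses:      observed neural responses, aligned with `presented`
    """
    x_obs = [library_coords[i] for i in presented]
    already = set(presented)
    best_idx, best_score = None, -math.inf
    for idx, x in enumerate(library_coords):
        if idx in already:
            continue
        mean, std = gp_posterior(x_obs, responses, x)
        score = mean + kappa * std
        if score > best_score:
            best_idx, best_score = idx, score
    return best_idx


# Toy example: two stimuli already presented, three candidates remaining.
library_coords = [0.0, 1.0, 0.1, 0.5, 5.0]
presented = [0, 1]
responses = [1.0, 2.0]
explore_idx = select_next(library_coords, presented, responses, kappa=5.0)
exploit_idx = select_next(library_coords, presented, responses, kappa=0.0)
```

In this toy example, a large `kappa` favors the candidate far from anything presented so far, whereas `kappa=0` favors the candidate with the highest predicted response. A real-time system would presumably operate on a much higher-dimensional stimulus space and a far larger library, where a dedicated Gaussian process library would be more appropriate than this hand-rolled solver.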
Topic Areas: Methods, Speech Perception