Investigating semantic representations with varying context during language comprehension
Poster A43 in Poster Session A - Sandbox Series, Thursday, October 24, 10:00 - 11:30 am, Great Hall 4
Anuja Negi1,2, Fatma Deniz1,2; 1Institute of Software Engineering and Theoretical Computer Science, Technische Universität Berlin, Berlin, Germany, 2Bernstein Center for Computational Neuroscience Berlin, Berlin, Germany
Semantic representations are affected by the amount of context. Increasing the amount of context in the stimulus increases the representation of semantic information across the human cerebral cortex (Deniz, Tseng, et al., 2023). In that work, Deniz, Tseng, et al. used static embeddings to capture the semantic properties of individual words. However, static embeddings do not account for different word senses or contexts. In this study, we extend their work by comprehensively comparing voxelwise encoding models based on both static and contextual embeddings. We used functional magnetic resonance imaging (fMRI) to record human brain responses while each participant read words under four conditions that varied in the amount of context: narratives, isolated sentences, blocks of semantically similar words, and isolated words. Stimuli for all four conditions were generated from 11 spoken stories from The Moth Radio Hour (previously used by Huth et al., 2016). We then used a voxelwise encoding modeling (VM) approach to compare how different semantic models integrate contextual semantic information across the four conditions. We first extracted low-level linguistic features and several semantic embeddings (static and contextual) from the stimulus words in each condition separately. The low-level linguistic features were phoneme count, number of words, number of letters, and word-length variation per TR. Static embeddings were obtained from traditional static word vectors such as GloVe. Contextual semantic embeddings were extracted from layer-by-layer representations of large language models such as BERT, GPT, and Llama. Banded ridge regression (Nunez-Elizalde et al., 2019) was used to determine how each embedding is represented in each voxel (Wu et al., 2006; Naselaris et al., 2011). Prediction accuracy was quantified by calculating the Pearson correlation coefficient (r) between predicted and recorded BOLD responses.
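As a minimal sketch of the per-TR feature construction described above, the hypothetical helper below averages the static vectors of all words presented within each TR window. This is a simplification for illustration only: published encoding-model pipelines typically resample word-rate features with a Lanczos-filtered interpolation rather than a plain within-window average, and the word timings and vector dimensionality here are assumed.

```python
import numpy as np

def embeddings_per_tr(word_times, word_vectors, tr=2.0, n_trs=10):
    """Average the static vectors of all words whose onset falls inside
    each TR window. Hypothetical simplification: real pipelines usually
    use Lanczos-filtered downsampling instead of a window mean."""
    out = np.zeros((n_trs, word_vectors.shape[1]))
    for i in range(n_trs):
        in_window = (word_times >= i * tr) & (word_times < (i + 1) * tr)
        if in_window.any():
            out[i] = word_vectors[in_window].mean(axis=0)
    return out
```

A TR with no word onsets is left as a zero vector; any real pipeline would make an explicit choice about such gaps.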
To estimate prediction accuracy, separate datasets were used for model estimation and model evaluation. A separate voxelwise encoding model was fit for each voxel, participant, and stimulus condition. Our preliminary findings indicate that both static and contextual embeddings predicted brain responses more accurately when the stimulus contained more context. A comparison between static and contextual embeddings suggests that the difference in prediction accuracy is more pronounced for stimuli with greater context.
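The estimation/evaluation split can be sketched as follows. For brevity this uses plain single-band ridge regression rather than the banded ridge of the actual study (banded ridge additionally fits one regularization strength per feature space, e.g. via the himalaya package of Nunez-Elizalde et al.); the array shapes and the noise level are illustrative assumptions.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X'X + alpha*I)^-1 X'Y.
    Simplification of banded ridge, which would fit a separate
    alpha for each feature space (embedding)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def voxelwise_r(X_test, Y_test, W):
    """Pearson r between predicted and recorded responses, per voxel."""
    Y_pred = X_test @ W
    Yp = Y_pred - Y_pred.mean(axis=0)
    Yt = Y_test - Y_test.mean(axis=0)
    denom = np.linalg.norm(Yp, axis=0) * np.linalg.norm(Yt, axis=0)
    return (Yp * Yt).sum(axis=0) / denom

# Illustrative data: 200 "TRs" of a 10-dim feature space, 5 "voxels".
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
Y = X @ rng.standard_normal((10, 5)) + 0.1 * rng.standard_normal((200, 5))
W = fit_ridge(X[:150], Y[:150], alpha=1.0)   # estimation set
r = voxelwise_r(X[150:], Y[150:], W)         # held-out evaluation set
```

Fitting on one portion of the data and computing r only on held-out responses is what keeps the per-voxel prediction accuracies unbiased.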
Topic Areas: Meaning: Lexical Semantics