Learning and application of speaker-specific semantic models
Poster D19 in Poster Session D, Wednesday, October 25, 4:45 - 6:30 pm CEST, Espace Vieux-Port
Fabian Schneider1, Helen Blank1; 1University Medical Centre Hamburg-Eppendorf
We are better at understanding familiar than unfamiliar speakers, but the mechanism underlying this familiar-speaker benefit is still unclear. In this study, we combined computational models with behaviour to test whether listeners learn and apply speaker-specific semantic models to aid comprehension of incoming speech. To this end, we created a stimulus set of sixty vocoded auditory morphs between two words, each of which was semantically coherent with one of six semantic contexts derived from GloVe embeddings, yielding twenty words per context. Perception of these morphs as one word or the other was controlled in a validation experiment (N = 40). In the main experiment, a different sample of participants (N = 50) was shown faces of speakers, each matched with one of the six contexts, followed by binaural presentation of a morph and a two-alternative forced choice between the two original words. Speaker-specific feedback was given such that each speaker was consistently associated with one semantic context. Afterwards, participants were asked about their response strategy, i.e., whether they had responded based on what they considered to be the correct answer or purely based on what they had heard. We expected that, if listeners acquire and apply speaker-specific semantic models, participants would develop a bias towards reporting the option that was more coherent with the semantic space of the speaker. We found that only participants who reported having responded based on what they considered to be the correct answer for the current speaker, rather than what they had heard, showed a bias towards reporting the word that was more coherent with the general speaker-specific semantic context derived from GloVe. Using a free-energy approach, we computed time-resolved estimates of the idiosyncratic (i.e., aligned with individual responses) and general (i.e., aligned with GloVe) speaker-specific semantic spaces that participants should have learned. Here, we found that all participants, including those who had responded based on what they had heard, showed a strong bias towards reporting the word that was more coherent with their idiosyncratic speaker-specific semantic spaces. We verified this preference for idiosyncratic over general semantic spaces with embeddings obtained from GPT-3 and BERT. Further, idiosyncratic speaker-specific semantic spaces gravitated towards those derived from GloVe over time, but convergence was slower in participants who reported having responded based on what they had heard. This delay in convergence disrupted the word-context associations derived from GloVe and explained why these participants showed a bias towards reporting the word that was more coherent with the idiosyncratic speaker-specific semantic spaces, but not with those originally derived from GloVe. We conclude that humans learn speaker-specific semantic models that aid comprehension of incoming speech. These speaker-specific semantic models are idiosyncratic and may not correspond to general semantic spaces obtained from global word co-occurrence statistics. Our results therefore bear particularly on recently popularised data-driven approaches to semantics: they suggest that data-driven models such as GloVe, GPT-3, or BERT do not sufficiently capture the idiosyncrasy of individual semantic representations, and they highlight the need for fine-tuning approaches.
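The pairing of morph words with semantic contexts rests on embedding-based coherence scores. As a minimal illustration (an assumption for exposition, not the authors' analysis code), the sketch below scores how coherent a candidate word is with a context by taking the cosine similarity between its GloVe vector and the centroid of the context words; the file path, the context word list, and the morph pair are hypothetical.

    # Minimal sketch (assumed, not the authors' code): scoring how coherent each
    # endpoint of a word morph is with a speaker's semantic context, using the
    # centroid of GloVe vectors for the context words.
    # The file path, context words, and morph pair below are hypothetical.

    import numpy as np

    def load_glove(path):
        # Read plain-text GloVe vectors ("word v1 v2 ... vd" per line) into a dict.
        vectors = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                word, *values = line.rstrip().split(" ")
                vectors[word] = np.asarray(values, dtype=np.float32)
        return vectors

    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def context_coherence(word, context_words, vectors):
        # Cosine similarity between the word vector and the context centroid.
        centroid = np.mean([vectors[w] for w in context_words if w in vectors], axis=0)
        return cosine(vectors[word], centroid)

    vectors = load_glove("glove.6B.300d.txt")       # hypothetical path
    kitchen_context = ["oven", "spoon", "recipe", "boil", "flour"]
    for candidate in ("pan", "plan"):               # hypothetical morph endpoints
        print(candidate, context_coherence(candidate, kitchen_context, vectors))

In the study itself, such coherence would be assessed over the full set of twenty words per context; the same scoring logic carries over to GPT-3 or BERT embeddings by swapping out the vector source.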
Topic Areas: Meaning: Lexical Semantics, Computational Approaches