
Examining phoneme, syllable, and word level representations in continuous speech processing

Poster C87 in Poster Session C, Wednesday, October 25, 10:15 am - 12:00 pm CEST, Espace Vieux-Port

Anne Marie Crinnion1, Christian Brodbeck1; 1University of Connecticut

How listeners process acoustic input and access meaning from that input remains a central question in the field of speech perception. Hierarchical models of language posit that representations are tracked across different linguistic levels, with distinct processing at different timescales (i.e., linguistic units: phonemes, syllables, words, etc.; e.g., Kuperberg & Jaeger, 2016; Martin, 2016). Recent work on continuous speech processing has focused on phoneme-level information processing through surprisal effects, which reflect neural sensitivity to how unlikely a given phoneme is in context (e.g., Brodbeck et al., 2022; Heilbron et al., 2022). Work from our lab, however, has shown that listeners update representations at phoneme- as well as word-sized timescales (Crinnion & Brodbeck, 2022). Observing distinct word and phoneme updates implies multiple linguistic levels of processing, supporting hierarchical models. Here we aim to better understand the timescales at which the brain updates representations of speech. Specifically, we ask (1) whether we find evidence for distinct phoneme, syllable, and word representations and (2) whether representations at each timescale reflect processing of broader language context, context-independent lexical properties, or both. We focus on syllables because some hierarchical models of speech emphasize syllable-level processing (e.g., Hickok & Poeppel, 2007), and there is contention around whether speech perception uses phonemes, syllables, or both as information units (e.g., Hickok, 2014; Kazanina et al., 2018). We hope to add new evidence to this debate by using an information-theoretic approach to understand the timescales at which the brain updates contextual representations.

To answer these questions, we used MEG data from Brodbeck et al. (2022), in which participants listened to continuous speech (an audiobook). We used an mTRF approach to model incremental speech processing with acoustic and linguistic predictors. Of interest were entropy and surprisal predictors calculated from a lexical 5-gram model; these information-theoretic predictors used phonemes, syllables, and words as units.

We find evidence for distinct updates at word and phoneme timescales, even when controlling for syllable-level representations, but crucially, we do not find evidence for distinct updates at the syllable timescale. Furthermore, using a Bayes factor analysis, we find evidence against syllable-level representations when controlling for phoneme-level processing. Additionally, as Brodbeck et al. (2022) previously showed for phonemes, we find evidence that word-level updates reflect both local (lexical frequency-based surprisal) and global (context-constrained entropy and surprisal) processing.

These results suggest a partially hierarchical model in which representations are updated continuously on multiple timescales. Evidence for two distinct levels of representation potentially suggests that phoneme-level updates reflect processes of lexical access, whereas word-level updates reflect semantic-level integration. We do not, however, find that listeners track every linguistic level, as we find no evidence for updates at the syllable timescale. Using information-theoretic measures and controlling for updates at multiple timescales provides a more comprehensive approach to understanding the timescales (and, arguably, linguistic levels) involved in continuous speech processing.
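For readers unfamiliar with the information-theoretic predictors described above, the following is a minimal sketch of how word-level surprisal and entropy could be computed at each word onset and turned into an impulse time series suitable for an mTRF analysis. It is illustrative only: the `lm` object and its `prob`/`vocab` interface are hypothetical stand-ins for a lexical 5-gram model, and this is not the authors' actual analysis pipeline.

```python
import numpy as np

def word_surprisal_and_entropy(context, word, lm):
    """Word-level surprisal and entropy (in bits) at a single word onset.

    context : tuple of up to four preceding words (for a 5-gram model)
    word    : the word that actually occurs next
    lm      : hypothetical 5-gram model exposing
              lm.prob(word, context) -> P(word | context)
              lm.vocab               -> iterable over the vocabulary
    """
    # Surprisal: how unexpected the observed word is given its context.
    surprisal = -np.log2(lm.prob(word, context))

    # Entropy: uncertainty about the upcoming word before it is heard.
    probs = np.array([lm.prob(w, context) for w in lm.vocab])
    probs = probs[probs > 0]                     # drop zero-probability entries
    entropy = -np.sum(probs * np.log2(probs))

    return surprisal, entropy

def impulse_predictor(onsets_sec, values, n_samples, sfreq):
    """Place scaled impulses at event onsets to form a continuous predictor."""
    x = np.zeros(n_samples)
    idx = np.round(np.asarray(onsets_sec) * sfreq).astype(int)
    x[idx] = values
    return x
```

Predictors of this kind, built separately with phonemes, syllables, and words as units, can then be entered jointly into an mTRF model so that each timescale is evaluated while controlling for the others.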

Topic Areas: Speech Perception
