
Bridging hierarchical speech processing with perceptuo-motor theory

Poster A77 in Poster Session A, Tuesday, October 24, 10:15 am - 12:00 pm CEST, Espace Vieux-Port
This poster is part of the Sandbox Series.

Nathan Trouvain1,2,3, Xavier Hinaut1,2,3; 1Centre Inria de l'Université de Bordeaux, 2LaBRI, CNRS UMR 5800, 3Institut des Maladies Neurodégénératives, CNRS UMR 5293

Perception and production of complex vocal gesture sequences are a typical trait of many animal species, from songbirds to humans. This communication system involves many processes operating at different scales: auditory perception and motor control of the vocal tract to sense and produce vocal gestures; segmentation of the continuous audio stream and its categorization into relevant language units (e.g., syllables); and hierarchical organization of unit sequences, used for both comprehension and production. A vast literature covers several of these stages of speech and language processing. Several modeling works address specific developmental, computational, and neurophysiological aspects of the language emergence problem, but the whole picture remains obscure. In particular, it is unclear how and when processes operating at different timescales share information about current stimuli and context, and the type of information exchanged is largely unknown. Brain oscillations at different frequencies may mediate exchanges between higher- and lower-order processes and would be modulated by both bottom-up and top-down information flows, tracking and segmenting relevant speech units in one case and relying on contextual information to disambiguate the stimuli in the other (Ghitza, 2011; Giraud & Poeppel, 2012; Poeppel & Assaneo, 2020; Nabé et al., 2021). Most models, however, present these effects from a purely perceptual perspective and embed them into a predictive coding framework that sets aside the possibly sensorimotor nature of speech comprehension processes and the importance of speech production in that matter (Liberman & Mattingly, 1985; Schwartz et al., 2012; Moulin-Frier et al., 2015). We thus propose to extend models of speech segmentation and perception built upon brain oscillations (Nabé et al., 2021; ten Oever & Martin, 2021) to explore different hypotheses on the nature of information flows in the speech and language hierarchy. We want to weigh the benefits of predictive coding in constructing relevant hierarchical language representations against those of a more sensorimotor approach, in which brain oscillations would subserve not just perception but also production, and in which relevant segmented representations would be constructed from speech using motor information. Moreover, we would like to explore to what extent these oscillation-mediated top-down and bottom-up processes may be chained to build deep hierarchical representations, in a way similar to Nabé et al. (2021), that could serve both perception and production. This model could then be used to make predictions about speech perception in noisy environments and would provide an interesting framework for investigating developmental aspects of speech acquisition.
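
For illustration only, the theta-tracking intuition behind such oscillation-based segmentation models can be sketched in a few lines: a theta-band (roughly 4-8 Hz) oscillation locked to the speech amplitude envelope yields candidate syllable boundaries at its troughs. The Python toy below is a minimal sketch in the spirit of the models cited above, not the authors' proposed model; the function name, band limits, and envelope rate are all illustrative assumptions.

# Hypothetical sketch of theta-based syllable segmentation, in the spirit of
# Ghitza (2011) and Giraud & Poeppel (2012). All names and parameters are
# illustrative assumptions, not the model presented in this poster.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert, find_peaks

def theta_segment(audio, sr, theta_band=(4.0, 8.0), env_sr=100):
    """Return candidate syllable boundary times (s) at theta-envelope troughs."""
    # Amplitude envelope from the analytic signal.
    envelope = np.abs(hilbert(audio))
    # Crude downsampling of the (smooth) envelope keeps the low-frequency
    # filter design well-conditioned.
    envelope = envelope[:: sr // env_sr]
    # Band-pass the envelope in the theta range, where syllabic rhythm lives.
    b, a = butter(2, theta_band, btype="bandpass", fs=env_sr)
    theta = filtfilt(b, a, envelope)
    # Troughs of the theta oscillation mark candidate unit boundaries.
    troughs, _ = find_peaks(-theta, distance=int(env_sr / theta_band[1]))
    return troughs / env_sr

# Toy check: a 200 Hz tone amplitude-modulated at 5 Hz, i.e. five
# "syllables" per second.
sr = 16000
t = np.linspace(0.0, 2.0, 2 * sr, endpoint=False)
audio = np.sin(2 * np.pi * 200 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 5 * t))
print(theta_segment(audio, sr))  # boundaries roughly every 0.2 s

In this purely bottom-up form the oscillator only tracks the stimulus; the hypotheses discussed above concern how such a tracker would additionally be modulated top-down by context and coupled to production.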

Topic Areas: Speech Perception, Computational Approaches
