Why the GPT task of predicting the next word does not suffice to describe human language production: A conversational fMRI-study
Poster A1 in Poster Session A, Tuesday, October 24, 10:15 am - 12:00 pm CEST, Espace Vieux-Port
Caroline Arvidsson1, Johanna Sundström1, Julia Uddén1; 1Stockholm University
Interest is surging around the "next-word-predictability" task that allowed large language models to reach their current capacity. It is sometimes claimed that prediction is enough to model language production. We set out to study predictability in an interactive setting. The current fMRI study used the information-theoretic measure of surprisal – the negative log-probability of a word occurring given the preceding linguistic context, estimated by a pre-trained language model (GPT-2). Surprisal has been shown to correlate with bottom-up processing located in the bilateral middle and superior temporal gyri (MTG/STG) during narrative comprehension (Willems et al., 2016). Still, surprisal has never been used to investigate conversational comprehension or any kind of language production. We hypothesized that previous results on surprisal in narrative comprehension would be replicated in conversational comprehension and that next-word-predictability would not encompass language production processes. We utilized a publicly available fMRI dataset in which participants (N=24) engaged in unscripted conversations (12 min/participant) via an audio-video link with a confederate outside the scanner. The conversational events Production, Comprehension, and Silence were modeled in a whole-brain analysis. Two parametric modulations of production and comprehension were added: (1) log-transformed context-independent word frequency (control regressor) and (2) surprisal. Production-surprisal and Comprehension-surprisal were respectively contrasted against the implicit baseline. These contrasts were compared with the contrasts Production and Comprehension vs implicit baseline. If surprisal indexes only part of the activity in these broader contrasts, this provides a handle on production and comprehension processes beyond next-word-predictability.
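The surprisal measure itself is straightforward to compute once a language model supplies conditional word probabilities. A minimal sketch in base-2 (bits), with made-up probabilities standing in for GPT-2 output (the actual study estimated probabilities with a pre-trained GPT-2; the example words and values below are hypothetical):

```python
import math

def surprisal(prob):
    # Surprisal in bits: negative log2 of the word's probability
    # given the preceding linguistic context.
    return -math.log2(prob)

# Hypothetical conditional probabilities P(word | context), standing in
# for language-model output. A highly predictable continuation yields
# low surprisal; an unlikely one yields high surprisal.
p_predictable = 0.5    # e.g. a strongly expected next word
p_surprising  = 0.001  # e.g. an unexpected next word

assert surprisal(p_predictable) == 1.0
assert surprisal(p_surprising) > surprisal(p_predictable)
```

In the analysis, these per-word values enter the GLM as a parametric modulator of the Production and Comprehension events, alongside the word-frequency control regressor.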
For surprisal in conversational production, we observed statistically significant clusters in the left inferior frontal gyrus (LIFG), the medial frontal gyrus, and the motor cortex. Importantly, Production vs implicit baseline showed bilateral STG activation, while STG was not parametrically modulated by surprisal. Moreover, the bilateral MTG/STG were the only clusters active for Comprehension vs implicit baseline, and they were also modulated by surprisal. For comprehension, we thus replicated the previous narrative comprehension study (Willems et al., 2016), showing that unpredictable words activate the bilateral MTG/STG in conversational settings as well. Next-word-predictability is thus so far a good model for conversational comprehension. For production, however, the next-word-predictability task helped to home in on what is sometimes considered core production machinery in LIFG. Several functional interpretations of the STG recruitment during production are possible (such as monitoring for speech errors), but the current results point toward two important conclusions: (1) a functional division of the frontal and temporal cortices during production, where the frontal component is prediction-related, and (2) that language processing during production is more than prediction, at least at the word level. We provide a functional handle on such extra-predictive processes.
Topic Areas: Language Production, Computational Approaches