
Causal inference in discourse: N400 predicted by surprisal estimates from large language models

Poster Session C, Friday, October 25, 4:30 - 6:00 pm, Great Hall 3 and 4

Xingyuan Zhao1, Wenjun Ma1, Seana Coulson1; 1UC San Diego

Prior ERP research on text and discourse comprehension suggests that participants make inferences about unstated causes of events, as evidenced by discourse priming phenomena. In prior work, we recorded EEG from 16 healthy adults as they listened to vignettes requiring a causal inference, each followed by a visually presented probe in a cross-modal priming paradigm. Participants' task was to answer yes/no comprehension questions about the vignettes, so the probes themselves were not task-relevant. For example, the vignette "The farmers left the grapes out on a tarp. They shriveled into raisins in a few weeks." was followed by a visually presented probe word that was either Causally Related (e.g., sun), Lexically Related to the final word in the vignette (e.g., months), or Unrelated (i.e., words that were causally or lexically related to different vignettes in the stimulus set). Multiple stimulus lists were constructed so that each participant saw only one probe word after each vignette. Priming was indexed by the amplitude of the N400 ERP component elicited by probe words in each condition. Standard ERP analysis revealed a much larger N400 effect for the Causally Related probes than for the Lexically Related probes (Relatedness × Probe Type, p < 0.01), suggesting greater priming for the unstated cause of the event in the vignette than for the lexically related word.

Here we ask whether these discourse priming effects on the N400 component are explicable in terms of the surprisal of the probe words in this paradigm, as estimated by a variety of large language models. Surprisal is a measure of the unexpectedness of a word, derived by taking the negative base-2 logarithm of its contextual probability. As a quantification of a word's information content, surprisal has previously been shown to correlate with the size of the N400 elicited by words in language contexts.

Accordingly, we conducted single-trial analyses to examine the relationship between the N400 elicited by probe words in this study and their surprisal values as estimated by a series of autoregressive transformer language models of different parameter sizes. N400 amplitude was operationalized as the mean amplitude 300-500 ms post word onset at a centro-parietal electrode cluster. Six mixed-effects models were constructed to predict N400 amplitude for each probe word as a function of its surprisal as estimated by GPT-3 Ada, Babbage, Curie, and Davinci, and by babbage-002 and davinci-002. Each model included a fixed effect of surprisal and a random intercept for subject. A null (i.e., intercept-only) model, containing only a random intercept for subject, served as a baseline.

Statistical model comparison involved computing a ΔAIC score by subtracting the null model's Akaike Information Criterion (AIC) from that of each surprisal model, so that negative values indicate a better fit than the baseline. N400 amplitudes were well predicted by surprisal scores from the four largest language models (Curie: ΔAIC = -17; Davinci: ΔAIC = -16; babbage-002: ΔAIC = -17; davinci-002: ΔAIC = -27), suggesting that discourse priming effects can be explained, in principle, as a side effect of a neural architecture that optimizes next-word prediction.
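As a point of reference, the surprisal measure described above, -log2 P(probe | context), can be computed from any autoregressive transformer along the following lines. This is a minimal sketch, not the authors' code: GPT-2 stands in for the GPT-3-family models used in the study (whose API has since been retired), and the function name probe_surprisal is illustrative. Surprisal is summed over the probe's sub-word tokens and converted from nats to bits.

import math

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

# GPT-2 as a stand-in for the GPT-3-family models named in the abstract.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def probe_surprisal(context: str, probe: str) -> float:
    """Surprisal of `probe` given `context`, in bits: -log2 P(probe | context)."""
    ctx_ids = tokenizer(context, return_tensors="pt").input_ids
    probe_ids = tokenizer(" " + probe, return_tensors="pt").input_ids
    ids = torch.cat([ctx_ids, probe_ids], dim=1)
    with torch.no_grad():
        logits = model(ids).logits
    log_probs = torch.log_softmax(logits, dim=-1)  # natural-log probabilities
    # The token at position i is predicted from the logits at position i - 1;
    # sum surprisal over all sub-word tokens of the probe.
    nats = 0.0
    for i in range(ctx_ids.shape[1], ids.shape[1]):
        nats -= log_probs[0, i - 1, ids[0, i]].item()
    return nats / math.log(2)  # nats -> bits

vignette = ("The farmers left the grapes out on a tarp. "
            "They shriveled into raisins in a few weeks.")
print(probe_surprisal(vignette, "sun"))     # causally related probe
print(probe_surprisal(vignette, "months"))  # lexically related probe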
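The single-trial N400 measure could be extracted along these lines with MNE-Python. This is a sketch under assumptions: the abstract specifies only a 300-500 ms window and a centro-parietal cluster, so the electrode names below are a guess, and `epochs` is assumed to be time-locked to probe-word onset and baseline-corrected.

import mne
import numpy as np

# Assumed centro-parietal cluster; the abstract does not name electrodes.
CLUSTER = ["C1", "Cz", "C2", "CP1", "CPz", "CP2", "P1", "Pz", "P2"]

def single_trial_n400(epochs: mne.Epochs) -> np.ndarray:
    """Mean amplitude 300-500 ms post probe onset, averaged over the cluster.

    Returns one value per trial, in microvolts.
    """
    data = (epochs.copy()
                  .pick(CLUSTER)
                  .crop(tmin=0.3, tmax=0.5)
                  .get_data())           # (n_trials, n_channels, n_times), volts
    return data.mean(axis=(1, 2)) * 1e6  # average over cluster and window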
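The mixed-effects comparison (fixed effect of surprisal, random intercept for subject, ΔAIC against an intercept-only null) maps onto a formula interface such as statsmodels' MixedLM. This is an illustrative reconstruction rather than the authors' analysis code; the column names (n400, subject, and one surprisal column per language model) are assumptions.

import pandas as pd
import statsmodels.formula.api as smf

def delta_aic(trials: pd.DataFrame, surprisal_col: str) -> float:
    """AIC(surprisal model) minus AIC(null model); negative favors surprisal."""
    # Fit by maximum likelihood (reml=False) so the AICs of models with
    # different fixed effects are comparable.
    full = smf.mixedlm(f"n400 ~ {surprisal_col}", trials,
                       groups=trials["subject"]).fit(reml=False)
    # Null model: intercept only, plus the random intercept for subject.
    null = smf.mixedlm("n400 ~ 1", trials,
                       groups=trials["subject"]).fit(reml=False)
    return full.aic - null.aic

# e.g., delta_aic(trials, "davinci_002") for each of the six surprisal columns.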

Topic Areas: Meaning: Discourse and Pragmatics, Computational Approaches
