Humans vs. NLMs in Processing Wh-Filler-Gap Dependency and Backward Sluicing
Poster D25 in Poster Session D, Saturday, October 26, 10:30 am - 12:00 pm, Great Hall 4
Keonwoo Koo1, Wonil Chung1, Myung-Kwan Park1, Hyosik Kim1; 1Dongguk University
Despite the remarkable performance of recent artificial neural language models (NLMs) in understanding and generating human language, this study shows that humans and NLMs differ in how they process language, which points to limitations of NLMs. Specifically, our experimental results indicate that human language processing is sensitive to the syntactic complexity of Wh-Filler-Gap Dependency (WhFGD) constructions in English, whereas NLMs are not. Previous research has shown that human processing of WhFGDs is influenced by the syntactic complexity within the dependency: configurations where a wh-filler is followed by a noun phrase (NP) and then a gap, i.e., wh-filler … NP … gap, are more difficult to process than configurations where a wh-filler is followed by a complementizer phrase (CP) and then a gap, i.e., wh-filler … CP … gap (Gibson and Warren 2004; Keine 2015; Kim 2023). We adopted these configurations to investigate whether pre-trained NLMs, such as GPT2-XL, GPT-NEO, and OPT, exhibit similar sensitivity to syntactic complexity during WhFGD processing. In Experiment 1, we used 24 WhFGD constructions, manipulating structural complexity (CP vs. NP) and construction type (WhFGD vs. sluicing) in a 2×2 factorial design. Sluicing, which also forms a WhFGD, was included to test the generality of WhFGD processing. For the NLMs, we measured surprisal, the negative log-probability of a word given its preceding context, which correlates with reading time measurements (Hale 2001; Smith and Levy 2013; Wilcox et al. 2020; Shain et al. 2022). Linear mixed-effects models with sum-coded contrasts and the maximal random-effects structure that converged revealed that the NP condition was more difficult than the CP condition for humans to process (β = 0.12, SE = 0.03, t = 3.37, p < 0.001), while GPT2-XL found the CP condition more difficult than the NP condition (β = -0.89, SE = 0.46, t = -1.93, p = 0.05). Experiment 2 replicated the design of Experiment 1, except for the levels of construction type: WhFGD vs. no-WhFGD.
The no-WhFGD condition tested whether the results from Experiment 1 were specific to WhFGD processing. Linear mixed-effects models showed that humans found the NP condition harder to process than the CP condition (β = 0.06, SE = 0.02, t = 2.35, p < 0.05), particularly in the WhFGD context, resulting in a significant interaction (β = -0.11, SE = 0.05, t = -2.19, p < 0.05). In contrast, all three NLMs consistently found the CP condition more difficult to process than the NP condition across both construction types (p < 0.05). Humans and NLMs thus differ mainly in their sensitivity to syntactic complexity when processing CP vs. NP structures in WhFGD constructions. Humans find the NP structure more difficult because it lacks an intermediate position (Spec-CP), which increases the cognitive load required to process the material between the wh-filler and the gap. This results in slower reading times and greater processing difficulty. Conversely, NLMs have greater difficulty with the CP condition because they cannot utilize the intermediate position provided by the CP structure and rely more on statistical patterns. This suggests that NLMs fail to effectively leverage the grammatical structure of CP and employ different strategies from humans when processing syntactically complex sentences.
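As a rough illustration of two ingredients named in the abstract, the following minimal Python sketch computes surprisal from conditional probabilities and lays out sum-coded predictors for the 2×2 design of Experiment 1. It is not the authors' code. The function names, the choice of bits (log base 2), the summing of token surprisals over a region, the ±0.5 coding scale, and the assignment of signs to levels are all assumptions made for illustration; the probabilities are made up and do not come from any model.

```python
import math
from itertools import product


def surprisal_bits(prob: float) -> float:
    """Surprisal in bits: negative log2-probability of a token given its context."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must lie in (0, 1]")
    return -math.log2(prob)


def region_surprisal(token_probs) -> float:
    """Sum token surprisals over a multi-token region (an assumed aggregation)."""
    return sum(surprisal_bits(p) for p in token_probs)


# Assumed sum coding (+0.5 / -0.5). The abstract states neither the scale
# nor which level of each factor receives the positive code.
COMPLEXITY = {"NP": 0.5, "CP": -0.5}
CONSTRUCTION = {"WhFGD": 0.5, "sluicing": -0.5}


def design_rows():
    """One row per cell of the 2x2 design: labels, main-effect codes, interaction code."""
    rows = []
    for cx, ct in product(COMPLEXITY, CONSTRUCTION):
        c1, c2 = COMPLEXITY[cx], CONSTRUCTION[ct]
        rows.append((cx, ct, c1, c2, c1 * c2))
    return rows


if __name__ == "__main__":
    # Hypothetical conditional probabilities for the tokens of one region.
    print(region_surprisal([0.5, 0.25, 0.125]))
    for row in design_rows():
        print(row)
```

Under sum coding, each predictor column sums to zero across the four cells, so a main-effect coefficient describes the difference between the two levels of a factor averaged over the levels of the other factor, and the product column carries the interaction.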
Topic Areas: Syntax and Combinatorial Semantics, Computational Approaches