Mitigating catastrophic interference in neural network models of bilingual lexical acquisition

Poster A89 in Poster Session A, Tuesday, October 24, 10:15 am - 12:00 pm CEST, Espace Vieux-Port

James Magnuson1, Marian Simarro González2, Thomas Hannagan3, Nils Beck4, Manuel Carreiras2; 1BCBL: Basque Center on Cognition, Brain and Language; Ikerbasque; University of Connecticut, 2BCBL: Basque Center on Cognition, Brain and Language, 3University of Connecticut, 4University of Stuttgart

Catastrophic interference is a classic dilemma for connectionist models. If a neural network is initially trained on one input-output mapping (e.g., mapping semantics to English/EN phonology) and subsequently trained on another (e.g., mapping semantics to French/FR phonology), the second mapping is likely to virtually overwrite the first -- hence the label "catastrophic interference". On the other hand, savings in relearning are often observed if the network is retrained on the first system, showing that not all knowledge has been lost. McClelland, McNaughton, and O'Reilly (1995; see also Kumaran, Hassabis, and McClelland, 2016) proposed a division of labor between neocortical and hippocampal pathways that could provide a solution. In Complementary Learning Systems (CLS) theory, long-term memories are stored in the cortex, while new learning relies heavily on hippocampal pathways. New learning is shaped by prior learning but does not significantly disrupt it, and gradual consolidation allows new learning to integrate with long-term memory. CLS has not been applied in a detailed way to late second language acquisition. In the current project, we simulated learning of L1 (English/EN) and later acquisition of L2 (French/FR). Models were trained in four stages. (1) We trained an autoencoder network to map phonological inputs (using a slot-based approach that replicates phoneme representations at every possible position, with inputs aligned to random starting positions) to identical but non-shifted phonological outputs via a hidden layer. (2) We trained the network to map semantic inputs to phonological outputs via the same hidden layer; the moderate interference this caused for the phono-phono mapping was mitigated by regularly interleaving phono-phono and semantics-phonology training. (3) Once the network could stably perform the EN phono-phono and semantics-phonology mappings, we began training on FR phonological autoencoding. (4) Finally, we trained the model to map from FR semantics (which differed from EN only by the addition of grammatical gender) to FR phonology. Unsurprisingly, each new mapping caused substantial interference, with the FR semantics-phonology mapping causing massive interference for EN semantics-phonology. Clearly, it is implausible for L2 to wipe out L1 in humans, so we explored ways to mitigate this strong interference. Specifically, we reserved portions of the weights in particular layers for FR training. When we began FR training, we allowed activity to flow via connections trained on EN, but we did not alter those weights during L2 training. As we interleaved training on EN and FR, we alternated between disabling learning for each language's 'dedicated' weights. This coarse analog of CLS is meant only to demonstrate feasibility ahead of planned work to formally implement CLS. The approach appears feasible for late-acquired L2, although results are sensitive to where weights are reserved. Reserving weights in the phono-hidden or hidden-output layers has only weak protective effects; we still observe massive interference from FR semantics-phonology, especially on EN semantics-phonology. However, reserving weights in the semantics-hidden pathway robustly mitigates catastrophic interference. We will discuss strong assumptions of the current approach (e.g., that L1 should be active during L2 training and vice versa), as well as a proposal for implementing CLS formally in this domain.
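To make the weight-reservation scheme concrete, here is a minimal sketch in PyTorch of one way it could be implemented. It is an illustration only, not the authors' actual code: the ReservedLinear class, the layer sizes, and the 25% reservation fraction are hypothetical placeholders, and bias terms are left shared for simplicity.

    import torch
    import torch.nn as nn

    class ReservedLinear(nn.Linear):
        # Linear layer whose weight matrix is partitioned into an EN share and
        # a reserved FR share. Activity always flows through all connections;
        # gradient updates are masked so only the active language's share changes.
        def __init__(self, in_features, out_features, fr_fraction=0.25):
            super().__init__(in_features, out_features)
            mask = torch.zeros(out_features, in_features)
            # Reserve a block of input columns for FR; the rest belong to EN.
            # (Hypothetical partition -- any fixed split illustrates the idea.)
            mask[:, : int(in_features * fr_fraction)] = 1.0
            self.register_buffer("fr_mask", mask)
            self.weight.register_hook(self._mask_grad)  # runs on every backward pass
            self.active_language = "EN"

        def _mask_grad(self, grad):
            # Zero gradients on the inactive language's reserved weights,
            # mirroring the alternation during interleaved EN/FR training.
            if self.active_language == "EN":
                return grad * (1.0 - self.fr_mask)
            return grad * self.fr_mask

    def set_language(model, lang):
        # Flip every reserved layer to the language of the current batch.
        for module in model.modules():
            if isinstance(module, ReservedLinear):
                module.active_language = lang

    # Hypothetical sizes: 300 semantic units -> 128 hidden -> 60 phonological units.
    net = nn.Sequential(ReservedLinear(300, 128), nn.Sigmoid(),
                        ReservedLinear(128, 60), nn.Sigmoid())
    set_language(net, "FR")  # e.g., before each FR batch in interleaved training

In this sketch, freezing is implemented by masking gradients rather than by removing connections, which matches the description above: activity flows through EN-trained weights during FR training, but those weights remain unaltered.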

Topic Areas: Computational Approaches, Language Development/Acquisition
