
EEG reflects efficiency but not disengagement in artificial speech segmentation

Poster B12 in Poster Session B and Reception, Thursday, October 6, 6:30 - 8:30 pm EDT, Millennium Hall
This poster is part of the Sandbox Series.

Jino Chough1, Mai Miura1, Elizabeth Rosenthal1, Benjamin Zinszer1; 1Swarthmore College

Introduction. Segmentation is a well-established statistical learning paradigm with direct correspondences to natural language, but its application to multiple-language contexts is less clear. Brief exposure to one artificial speech stream can prevent listeners from learning a second stream unless explicit contextual cues are provided (Gebhart et al., 2009). This primacy phenomenon has been explained by two hypotheses: over-learning the first structure entrenches participants in a specific pattern that becomes increasingly difficult to modify (entrenchment; Bulgarelli & Weiss, 2016), or, immediately after learning the first structure, participants disengage from the stimulus and fail to sample subsequent input (neural efficiency; Karuza et al., 2016). We analyzed continuous EEG data recorded during an extended familiarization phase with one artificial language to assess whether changes over time in the cortical representation of auditory speech support the entrenchment or the efficiency account of primacy.

Methods. Sixteen undergraduate students (mean age 19.4 y) at Swarthmore College listened to 11:15 (m:s) of one speech stream (learnable after 5:30; Gebhart et al., 2009, Experiment 1b) while EEG was recorded from 64 channels sampled at 250 Hz. Immediately afterwards, participants completed sixteen two-alternative forced-choice (2-AFC) trials comparing words vs. part-words from the stream. The acoustic envelope of the speech stream (the rectified Hilbert transform of the auditory waveform) contained a 4.26 Hz component corresponding to the syllable presentation rate. There was no analogous component at the word presentation rate (~1.4 Hz), indicating that the learned structure was strictly statistical and not available from the acoustic signal.
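The envelope check described above can be sketched as follows (a minimal illustration in Python with NumPy/SciPy; the function names and the synthetic test parameters are our own, not part of the study's actual pipeline):

```python
import numpy as np
from scipy.signal import hilbert
from scipy.fft import rfft, rfftfreq

def speech_envelope(waveform):
    """Amplitude envelope via the rectified Hilbert transform."""
    return np.abs(hilbert(waveform))

def envelope_peak_hz(envelope, fs, fmin=0.5, fmax=10.0):
    """Frequency (Hz) of the largest envelope spectral peak within [fmin, fmax]."""
    env = envelope - envelope.mean()          # drop the DC component first
    power = np.abs(rfft(env)) ** 2
    freqs = rfftfreq(len(env), d=1.0 / fs)
    band = (freqs >= fmin) & (freqs <= fmax)
    return freqs[band][np.argmax(power[band])]
```

Applied to the familiarization stream, the peak should fall near 4.26 Hz (the syllable rate), with no comparable peak near the ~1.4 Hz word rate.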
Within every 30-second interval, we estimated (1) the spectral power of the EEG data in the 4-4.5 Hz (syllable) and 1.3-1.5 Hz (word) windows, (2) the coherence between the speech envelope and the EEG signal in the same windows, and (3) cortical tracking of the broadband speech envelope using multivariate temporal response functions (mTRF; Crosse et al., 2016). We linearly modeled changes in entrainment to the syllable-level frequency and in cortical tracking of the envelope as functions of time and behavioral performance.

Results & Conclusion. Mean 2-AFC accuracy was 0.781 (SD 0.148). Ten of the sixteen participants showed significant individual-level learning (>75% correct, binomial p<0.05). Changes in EEG spectral power were not well explained by the linear models (all coefficients' p>0.05). A model of coherence restricted to the second half of the familiarization phase contained a significant interaction between time and 2-AFC accuracy (p=0.003); simple slopes analysis indicated that the participants with the highest behavioral performance increased coherence at the syllable-level frequency over time. The mTRF analysis, however, pointed to the opposite interaction (p=0.017): the highest-accuracy participants significantly decreased cortical tracking of the acoustic envelope over time. We speculate that participants who learn the language continue to engage with the syllable-level stimulus (coherence at 4.3 Hz), but that this sampling may require less overall effort (envelope tracking; see Zinszer et al., 2022). This explanation nominally supports the efficiency account but leaves unclear whether any savings in effort actually reduce sampling of distributional regularities from the speech stream.
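The individual-level learning criterion can be made concrete as a one-sided exact binomial test of 2-AFC accuracy against chance (a sketch using SciPy's `binomtest`; the helper name and defaults are ours, not from the study):

```python
from scipy.stats import binomtest

def shows_learning(n_correct, n_trials=16, alpha=0.05):
    """True if 2-AFC accuracy exceeds chance (p = .5) by a one-sided exact binomial test."""
    return binomtest(n_correct, n_trials, p=0.5, alternative="greater").pvalue < alpha
```

For example, 13/16 correct (81%) yields p ≈ .011 and meets the criterion, while 10/16 (63%) yields p ≈ .23 and does not.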

Topic Areas: Multilingualism, Methods