Changes in pupil size track the syllabic rhythm of natural speech
Poster D42 in Poster Session D, Saturday, October 26, 10:30 am - 12:00 pm, Great Hall 4
Sebastian Sauppe¹, Giorgia Sironi¹, Chantal Oderbolz¹, Catalina Torres¹, Martin Meyer¹; ¹University of Zurich
Neural tracking (the systematic cortical response to the low-frequency, quasi-rhythmic structure of spoken language) is a ubiquitous phenomenon that supports the segmentation and comprehension of speech and is traditionally measured with electrophysiological methods. Since the 1960s, changes in pupil diameter have been used as a psychophysiological tool to study cognition and attention. Recently, it has been shown that pupil dilation is sensitive to the rhythm of syllable streams in statistical learning tasks and that the correlation between pupil size and neural activity increases during listening to narratives. We present the first systematic demonstration of the sensitivity of pupil dilation to the low-frequency characteristics of natural speech by testing whether there is a consistent phase relationship between the speech envelope and the size of listeners’ pupils. Additionally, we tested whether pupillary speech tracking is sensitive to variations in prosody by manipulating the acoustic profile of the auditory stimuli. Prosodic information, especially syllable rate, accounts for the most prominent modulations in the temporal modulation spectrum of speech and drives neural tracking. Participants (N = 28) listened to naturalistic stimuli while pupil size was recorded with an eye tracker and cortical activity was recorded with EEG (64 electrodes; both signals sampled at 1000 Hz). This co-registration made it possible to assess the “trackability” of the stimuli by measuring an expected neural speech tracking response. Participants listened to excerpts from audio books (approx. 30 minutes in total) and answered comprehension questions to encourage close attention. Half of the excerpts were acoustically manipulated by resynthesizing vowel and pause durations and fundamental frequency to reduce the acoustic strength of boundaries and prominences.
Since previous studies have found sizable individual differences in tracking abilities, participants were additionally classified as high or low synchronizers based on the Speech-to-Speech Synchronization task. To determine the degree of speech tracking, the phase-locking value between the gammatone-bank-filtered speech envelope and each of the two recorded signals (pupil size and EEG activity) was calculated in a frequency band of 3-4.5 Hz (representing the range of syllabic rates in the audio book excerpts) and in time bins of 5 seconds. Statistical significance was assessed with mixed-effects beta regression and regression trees (which recursively partition the data to find subgroups and thus make it possible to model the spatial structure of the EEG). Phase-locking between neural activity and the speech envelope showed a widespread topography and was lower both under the prosodic manipulation and in low synchronizers (all ps < 0.001). Pupil size was also phase-locked with the speech envelope (p < 0.001), but no effects of the prosodic manipulation or of participants’ synchronizer status were observed. This lower sensitivity could be explained by the lower dimensionality of the pupil data (1 vs. 64 channels). Our study shows that pupil size changes track at least the syllabic structure of speech. As pupil size changes reflect sensory tuning and attentional states, driven by activity in the locus coeruleus, these results open a window into the role of the brainstem in language processing and provide a prospective psychophysiological tool for studying speech tracking (e.g., in populations not easily examined with EEG).
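For readers unfamiliar with the measure, the following is a minimal sketch of how a phase-locking value in the 3-4.5 Hz band could be computed in 5-second bins. It is not the authors' pipeline: the Butterworth band-pass filter, the Hilbert-transform phase estimate, the function names `band_phase` and `binned_plv`, and the synthetic signals are all assumptions made for illustration, and the gammatone-based envelope extraction is not shown.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert


def band_phase(x, fs, low=3.0, high=4.5, order=2):
    """Instantaneous phase of x in the [low, high] Hz band (assumed method:
    zero-phase Butterworth band-pass followed by the Hilbert transform)."""
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return np.angle(hilbert(sosfiltfilt(sos, x)))


def binned_plv(signal, envelope, fs, bin_s=5.0):
    """Phase-locking value between signal and envelope per time bin.
    PLV = |mean(exp(i * phase difference))|, ranging from 0 to 1."""
    dphi = band_phase(signal, fs) - band_phase(envelope, fs)
    n = int(bin_s * fs)                      # samples per bin
    n_bins = len(dphi) // n                  # drop any incomplete final bin
    dphi = dphi[: n_bins * n].reshape(n_bins, n)
    return np.abs(np.exp(1j * dphi).mean(axis=1))


# Synthetic illustration only (not data from the study).
fs = 1000                                    # Hz, matching the reported sampling rate
t = np.arange(0, 30, 1 / fs)                 # 30 s of signal -> six 5-s bins
rng = np.random.default_rng(0)

envelope = np.sin(2 * np.pi * 3.7 * t)       # stand-in for a speech envelope
# A "pupil" trace that follows the envelope at a fixed phase lag, plus noise.
pupil = np.sin(2 * np.pi * 3.7 * t - 1.0) + 0.5 * rng.standard_normal(t.size)
# A control trace with no relation to the envelope.
control = rng.standard_normal(t.size)

plv_locked = binned_plv(pupil, envelope, fs)
plv_control = binned_plv(control, envelope, fs)
print("locked :", plv_locked.round(3))
print("control:", plv_control.round(3))
```

A constant phase lag still yields a high value, because the measure captures the consistency of the phase relationship, not zero-lag alignment. Per-bin values bounded between 0 and 1, as produced here, are the kind of outcome that a beta regression can model.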
Topic Areas: Prosody, Speech Perception