Decoding features of speech production using intracranial stereo-electroencephalography
Poster B58 in Poster Session B and Reception, Thursday, October 6, 6:30 - 8:30 pm EDT, Millennium Hall
Tessy M Thomas1,2, Aditya Singh1,2, Latané Bullock1,2, Nitin Tandon1,2,3; 1McGovern Medical School at UT Health Houston, 2Texas Institute for Restorative Neurotechnologies, University of Texas Health Science Center at Houston, 3Memorial Hermann Hospital, Texas Medical Center
The speech production network is distributed across a wide expanse of lateral frontal and temporal cortical areas. A lesion in one or more locations within this network can result in the inability to produce speech, as seen in patients with aphasia. Brain-computer interfaces (BCIs) continue to be widely studied as a tool for restoring speech. However, most speech BCIs have been limited to using neural activity from the speech motor cortex alone, which may not accommodate the high variability in lesion location and extent across the aphasic patient population. BCIs using stereo-electroencephalography (sEEG) have the potential to provide widespread coverage spanning multiple regions within the speech production network; however, less is known about the speech decoding potential of sEEG. We recorded neural population activity from 7 subjects with distributed sEEG electrode coverage while they read sentences aloud. These sentences were divided into individual speech components (phonemes, place of articulation, and manner of articulation), and the neural recordings were annotated with the onset and offset of each component. Using linear discriminant analysis, we built classification models to decode each speech component from the broadband high-gamma power (70-150 Hz) of the neural activity. Each classifier was evaluated with 5-fold nested cross-validation: in each fold, 80% of the data was used to train and optimize the model parameters, and the remaining 20% was held out to measure model performance. The average classification accuracy across all subjects was significantly above chance for phonemes (5.4% across 38 phonemes), place of articulation (18.1% across 9 labels), and manner of articulation (26.5% across 5 labels). One subject repeated the task three times, yielding the largest number of component labels for model training and, consequently, the highest accuracies. From this longest dataset, we achieved an accuracy of 8.7% for phonemes, 26.9% for place of articulation, and 34.2% for manner of articulation. We also classified 36 words from this dataset with an accuracy of 12.3%. Across all subjects, the electrodes contributing the most discriminatory information to the classifiers were located in multiple distributed cortical sites, including sensorimotor cortex, inferior frontal gyrus, mid-fusiform cortex, and auditory cortex. While some of these electrodes were close to the cortical surface, many contributing electrodes lay deeper within gyri and sulci, distributed across both the language-dominant and non-dominant hemispheres. These results demonstrate that decoding components of speech production draws on contributions from multiple cortical regions. Distributed coverage can capture neural correlates of multiple speech components simultaneously, providing more information with which to build a diverse vocabulary for a speech BCI user. While such widespread coverage is not easily attainable with the subdural electrode grids commonly used for speech BCIs, intracranial depth electrodes provide a safer alternative for accessing multiple areas across the brain.
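To make the analysis pipeline concrete, the following Python sketch shows one way to extract broadband high-gamma power (70-150 Hz) and evaluate a linear discriminant analysis classifier with 5-fold nested cross-validation. It is a minimal sketch on synthetic data, not the authors' implementation: the 1 kHz sampling rate, the Hilbert-envelope estimate of high-gamma power, the epoch dimensions, the helper name extract_high_gamma, and the LDA shrinkage grid are all illustrative assumptions.

    # Illustrative sketch of the decoding pipeline (synthetic data, assumed
    # parameters); not the authors' code.
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
    from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

    FS = 1000  # sampling rate in Hz (assumed)

    def extract_high_gamma(sig, fs=FS, band=(70.0, 150.0)):
        """Band-pass filter to the high-gamma range and take the analytic
        amplitude (Hilbert envelope) as a proxy for broadband power."""
        nyq = fs / 2.0
        b, a = butter(4, [band[0] / nyq, band[1] / nyq], btype="bandpass")
        filtered = filtfilt(b, a, sig, axis=-1)
        return np.abs(hilbert(filtered, axis=-1))

    # Synthetic stand-in for labeled sEEG epochs:
    # 500 epochs x 64 electrodes x 200 ms at 1 kHz.
    rng = np.random.default_rng(0)
    n_epochs, n_elec, n_samp = 500, 64, 200
    raw_epochs = rng.standard_normal((n_epochs, n_elec, n_samp))
    labels = rng.integers(0, 5, size=n_epochs)  # e.g., 5 manner-of-articulation classes

    # One feature per electrode per epoch: mean high-gamma envelope.
    features = extract_high_gamma(raw_epochs).mean(axis=-1)

    # Nested cross-validation: the inner loop tunes the LDA shrinkage on the
    # training portion of each outer fold (~80% of the data); the outer loop
    # scores the held-out ~20%, mirroring the abstract's 80/20 split.
    inner = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=2)
    grid = GridSearchCV(
        LinearDiscriminantAnalysis(solver="lsqr"),
        param_grid={"shrinkage": [None, "auto", 0.1, 0.5]},
        cv=inner,
    )
    scores = cross_val_score(grid, features, labels, cv=outer)
    print(f"Nested CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")

On random data this prints chance-level accuracy (~20% for 5 classes); the point of the sketch is the structure, in which hyperparameter selection never sees the outer test folds, so the reported accuracy is an unbiased estimate of generalization.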
Topic Areas: Speech Motor Control, Language Production