A large and scalable semantic brain decoding framework as a precursor for artificial speech restoration in aphasia?
Poster A4 in Poster Session A - Sandbox Series, Thursday, October 24, 10:00 - 11:30 am, Great Hall 4
Andrew Anderson1, Bill Gross, Leo Fernandino, Jeffrey Binder, Hernan Rey; 1Medical College of Wisconsin
Aphasia is a stroke-related language disorder that affects approximately 1 in 250 people aged 20 or older. In cases of severe stroke damage to middle cerebral artery territory, individuals remain intellectually intact, can understand language, and can even mentally formulate what they’d like to say, but they cannot express this as spoken, written, or typed words. This is socially devastating, and because the prospects of verbal recovery are often limited, new interventions are needed. Fueled by advances in deep learning, one possibility might be to develop Brain Computer Interfaces (BCIs) that synthesize words by decoding meaning from undamaged brain regions during (attempted) language production. Indeed, following training on many hours of data, recent BCIs that decode the semantics of inner speech from fMRI scans of healthy adults now make this prospect seem approachable. However, for the grand goal of restoring conversational speech production in aphasia, fMRI cannot be a complete solution, because fMRI is not portable and hemodynamic fMRI responses are delayed roughly 4 s after neural firing. Invasive electrophysiological brain implants could potentially resolve both issues and would provide a way to accumulate the “personalized big brain data” that is probably essential to train decoding models. However, there is currently little hard evidence that practically effective semantic decoders can be built with current invasive technologies, which, unlike fMRI, typically sample from small cortical regions. This leaves little to warrant the health risks associated with trialing semantic decoding implants in aphasia. With the overarching goal of discovering whether accurate semantic decoding from invasive electrodes is possible, we assert that: (1) Contemporary encoder-decoder artificial neural network models may already provide the requisite framework to accurately decode continuous electrophysiological brain recordings to words, if only there were the data to train them. (2) Data is key. Compiling a large, annotated dataset of high-quality recordings of brains in conversation, with broad cortical coverage, may be essential for training an accurate decoding model. (3) Pre-surgical evaluation for epilepsy could prove invaluable for accumulating such a big dataset: here, stereotactic electroencephalography (sEEG) is recorded continuously for week-long periods, during which bed-bound patients engage in many face-to-face conversations. We present the early steps of an effort to produce a standardized framework for acquiring, annotating, and open-sourcing large-scale conversational sEEG data. We consider challenges surrounding: (1) Speech recognition and speaker diarization, and the application of open-source models to transcribe audio data recorded in an sEEG hospital environment. (2) De-identifying people in conversational audio data. (3) Overcoming sparse individual sEEG electrode coverage (where 100 or so depth electrodes are positioned to localize the focus of epileptic seizures, often in semantic zones). (4) Developing a cross-participant decoding model, potentially as a centralized open-access resource, for scalable updating with cross-site data.
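To make assertion (1) concrete, the sketch below shows one way an encoder-decoder network could map a window of multichannel sEEG features to a short sequence of word tokens. It is a minimal PyTorch illustration run on synthetic data; the channel count, window length, vocabulary, layer sizes, and training step are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' model): an encoder-decoder that maps
# a [time, channels] sEEG window to word tokens. All shapes and sizes are illustrative.
import torch
import torch.nn as nn

N_CHANNELS = 128      # assumed number of sEEG contacts
N_TIMEPOINTS = 200    # assumed samples per decoding window
VOCAB_SIZE = 1000     # assumed word-token vocabulary (incl. special tokens)
MAX_WORDS = 8         # assumed max words decoded per window

class SeegEncoder(nn.Module):
    """Summarizes a [batch, time, channels] sEEG window into a single state vector."""
    def __init__(self, hidden=256):
        super().__init__()
        self.gru = nn.GRU(N_CHANNELS, hidden, batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, hidden)

    def forward(self, x):                                    # x: [B, T, C]
        _, h = self.gru(x)                                   # h: [2, B, hidden]
        return self.proj(torch.cat([h[0], h[1]], dim=-1))    # [B, hidden]

class WordDecoder(nn.Module):
    """Autoregressively emits word tokens conditioned on the encoder state."""
    def __init__(self, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, hidden)
        self.gru = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, VOCAB_SIZE)

    def forward(self, tokens, enc_state):                    # tokens: [B, L] previous words
        emb = self.embed(tokens)                              # [B, L, hidden]
        out, _ = self.gru(emb, enc_state.unsqueeze(0))        # initial hidden = encoder state
        return self.out(out)                                  # [B, L, VOCAB_SIZE] logits

# Toy training step on synthetic data, just to show the data flow (teacher forcing).
encoder, decoder = SeegEncoder(), WordDecoder()
optim = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

seeg = torch.randn(4, N_TIMEPOINTS, N_CHANNELS)               # fake sEEG windows
words = torch.randint(0, VOCAB_SIZE, (4, MAX_WORDS))          # fake aligned transcripts

logits = decoder(words[:, :-1], encoder(seeg))                # predict next word at each step
loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), words[:, 1:].reshape(-1))
loss.backward()
optim.step()
print(f"toy loss: {loss.item():.3f}")
```

In a real cross-participant setting, the encoder would likely need a participant-specific input layer to handle differing electrode montages, while the decoder and semantic representations could be shared across sites; that design choice is an open question raised by challenge (4) above.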
Topic Areas: Meaning: Lexical Semantics, Computational Approaches