Neurology 2024: Towards Brain-computer Interface-based Communication Tool Utilizing EEG Signals Integrated with Large Language Model
Speaker: Hyung Young Park
Brain-computer interfaces (BCIs) enable direct communication between the brain and external devices. By analysing and classifying brain activity signals with machine learning or deep learning, BCIs have been applied in areas such as robot-assisted systems and communication tools. One promising application discussed was speech imagery decoding, in which brainwaves from imagined speech are analysed without any actual speaking; this is particularly useful for developing communication tools for individuals with speech impairments. Traditional speech imagery decoding focused on brain activity related to a limited set of tasks, but recent efforts have expanded to measuring language activity using techniques such as functional Magnetic Resonance Imaging (fMRI), Magnetoencephalography (MEG), and Electrocorticography (ECoG). These approaches, coupled with large language models, have enabled phoneme- and sentence-level decoding, pushing the boundaries of BCI applications.
In 2023, significant research advancements were reported in neuroprosthetics and BCIs. A study in Nature Neuroscience demonstrated semantic reconstruction of continuous language from non-invasive brain recordings using fMRI. Similarly, research in Computational Biology showcased real-time synthesis of imagined speech from minimally invasive neural recordings. These studies represented critical steps towards developing BCI tools for language communication that decode speech directly from recorded neural signals. The presentation highlighted additional studies in Nature discussing progress and challenges in neuroprosthetics, underscoring the field's evolving landscape.
The first study used ECoG to showcase neural processes for speech decoding and avatar control. It captured neural activity from attempted silent speech, decoded text, synthesized speech, and controlled an avatar that animated facial gestures based on the decoded speech. The second study also employed ECoG to decode phoneme-level speech, highlighting the correlation between neural signals and articulatory movements for accurate speech decoding. Despite their advancements, both studies shared notable limitations. The invasive methods required surgical implantation, limiting broader applicability, and the focus on phoneme-level decoding restricted speech fluency and naturalness, underscoring the need for sentence-level semantic decoding solutions.
The research aimed to decode human speech intentions from non-invasive Electroencephalography (EEG) signals by generating sentences that matched the original intentions. This proposed system would respond appropriately based on these sentences, advancing towards an interactive neural AI system using BCI techniques. Addressing a significant research gap, the study focused on sentence-level decoding rather than just phoneme or word-level decoding. The goal was to develop a novel BCI communication tool by combining neural engineering with large language models like GPT.
The proposed system began with data collection while participants thought about specific words or sentences, capturing the brain's electrical activity during this thought process. Next, deep neural active learning techniques were applied to extract keywords and discern user intent from the collected EEG data, with the algorithms analysing brain activity patterns to identify the keywords and intentions behind the thoughts. Sentences were then generated from the extracted keywords and the identified intent so as to reflect the user's thoughts accurately; integrating a large language model ensured that the generated sentences were coherent and contextually appropriate. Finally, these sentences served as prompts for the language model to produce appropriate responses, enabling meaningful interaction with the user based on the decoded thoughts. The framework's potential for application in specialized computing and other advanced systems was highlighted, demonstrating possibilities for integrating BCI technology into interactive systems to enhance user experience across various domains.
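Put schematically, the pipeline is: decode keywords from EEG, compose a sentence reflecting the intent, then prompt the language model for a reply. The following is a minimal Python sketch of that flow under stated assumptions; the keyword decoder and LLM backend are placeholders, and all function names are illustrative rather than the speaker's actual implementation.

```python
# Minimal sketch of the described EEG-to-LLM pipeline. The classifier and llm
# arguments are placeholders for whatever decoder and text-generation backend
# are actually used; nothing here reflects the speaker's real code.

from typing import Callable, List

import numpy as np


def decode_keywords(eeg_epoch: np.ndarray, classifier) -> List[str]:
    """Step 1 (hypothetical): map an EEG epoch (channels x samples) to likely keywords."""
    # In practice this would be a deep neural network trained, and actively
    # refined, on the speech-imagery recordings described in the talk.
    probabilities = classifier.predict_proba(eeg_epoch[np.newaxis, ...])[0]
    vocabulary = classifier.classes_
    top = np.argsort(probabilities)[::-1][:3]  # keep the three most likely keywords
    return [vocabulary[i] for i in top]


def compose_sentence(keywords: List[str], llm: Callable[[str], str]) -> str:
    """Step 2 (hypothetical): turn decoded keywords into one coherent sentence."""
    prompt = ("Compose one natural sentence expressing the user's intent "
              f"from these decoded keywords: {', '.join(keywords)}")
    return llm(prompt)


def respond(sentence: str, llm: Callable[[str], str]) -> str:
    """Step 3 (hypothetical): use the reconstructed sentence as a prompt for a reply."""
    return llm(f"Reply helpfully to the user who said: {sentence}")
```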
The EEG data collection process used an experimental paradigm designed to study the neural correlates of speech and speech imagery, contrasting overt speech with imagined speech to examine the neural differences between actual and imagined articulation. Participants repeated each sentence aloud five times and imagined speaking it 15 times. In each session, participants could create sentences tailored to specific contexts within predefined categories, and five sessions were conducted in one day. Each participant completed the experiment over at least three days, an approach intended to improve data reliability and model robustness through cross-session analysis.
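For orientation, the reported numbers can be summarised in a small configuration sketch; the category names below are placeholders, as the actual stimulus categories were not listed in the talk.

```python
# Illustrative summary of the recording paradigm as reported; category names
# are placeholders, not the real stimulus set.

PARADIGM = {
    "tasks": ["overt_speech", "speech_imagery"],
    "repetitions_per_sentence": {"overt_speech": 5, "speech_imagery": 15},
    "sessions_per_day": 5,
    "minimum_days_per_participant": 3,
    "sentence_categories": ["placeholder_category_1", "placeholder_category_2"],
}


def trials_per_session(n_sentences: int) -> int:
    """Rough trial count for one session, given how many sentences a participant creates."""
    reps = sum(PARADIGM["repetitions_per_sentence"].values())  # 5 overt + 15 imagined = 20
    return n_sentences * reps
```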
Utilizing a 64-channel Brain Vision EEG system with an international 10-20 system channel montage, the research team conducted a pilot experiment with one subject to refine their EEG recording setup and process. Feedback from the pilot study was used to enhance the data collection procedure. The data collected underwent rigorous analysis to ensure quality and validity. Subsequently, based on the successful pilot results, the main experiment proceeded, aiming to involve a minimum of 30 participants in further investigations.
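For reference, loading such a recording and applying a standard 10-20 montage is straightforward with MNE-Python; the file name and the filtering choices below are illustrative and not the team's exact pipeline.

```python
# A minimal sketch of reading a BrainVision recording and applying a standard
# 10-20 montage with MNE-Python; the file name is hypothetical.

import mne

raw = mne.io.read_raw_brainvision("pilot_subject_session01.vhdr", preload=True)
raw.set_montage("standard_1020", on_missing="warn")  # map the 64 channels to 10-20 positions

# Basic preprocessing often used for speech-imagery EEG (illustrative choices only).
raw.filter(l_freq=1.0, h_freq=45.0)
raw.notch_filter(freqs=50.0)  # adjust to the local mains frequency (50 or 60 Hz)
```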
Visualization and comparative analysis using topographic plots were conducted to explore speech imagery based on data from the pilot experiment. The team observed reduced brain activity during speech imagery compared to overt speech, indicating distinct neural engagement. Overt speech showed increased overall channel activity, partly due to mouth movements, and left hemisphere activity was notably higher during overt speech across different sentence types. In contrast, speech imagery showed specific activation in left hemisphere channels, suggesting intensified activity in areas such as Broca's and Wernicke's areas during imagined speech and highlighting the potential for effective decoding. Further machine learning and deep learning analysis was deemed necessary to understand these neural patterns fully.
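A comparison of this kind can be sketched with MNE-Python as follows, continuing from the loading example above; the annotation names, event mapping, and epoch windows are assumptions made for illustration.

```python
# Illustrative topographic comparison of overt speech vs. speech imagery,
# assuming the two conditions were marked as annotations in the recording.

import mne

events, _ = mne.events_from_annotations(raw)         # assumes conditions were annotated
event_id = {"overt_speech": 1, "speech_imagery": 2}  # placeholder mapping

epochs = mne.Epochs(raw, events, event_id=event_id,
                    tmin=-0.2, tmax=2.0, baseline=(None, 0), preload=True)

# Side-by-side scalp maps make hemispheric and channel-level differences visible.
epochs["overt_speech"].average().plot_topomap(times=[0.5, 1.0, 1.5], ch_type="eeg")
epochs["speech_imagery"].average().plot_topomap(times=[0.5, 1.0, 1.5], ch_type="eeg")
```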
The discussion highlighted distinct brain activity patterns between overt speech and speech imagery, with higher activation observed in the left hemisphere during speech imagery, suggesting a targeted approach to enhance BCI communication tools. Achieving high performance in speech imagery decoding necessitated developing deep learning models capable of capturing these unique neural patterns. The study also emphasized the importance of recording data across multiple sessions to address session dependency. Future plans included designing deep neural active learning methods to improve semantic decoding of EEG signals, aiming for accurate and fluent sentence generation from speech imagery data. The ultimate goal was to advance high-performance, semantic, interactive speech imagery decoding technology for intuitive and scalable BCI applications.
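As one illustration of what deep neural active learning could look like in this setting, a common approach is uncertainty sampling over unlabeled imagery epochs; the talk did not specify its exact method, so the sketch below is purely illustrative.

```python
# Uncertainty-based query selection, one common form of active learning.
# The model is any classifier exposing predict_proba; nothing here is the
# speaker's actual method.

import numpy as np


def select_queries(model, unlabeled_epochs: np.ndarray, budget: int) -> np.ndarray:
    """Pick the EEG epochs the current model is least certain about for labelling."""
    probs = model.predict_proba(unlabeled_epochs)              # (n_epochs, n_classes)
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)   # per-epoch predictive entropy
    return np.argsort(entropy)[::-1][:budget]                  # highest-entropy epochs first


# Typical loop: train -> query the most uncertain imagery epochs -> label -> retrain,
# concentrating labelling effort where the decoder is weakest.
```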
9th Edition of International Conference on Neurology and Neurological Disorders, June 20-22, 2024, Paris, France



