Abstract
Brain-machine interfaces (BMIs) translate dynamic neural activity into movement intention without requiring actual limb movements. In a closed-loop BMI scenario, the environment provides external information (reward or sensory feedback) to indicate the status of the neuroprosthesis (e.g., a computer cursor), and the subject adjusts its neural activity accordingly to control the external device and obtain future reward. Existing BMI tasks are usually pre-defined by experts, simple to accomplish, and demand large amounts of neural data. Ideal BMI systems should enable subjects to learn new tasks autonomously according to their own intentions and to accomplish more complicated tasks in an online framework.

First, we propose an internally rewarded reinforcement learning (RL) framework that reflects the subject's true intention during task learning. Current BMIs mainly rely on motor-related areas such as the primary motor cortex (M1) and premotor areas, so information from other brain areas that participate in learning is not fully utilized. In particular, the brain implements a reward-guided learning mechanism, but this mechanism has not been exploited in autonomous BMIs. We extract an internal representation of reward from the medial prefrontal cortex (mPFC) and use it to train the RL-based decoder. The proposed framework achieves performance similar to an externally rewarded decoder and copes with time-variant neural patterns, which change rapidly during task learning.
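To make the idea concrete, the sketch below shows what such an internally rewarded update could look like, assuming a discretized state-action space and a linear mPFC reward readout. The function `decode_internal_reward`, the toy environment, and all parameter values are illustrative stand-ins, not the dissertation's implementation.

```python
# Minimal sketch: Q-learning where the reward term comes from a decoded
# mPFC signal (internal reward) instead of the external task reward.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 16, 4        # assumed discretization of neural states / cursor moves
q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1  # illustrative hyperparameters

def decode_internal_reward(mpfc_rates, w):
    """Hypothetical mPFC reward decoder: the sign of a linear readout is
    taken as the internal reward (+1 rewarding, -1 non-rewarding)."""
    return 1.0 if mpfc_rates @ w > 0 else -1.0

w = rng.normal(size=8)             # hypothetical reward-readout weights

def step(state, action):
    """Toy environment standing in for the closed-loop BMI task."""
    next_state = (state + action) % n_states
    mpfc_rates = rng.normal(loc=0.5 if next_state == 0 else -0.5, size=8)
    return next_state, mpfc_rates, next_state == 0

state = int(rng.integers(n_states))
for t in range(1000):
    action = int(rng.integers(n_actions)) if rng.random() < eps else int(np.argmax(q[state]))
    next_state, mpfc_rates, done = step(state, action)
    r = decode_internal_reward(mpfc_rates, w)   # internal, not external, reward
    target = r + (0.0 if done else gamma * np.max(q[next_state]))
    q[state, action] += alpha * (target - q[state, action])
    state = int(rng.integers(n_states)) if done else next_state
```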
Second, we propose an intermediate sensory feedback-assisted RL framework for multi-step tasks. As BMI tasks become more complicated, it is hard to learn the neural-action mappings before the subjects disengage. We use mPFC neural activity recorded when the subject receives sensory feedback to generate intermediate guidance, so that instead of relying only on the final reward, the decoder can be updated with this evaluative information during the trial. Moreover, we embed a temporal difference (TD) method into a kernel RL structure to explore the state-action mapping effectively in complicated BMI scenarios. Our framework achieves faster and better decoding performance than state-of-the-art methods. The results demonstrate the possibility of better multi-step decoding performance for more complicated BMI tasks.
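A minimal sketch of a kernel TD(0) update driven by intermediate rewards follows, assuming a Gaussian kernel and an online-growing dictionary of kernel centers. The specific kernel, growth rule, and reward timing here are assumptions for illustration, not the exact algorithm of the dissertation.

```python
# Minimal sketch: kernel TD(0) value update where the TD error uses an
# intermediate reward decoded from mPFC feedback responses mid-trial.
import numpy as np

rng = np.random.default_rng(1)
alpha, gamma, sigma = 0.2, 0.9, 1.0  # illustrative hyperparameters
centers, weights = [], []            # kernel dictionary built online

def k(x, c):
    """Gaussian kernel between neural-state vectors (an assumption)."""
    return np.exp(-np.sum((x - c) ** 2) / (2 * sigma ** 2))

def value(x):
    return sum(w * k(x, c) for w, c in zip(weights, centers))

def ktd_update(x, r_intermediate, x_next, done):
    """One kernel-TD step: the update is driven by the intermediate
    evaluation signal instead of waiting for the final reward."""
    delta = r_intermediate + (0.0 if done else gamma * value(x_next)) - value(x)
    centers.append(x.copy())         # grow the dictionary at the visited state
    weights.append(alpha * delta)

# Hypothetical multi-step trial: neural state drifts; a feedback-derived
# reward arrives mid-trial rather than only at the end.
x = rng.normal(size=4)
for step_i in range(10):
    x_next = x + 0.1 * rng.normal(size=4)
    r_mid = 0.5 if step_i == 5 else 0.0   # intermediate evaluation signal
    ktd_update(x, r_mid, x_next, done=(step_i == 9))
    x = x_next
```

Growing a kernel center at every visited state keeps the value function nonparametric, which is one plausible way to cover the larger state space of a multi-step task without pre-defining a basis.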
Third, we propose a task engagement-assisted continuous RL-based decoding framework to improve online learning efficiency. The subject's task engagement varies from trial to trial during learning, yet this information has seldom been used in single-trial analysis. Knowing to what degree the subject is engaged in the task can guide the decoder to learn the neural-action mapping flexibly. In this work, a biomarker indicating task engagement is extracted from the mPFC, and the estimated engagement is used to modulate the training of the continuous RL decoder. The results show that the estimated task engagement from the mPFC improves decoding performance and yields better trajectory reconstruction in online continuous decoding tasks.
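Below is a minimal sketch of how an engagement estimate could gate the updates of an actor-critic decoder, assuming a linear policy over neural features and a sigmoid engagement readout; `estimate_engagement` and all parameters are hypothetical.

```python
# Minimal sketch: engagement-modulated actor-critic updates for
# continuous (velocity) cursor control.
import numpy as np

rng = np.random.default_rng(2)
dim_neural, dim_action = 16, 2
W_actor = np.zeros((dim_action, dim_neural))   # linear policy mean
v = np.zeros(dim_neural)                        # linear critic weights
alpha_a, alpha_c, gamma, noise = 0.01, 0.05, 0.95, 0.1

def estimate_engagement(mpfc_rates):
    """Stand-in engagement biomarker: squash a linear readout into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-mpfc_rates.mean()))

def update(x, mpfc_rates, r, x_next, done):
    a_mean = W_actor @ x
    a = a_mean + noise * rng.normal(size=dim_action)   # exploratory velocity
    delta = r + (0.0 if done else gamma * (v @ x_next)) - v @ x
    eng = estimate_engagement(mpfc_rates)              # per-step engagement gate
    v[:] += alpha_c * eng * delta * x                  # engagement scales both steps
    W_actor += alpha_a * eng * delta * np.outer(a - a_mean, x)
    return a

# Hypothetical closed-loop run with synthetic neural data.
x = rng.normal(size=dim_neural)
for t in range(200):
    mpfc = rng.normal(size=8)
    x_next = rng.normal(size=dim_neural)
    r = 1.0 if t % 20 == 19 else 0.0       # placeholder task reward
    update(x, mpfc, r, x_next, done=(t == 199))
    x = x_next
```

Scaling both the actor and critic steps by the engagement estimate means low-engagement trials contribute little to the learned mapping, which is one plausible way to realize the flexible learning described above.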
| Date of Award | 2022 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Yiwen WANG (Supervisor) |