BICS 2008 – Brain Inspired Cognitive Systems

Co-Adaptive Learning in Brain-Machine Interfaces

Babak Mahmoudi 1, Jack DiGiovanna 2, Jose C. Principe 3, and Justin C. Sanchez 4

1-2 Department of Biomedical Engineering, University of Florida, Gainesville, FL 32608 USA
babakm@ufl.edu, jfd134@ufl.edu
3 Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611 USA
principe@cnel.ufl.edu
4 Department of Pediatrics, Division of Neurology, University of Florida, Gainesville, FL 32610 USA
jcs77@ufl.edu

Abstract- This paper studies the cooperation between an artificial agent and a biological organism in accomplishing a goal-directed task in the context of a Reinforcement Learning Brain-Machine Interface (RLBMI). The artificial agent learns through Temporal Difference (TD) learning to adapt its model parameters, whereas the biological agent's learning is expressed in the temporally specific modulation of its neural activity. We observed that these two completely different learning mechanisms coexist and both contribute to the overall learning of the system in achieving reward, which is characterized by a behavioral learning-curve criterion. This learning paradigm may introduce a new framework for specifying the engineering principles of next-generation adaptive systems.

Index Terms— Brain-Machine Interface, Neuroprosthetic, Reinforcement Learning, Co-adaptation.

I. INTRODUCTION

Learning through reinforcement is a critical adaptive mechanism that allows animals to shape their behaviors to maximize rewards from the environment [1]. A remarkable aspect of the neural processes that support such behavior is that they are capable of responding almost instantaneously across a variety of changing environments. Discovering the principles by which animals use neural representation and timing has great potential to specify the engineering architecture of the next generation of adaptive systems.
For artificial systems, many developments in this direction have come from the machine learning paradigm known as reinforcement learning (RL) [2-5]. In the RL framework, the learner (called the agent) continually interacts with its environment, and after each interaction the agent receives a reward from that environment. The agent tries to maximize the rewards earned over time. While RL has been an influential computational theory in neuroscience, several aspects still need to be investigated, namely its speed of adaptation, the realism of its modeling of brain learning, and its ability to explain the variety of neural systems involved [6].

One approach for determining the design principles of cooperative adaptation in more detail is to study the direct interaction between brain and machine, as is done in neuroprosthetics. The beauty of Brain-Machine Interface (BMI) technology is that adaptive models serve as surrogate communication channels for neural systems. This technology provides a framework for studying theories of interactive learning from both engineering and neuroscience perspectives. Much work has already been done on BMIs, however, primarily within a static input-output modeling framework [7-9], and the concept of co-adaptation [10, 11] has yet to be fully realized.

In pursuit of co-adaptive neural interfaces, we have proposed a BMI paradigm based on a modified interpretation of reinforcement learning (RLBMI) [12, 13]. In this paradigm, an intelligent artificial agent learns, through interaction with the user's brain, to map neural activity to actions that are desirable for the user. In this process, the artificial agent receives rewards or punishments based on the user's satisfaction. The user, in turn, adapts his or her neural modulation toward the common goal based on an understanding of the agent's behavior.
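The agent-environment interaction loop described above can be sketched in a few lines of code. The following is a minimal illustration only, not the paper's RLBMI implementation: the 1-D chain environment, the reward function, and all parameter values are hypothetical, and a standard tabular TD (Q-learning) update stands in for the agent's adaptation.

```python
import random

# States 0..4 along a chain; reaching state 4 delivers reward (hypothetical task).
N_STATES = 5
ACTIONS = (-1, +1)                # move left or right
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1 # learning rate, discount, exploration rate

# Tabular action-value estimates, initialized to zero.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = (nxt == N_STATES - 1)
    return nxt, (1.0 if done else 0.0), done

def policy(state):
    """Epsilon-greedy action selection."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

random.seed(0)
for episode in range(200):
    s, done = 0, False
    while not done:
        a = policy(s)
        s2, r, done = step(s, a)
        # TD update: move Q(s, a) toward the reward plus the discounted
        # value of the best action available in the next state.
        target = r + (0.0 if done else GAMMA * max(Q[(s2, b)] for b in ACTIONS))
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

# Greedy action in each non-terminal state after learning.
greedy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)]
```

After training, the greedy policy moves toward the rewarding state from every starting position, illustrating how repeated interaction and TD updates alone, without any explicit model of the environment, shape behavior toward reward.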
This co-adaptation opens a new field of interactive learning in which synergy among adaptive components can facilitate learning. Multi-agent learning lies at the intersection of machine learning and multi-agent systems, where all of the learners are artificial. The novelty of our approach is the interaction between artificial and biological agents. This paradigm provides a platform to study machine learning and biological learning, as well as the mutual learning that emerges from their interaction. In this paper, we present co-adaptive learning in the context of an experimental BMI that requires coordination between artificial and biological intelligence to solve a motor task of reaching and grasping. We quantify the relative speed of adaptation of both the computational model and the neural modulation to better specify the engineering of next-generation co-adaptive BMIs.

II. METHODS

A. Computational framework