Learning to Cooperate with Human Evaluative Feedback and Demonstrations

Mehul Verma a,1 and Erman Acar a,b
a Vrije Universiteit Amsterdam, The Netherlands
b LIACS, Universiteit Leiden, The Netherlands

Abstract. Cooperation is a widespread phenomenon in nature that has also been a cornerstone in the development of human intelligence. Understanding cooperation, therefore, on matters such as how it emerges, develops, or fails is an important avenue of research, not only in a human context, but also for the advancement of next-generation artificial intelligence paradigms which are presumably human-compatible. With this motivation in mind, we study the emergence of cooperative behaviour between two independent deep reinforcement learning (RL) agents provided with human input in a novel game environment. In particular, we investigate whether evaluative human feedback (through interactive RL) and expert demonstration (through inverse RL) can help RL agents learn to cooperate better. We report two main findings. Firstly, we find that the amount of feedback given has a positive impact on the accumulated reward obtained through cooperation. That is, agents trained with a limited amount of feedback outperform agents trained without any feedback, and performance increases even further as more feedback is provided. Secondly, we find that expert demonstration also helps agents' performance, although with more modest improvements compared to evaluative feedback. In conclusion, we present a novel game environment to better understand the emergence of cooperative behaviour and show that providing human feedback and demonstrations can accelerate this process.

Keywords. Multiagent Reinforcement Learning, Multiagent Cooperation, Inverse Reinforcement Learning, Interactive Reinforcement Learning.

1. Introduction

While artificial intelligence (AI) technologies are playing more important roles in our daily lives than ever, designing intelligent systems which can work with humans more effectively (instead of replacing them) is becoming a central research challenge [1,2,3,4]. This goal is often described as combining human and machine intelligence, aiming to benefit from the strengths of both in solving problems across various scenarios. Developing such systems requires fundamentally novel solutions to major research problems in AI: it is no secret that current AI systems outperform humans in many cognitive tasks, from pattern recognition [5] to playing video games [6], yet they fall short when it comes to tasks such as causal modelling, common-sense reasoning, and behavioural human capabilities such as explaining their own decisions and adapting to differ-

1 Corresponding author: Mehul Verma, email address: mehuljan26@gmail.com

HHAI2022: Augmenting Human Intellect, S. Schlobach et al. (Eds.). © 2022 The authors and IOS Press. This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License 4.0 (CC BY-NC 4.0). doi:10.3233/FAIA220189