Q-Learning based Market-Driven Multi-Agent Collaboration in Robot Soccer

Hatice Köse, Utku Tatlıdede, Çetin Meriçli, Kemal Kaplan and H. Levent Akın
Boğaziçi University, Department of Computer Engineering
34342 Bebek, Istanbul, TURKEY

Abstract. This work proposes a novel approach that introduces a market-driven multi-agent collaboration strategy with a Q-learning based behavior assignment mechanism to the robot soccer domain, in order to solve issues related to multi-agent coordination. Robot soccer differs from many other multi-agent problems in its highly dynamic and complex nature. The market-driven approach applies the basic properties of a free market economy to a team of robots in order to increase the profit of the team as much as possible. For the benefit of the team, the robots should work collaboratively whenever possible. Through Q-learning, a more successful behavior assignment policy has been achieved after a set of training games, and the team with the learned strategy is shown to outperform the original purely market-driven team.

1 Introduction

Although the utilization of multi-robot teams has recently become popular, as their performance has been shown to be better, more reliable and more flexible than that of single robots in a variety of tasks, the applications are still limited. Problems in the coordination of the robots, the efficient usage of limited resources and the communication burden discourage researchers from working on real-time problems with dynamic environments. Robot soccer is a good testbed for multi-agent collaboration in real-time, complex and dynamic environments. Recently, the market-driven approach was introduced as an alternative method for robot coordination by Dias and Stentz [1]. It is highly robust and avoids single points of failure, while increasing the team performance considerably. There are several applications of the market-driven approach. The work in Zlot et al. [2] introduces the approach to multi-robot exploration.
In Gerkey and Matarić [3] a work on auction-based multi-robot coordination is presented. These implementations seem to work well but are limited due to the static nature of their environments. Domains like agricultural areas are simple and static, and do not require task allocation, planning and coordination as fast as robot soccer does. In Köse et al. [4] and [5], a market-based algorithm is used for multi-robot coordination in robot soccer. Although the algorithm works well, a more flexible approach could be achieved by embedding a learning policy to provide an adaptive approach.
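As background for the learning policy discussed above, the standard tabular Q-learning update can be sketched as follows. The states and behaviors used here are hypothetical placeholders for illustration only, not the representation used in the paper:

```python
import random

# Minimal tabular Q-learning sketch. STATES and BEHAVIORS are
# hypothetical placeholders, not the paper's actual representation.
STATES = ["ball_near", "ball_far"]     # placeholder world states
BEHAVIORS = ["attack", "defend"]       # placeholder robot behaviors

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2  # learning rate, discount, exploration rate

# Q-table initialized to zero for every (state, behavior) pair
Q = {(s, b): 0.0 for s in STATES for b in BEHAVIORS}

def choose_behavior(state):
    """Epsilon-greedy behavior selection: mostly exploit, sometimes explore."""
    if random.random() < EPSILON:
        return random.choice(BEHAVIORS)
    return max(BEHAVIORS, key=lambda b: Q[(state, b)])

def update(state, behavior, reward, next_state):
    """Standard Q-learning update:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_b Q(s',b) - Q(s,a))."""
    best_next = max(Q[(next_state, b)] for b in BEHAVIORS)
    Q[(state, behavior)] += ALPHA * (reward + GAMMA * best_next - Q[(state, behavior)])

# Example: reward the "attack" behavior when the ball is near
update("ball_near", "attack", 1.0, "ball_far")
```

After many such updates over training games, selecting the behavior with the highest Q-value in each state yields a learned assignment policy of the kind the abstract describes.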