X. Chen et al. (Eds.): RoboCup 2012, LNAI 7500, pp. 237–248, 2013. © Springer-Verlag Berlin Heidelberg 2013

A Distributed Cooperative Reinforcement Learning Method for Decision Making in Fire Brigade Teams

Abbas Abdolmaleki 1,2, Mostafa Movahedi, Nuno Lau 1,3, and Luís Paulo Reis 2,4

1 IEETA – Institute of Electronics and Telematics Engineering of Aveiro, Portugal
2 LIACC – Artificial Intelligence and Computer Science Lab., Porto, Portugal
3 UA – University of Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal
4 EEUM – School of Engineering, University of Minho – DSI, Campus de Azurém, 4800-058 Guimarães, Portugal
{Abbas.a,nunolau}@ua.pt, mr.mos.movahedi@gmail.com, lpreis@dsi.uminho.pt

Abstract. Decision making in complex, dynamic, multi-agent environments such as disaster spaces is a challenging problem in Artificial Intelligence. This paper aims to develop a distributed coordination and cooperation method based on reinforcement learning that enables a team of homogeneous, autonomous fire fighter agents with similar skills to accomplish complex task allocation, with emphasis on firefighting tasks in a disaster space. The main contribution is applying reinforcement learning to address the bottleneck caused by the dynamic nature and variety of conditions in such situations, as well as improving the distributed coordination of fire fighter agents extinguishing fires within a disaster zone. The proposed method increases the speed of learning, has very low memory usage, and scales well and remains robust as the number of agents and the complexity of the task increase. The effectiveness of the proposed method is shown through simulation results.

Keywords: RoboCup Rescue Simulation, Multi-agent system, Fire Brigade, Decision Making, Reinforcement Learning.

1 Introduction

Several authors have proposed general models for flexible coordination of agents.
However, most of these approaches are either not sufficiently reactive to perform efficiently in real-time, dynamic domains, or do not give agents sufficiently developed social behavior to act intelligently as members of a team in continuous, multi-objective and complex multi-agent environments [1]. Notable among this research is the work of Stone and Veloso [2], which has been applied successfully to RoboCup soccer and network routing. To achieve complex behavior acquisition using machine learning methods, Stone and Veloso [3] proposed a layered learning system with basic skills such as “shootGoal”, “shootAway”, “dribbleBall”, and so on. Kleiner et al. [4] also proposed a multi-layered learning system for behavior acquisition of a soccer robot.