978-1-4244-9789-8/11/$26.00 ©2011 IEEE

DISASTER MANAGEMENT IN REAL TIME SIMULATION USING MACHINE LEARNING

Mohammed Khouj, César López, Sarbjit Sarkaria, José Marti

Electrical and Computer Engineering, University of British Columbia
Vancouver, BC, Canada
{mkhouj, clopez}@ece.ubc.ca, sarbjit.sarkaria@gmail.com, jrms@ece.ubc.ca

ABSTRACT

A series of carefully chosen decisions by an Emergency Responder during a disaster is vital in mitigating the loss of human lives and in supporting the recovery of critical infrastructures. In this paper we propose to assist a human Emergency Responder by modeling and simulating an intelligent agent using Reinforcement Learning. The goal of the agent is to maximize the number of patients discharged from hospitals or on-site emergency units. We suggest that by exposing such an intelligent agent to a large sequence of simulated disaster scenarios, the agent will capture enough experience and knowledge to select the actions that mitigate damage and casualties. This paper describes early results of our work, which indicate that Q-learning can successfully train an agent to make good choices during a simulated disaster.

Index Terms—Machine learning, intelligent system, critical infrastructures, real time simulation, disaster response management

1. INTRODUCTION

In natural or human-induced emergencies, a series of carefully chosen decisions is vital in mitigating death and damage following a catastrophe such as an earthquake. These decisions must be made on the basis of sound knowledge and experience. However, the worldwide frequency of such situations is fortunately low, and the likelihood of the same command and control personnel encountering similar scenarios repeatedly is slim, so opportunities to build up the necessary experience are severely limited.
This is the context of this paper, and we maintain that such decisions need to be carefully studied and evaluated before implementation. In this research, we propose a model to simulate an intelligent learning agent that is able to sense changes in the surrounding environment, measure the physical operability and resource availability of critical infrastructures, and take the actions needed to mitigate loss. In the case of disaster victims, the outcome is measured by the number of patients discharged from hospitals or on-site emergency units. We propose that by exposing an intelligent agent to a large number of simulated disaster scenarios, we will capture sufficient experience to enable the agent to make informed decisions that lead to the best outcomes in terms of casualties or other damage.

This paper describes our approach to implementing such an agent within the I2Sim system [1]. We document our initial work, showing how a popular learning methodology known as Reinforcement Learning (RL) can be applied to a small network of simulated infrastructure cells, and describe the application of RL to a disaster scenario modeled in I2Sim. Finally, we address plans for our future work, which suggest that approximation methods will be necessary when scaling the simulation to more complex scenarios.

2. RELATED WORK

The application of Artificial Intelligence (AI) techniques to human decision modeling is not new. Much research has addressed the need to assist decision makers in choosing the best among a number of available options. For instance, various AI approaches, including neural networks and expert systems, have been used to capture knowledge from human experts experienced in dealing with decisions at a manufacturing plant [2]. Other literature suggests using agent-based modeling, where agents may execute various behaviors appropriate for the system they represent [3].
The application of AI to modeling human decision making is appealing because it offers a number of benefits. For example:
1. Time savings and improved effectiveness
2. Ability to make and execute decisions many times
3. Ability to model and test specified events or scenarios
4. Minimal human intervention [4]

2.1. Reinforcement Learning

Reinforcement Learning (RL) represents a class of learning algorithms in which an agent gains knowledge through interactions with its environment. The mathematical

IEEE CCECE 2011 - 001507