A Machine Learning System for Controlling a Rescue Robot

Timothy Wiley 1, Ivan Bratko 2, and Claude Sammut 1

1 School of Computer Science and Engineering, The University of New South Wales, Sydney, Australia
{t.wiley,c.sammut}@unsw.edu.au
2 Faculty of Computer and Information Science, The University of Ljubljana, Slovenia
bratko@fri.uni-lj.si

Abstract. Many rescue robots are reconfigurable, having subtracks (or flippers) that can be adjusted to help the robot traverse different types of terrain. Knowing how to adjust them requires skill on the part of the operator. If the robot is intended to run autonomously, the control system must have an understanding of how the flippers affect the robot's interaction with the ground. We describe a system that first learns the effects of a robot's actions and then uses this knowledge to plan how to reconfigure the robot's tracks so that it can overcome different types of obstacles. The system is a hybrid of qualitative symbolic learning and reinforcement learning.

Keywords: Machine Learning, Qualitative Models, Rescue Robots

1 Introduction

Driving a wheeled robot on flat ground is relatively straightforward once the robot has a map of its environment. The operator only needs to control the steering and speed. Driving a tracked vehicle over rough terrain is much more difficult, especially if the vehicle has subtracks, or flippers, because the operator must make decisions about the configuration of the flippers, as well as steering and speed. Thus, subtracks give the robot greater terrain traversal capabilities at the expense of greater control complexity. Reconfigurable robots are commonly used in urban search and rescue, but are mostly tele-operated. However, remote control is impossible when there is a loss of communication. Therefore, rescue robots need at least enough autonomy to be able to navigate out of a radio dropout zone. The goal of this research is to develop an autonomous driving system for reconfigurable robots.
Since the interactions of the robot with the terrain in a disaster site are extremely difficult to predict, we use a learning system to build a model of how control actions, including changing flipper angles, affect the robot's state. Once we have the model, the driving system can plan a sequence of actions to achieve the desired goal state.
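The learn-a-model-then-plan loop described above can be sketched in miniature. The following is an illustrative sketch only, not the paper's system: it learns a discrete transition table from observed (state, action, next state) triples and then searches it for an action sequence reaching a goal state. All state and action names here are hypothetical examples invented for illustration.

```python
# Hypothetical sketch of "learn a model, then plan with it".
# States and actions are invented for illustration, not from the paper.
from collections import deque

def learn_model(observations):
    """Build a transition table: (state, action) -> next_state."""
    model = {}
    for state, action, next_state in observations:
        model[(state, action)] = next_state
    return model

def plan(model, start, goal):
    """Breadth-first search over the learned model for an action
    sequence that takes the robot from start to goal."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, actions = frontier.popleft()
        if state == goal:
            return actions
        for (s, a), nxt in model.items():
            if s == state and nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, actions + [a]))
    return None  # no plan found

# Toy observations: the robot can only climb a step after raising
# its flippers; driving forward without them leaves it stuck.
obs = [
    ("at_step", "raise_flippers", "flippers_up"),
    ("flippers_up", "drive_forward", "on_step"),
    ("at_step", "drive_forward", "at_step"),  # stuck
]
model = learn_model(obs)
print(plan(model, "at_step", "on_step"))
# -> ['raise_flippers', 'drive_forward']
```

A real system would of course face continuous states, noisy observations, and actions with uncertain effects, which is why the paper combines qualitative symbolic learning with reinforcement learning rather than relying on an exact transition table like this one.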