Decision-theoretic Robot Guidance for Active Cooperative Perception

Abdolkarim Pahliani    Matthijs T. J. Spaan    Pedro U. Lima

Abstract—We consider the problem of sensor-aware path planning for a robot in a Networked Robot System, in particular in urban environments equipped with a network of surveillance cameras. A robot can use observations from the camera network to improve its own localization performance, but it also needs to take into account the specifics of its local sensors. We model our problem in the Markov Decision Process framework, which forms a natural way to express concurrent and possibly conflicting objectives – such as reaching a goal quickly, keeping the robot localized, and keeping the target in sight – each with its own priority. We show how we can successfully prioritize the different objectives in a flexible way by changing the reward function, based on the sensory needs of the system.

I. INTRODUCTION

Robots are leaving the research labs and operating more often in human-inhabited environments, such as urban pedestrian areas. The main idea of the URUS (Ubiquitous Networking Robotics In Urban Settings) Project [1], [2] is to incorporate a network of intelligent components, e.g., robots, sensors, devices, and communications, in order to improve the quality of life in urban areas. The scenario we consider in our work is a group of robots assisting humans in a car-free area, a so-called Networked Robot System (NRS). The pedestrian area in which the robots operate is equipped with surveillance cameras that provide the robots with additional information. Implementing such a system requires addressing many scientific and technological challenges, such as cooperative localization and navigation, map building, human-robot interaction, and wireless networking, to name but a few. In this paper, we focus on one particular problem, namely how to plan paths for robots taking into account the coverage of the camera network as well as the robots' own sensors.
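The idea of prioritizing objectives through the reward function can be made concrete with a minimal sketch. The following Python fragment is illustrative only: the weights, objective predicates, and function name are assumptions, not taken from the paper, and a real implementation would evaluate these predicates from the MDP state.

```python
# Hypothetical sketch: combining prioritized, possibly conflicting
# objectives into a single MDP reward by weighting them. Weights and
# objective names are illustrative assumptions, not from the paper.

def composite_reward(at_goal, localized, target_visible,
                     w_goal=10.0, w_loc=1.0, w_vis=0.5, step_cost=0.1):
    """Weighted sum of objective-specific rewards minus a step cost."""
    r = -step_cost            # penalizing every step favors short paths
    if at_goal:
        r += w_goal           # highest priority: reaching the goal
    if localized:
        r += w_loc            # keep the robot well localized
    if target_visible:
        r += w_vis            # keep the target in the robot's view
    return r
```

Changing the relative weights re-prioritizes the objectives without touching the state or action spaces, which is the flexibility the reward-based approach is meant to provide.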
This work is supported by the European Project FP6-2005-IST-6-045062-URUS and by Fundação para a Ciência e a Tecnologia (ISR/IST pluriannual funding) through the POS_Conhecimento Program that includes FEDER funds, and by grant SFRH/BD/23394/2005. The authors are with the Institute for Systems and Robotics at Instituto Superior Técnico, Technical University of Lisbon, Portugal. {apahliani,mtjspaan,pal}@isr.ist.utl.pt

In many NRS, surveillance cameras will run a set of event detection algorithms, for instance detecting events such as people waving, people lying on the floor, fires, or other emergencies, each with a different priority. However, the network of cameras will have limited coverage and accuracy. In particular, the environment might contain blind spots that are not observed by any fixed camera. As such, even though the camera network is supposed to cover the scene, employing mobile robots for visual coverage is a necessity. A camera network might cover a lab environment, but providing full coverage of urban environments is a difficult task. There are often obstacles, both natural and man-made, in the environment which hide parts of the environment from the camera network. Even if we could employ a large number of cameras to keep the environment fully in view, dynamic obstacles can still create new hidden patches. Furthermore, other areas might be observed by a camera, but not with sufficient resolution for accurate event detection. In such cases, we send mobile robots to positions where higher-resolution images are required. In NRS, the interaction between the system and humans will largely be achieved through human-robot interaction, which in general requires a robot to be close to a human subject. In this work, we consider the problem of a robot planning a path to reach a target location. For instance, consider a situation where a robot needs to reach a human for interaction purposes.
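The notion of blind spots can be sketched as simple set arithmetic over a grid map. The function below is an illustrative assumption, not the paper's method: in practice each camera's coverage set would come from a visibility model of the environment, including occlusion by obstacles.

```python
# Illustrative sketch (assumed, not from the paper): identifying
# blind spots on a grid map given each fixed camera's coverage set.
# Cells left uncovered are candidate positions to send a mobile robot.

def blind_spots(all_cells, camera_coverage):
    """Return the cells not observed by any fixed camera."""
    covered = set().union(*camera_coverage) if camera_coverage else set()
    return set(all_cells) - covered

cells = {(x, y) for x in range(3) for y in range(3)}
coverage = [{(0, 0), (0, 1), (1, 1)},   # camera 1's visible cells
            {(2, 2), (1, 2)}]           # camera 2's visible cells
uncovered = blind_spots(cells, coverage)
```

Dynamic obstacles would shrink the coverage sets over time, so in a deployed NRS the blind-spot set would have to be recomputed as the scene changes.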
The robot should take into account the sensory capabilities provided by its own mounted sensors as well as by the network of surveillance cameras. In particular, a robot can use observations from the camera network for its own localization, or take into account the specifics of its mounted sensors to plan an approach to a target location that maximizes the information its sensors will give it about the target. We use a Markov Decision Process (MDP) framework to address our sensor-aware path planning problem [3], [4]. A decision-theoretic framework such as the MDP forms a natural way to express concurrent and possibly conflicting objectives, such as reaching the goal quickly, keeping the robot localized, and keeping the target in sight, each with its own priority. Given the partially observable nature of the problem, modeling it as a partially observable MDP (POMDP) would be appropriate. However, given the scale and level of detail of the problems we are targeting, with many states and, more importantly, a large number of possible observations and a long planning horizon, this is beyond current state-of-the-art approximate POMDP planners.

II. RELATED WORK

In related work, the Coastal Navigation algorithm models the problem of navigating a robot while keeping localization uncertainty low as a POMDP [5]. It converts the POMDP into an augmented MDP, which has an extended state space composed of robot locations and discretized entropy levels. The entropy is used as a measure of the uncertainty in the robot's localization. In our case, we keep the size of the state space constant, focusing on modifying the reward function instead. This is a flexible way of incorporating different objectives, beyond only caring about the robot's localization certainty: we also consider the visibility of the target by the robot. Keeping a constant state space allows for quick
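The augmented-MDP construction from Coastal Navigation can be sketched briefly. This is an illustrative reading of the idea, not the original implementation: the belief entropy of the robot's discrete localization distribution is binned into a fixed number of levels, and each augmented state pairs a location with such a level.

```python
import math

# Sketch of the augmented-MDP idea used by Coastal Navigation [5]
# (an illustrative reconstruction, not the authors' code): the state
# couples a robot location with a discretized belief-entropy level.

def belief_entropy(belief):
    """Shannon entropy (in nats) of a discrete localization belief."""
    return -sum(p * math.log(p) for p in belief if p > 0.0)

def entropy_level(belief, num_levels, num_states):
    """Bin the entropy into num_levels levels; max entropy is log(n)."""
    h_max = math.log(num_states)
    level = int(belief_entropy(belief) / h_max * num_levels)
    return min(level, num_levels - 1)

# A peaked belief maps to level 0 (well localized); a uniform belief
# maps to the highest level (maximally uncertain). An augmented state
# is then the pair (location, entropy_level).
```

Because every location is duplicated once per entropy level, the augmented state space grows with the number of levels, which is the cost that keeping a constant state space and modifying the reward avoids.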