A human body model initialization approach made real-time capable through heuristic constraints
M. Lösch, S. Gärtner, S. Knoop, S. R. Schmidt-Rohr and R. Dillmann
Institute for Anthropomatics, University of Karlsruhe, Germany
{loesch|gaertner|knoop|srsr|dillmann}@ira.uka.de
Abstract: Current research in service robotics is increasingly aimed at applications in real home environments. In such a context, the ability to track and understand human movements is very important for a robot, both for human-robot interaction and for other purposes such as proactive behavior, because gestures and motions are an important channel of information about a human's intentions. Before actual motion tracking can take place, the tracking system must be initialized with a hypothesis about the position and pose of the person to be tracked. For collaboration with humans in an unknown environment, the system should perform this step automatically. We therefore propose an approach that initializes a usable model of a human standing in front of the system by determining the position and height of the human from the silhouette with a cascade of simple metrics, e.g. compactness and the position of the neck.
I. INTRODUCTION
Future service robots that are meant to interact and cooperate closely with humans need a way to understand the movements, actions and intentions of the people they interact with, as depicted in Fig. 1, in particular if such robots are to be used in normal homes by people who are not familiar with robots. The robot needs this understanding because it has to plan its own actions and, in a cooperative context, must be able to predict the global plan and the intentions of its human partner. Although parts of this knowledge can be communicated by speech, it seems natural to convey other parts via channels such as gestures and the observation of the human's motions and actions.
For this purpose, we have developed a human tracking system based on an Iterative Closest Points (ICP) algorithm, which allows robust tracking as long as the system has been initialized with an appropriate model of the human. As this requirement of an initial pose is an essential limitation for an autonomous robot, we have tackled the problem of automatically deriving an initial body model from sensor data. In this paper, we present a new approach to this initialization problem that filters extracted silhouette data through a cascade of different metrics describing constraints for human silhouettes. The resulting algorithm has three properties that make it suitable for use on an autonomous robot. It is usable in real time, and it is stable with respect to clothing and robust against different lighting conditions, because the input data it uses is not color dependent.
We introduce the current state of the art in section II, followed by an overview of the tracking system into which the initialization is integrated. The initialization algorithm is presented in section IV, followed by evaluation results and a conclusion.
This work has been partially conducted within the German collaborative research center 588 "Humanoid Robots - Learning and Cooperating Multimodal Robots" granted by Deutsche Forschungsgemeinschaft and within the EC's Integrated Project DEXMART in the Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 126239.
Fig. 1. Example for interaction between human and robot.
II. STATE OF THE ART
A variety of sensors and models have been used in the attempt to observe and track human movements. The sensors range from invasive sensors that are fixed to the human body (e.g. magnetic field trackers [1], [2]) over multi-sensor fusion approaches [3] to approaches based on vision systems. Of these, only the vision-based approaches seem feasible for human-robot interaction in everyday life.
Tracking of humans or human body parts based on vision is a very active research field [4]. Many different approaches to this problem exist, ranging from simple 2D approaches such as skin color segmentation [5] or background subtraction techniques to complex reconstructions of the human body pose in 3D, e.g. using a particle filter as in [6] or model fitting in stereo images as in [7]. Demirdjian presents in [8] an ICP-based approach with a mathematical modelling of joint constraints. Although this approach seems to remove the effect of the ICP when enforcing joint constraints between the modelled limbs, it inspired the tracking system described in section III. All these methods depend on a good initialization of the model before the actual tracking can take place.
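To illustrate why ICP-based methods depend on a good initialization, the following is a minimal sketch of a generic point-to-point ICP in 2D. It is not the tracking system described in section III and not the method of [8]; it omits body models and joint constraints entirely. All function names and the example point sets are assumptions made for illustration only.

```python
import math


def nearest_point(p, points):
    """Return the point in `points` closest to `p` (brute-force search)."""
    return min(points, key=lambda q: (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


def rigid_fit(src, dst):
    """Least-squares rotation angle and translation mapping src[i] onto dst[i]."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = 0.0
    cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy  # source point relative to its centroid
        bx, by = qx - dx, qy - dy  # matched point relative to its centroid
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)  # closed-form optimal rotation in 2D
    c, s = math.cos(theta), math.sin(theta)
    tx = dx - (c * sx - s * sy)
    ty = dy - (s * sx + c * sy)
    return theta, tx, ty


def transform(points, theta, tx, ty):
    """Rotate each point by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(model, scene, iterations=10):
    """Alternate nearest-neighbour matching and rigid fitting."""
    current = list(model)
    for _ in range(iterations):
        matches = [nearest_point(p, scene) for p in current]
        theta, tx, ty = rigid_fit(current, matches)
        current = transform(current, theta, tx, ty)
    return current


# Hypothetical data: an L-shaped model and a slightly rotated and shifted copy.
model = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
scene = transform(model, 0.1, 0.2, -0.1)
aligned = icp_2d(model, scene)
max_error = max(math.hypot(a[0] - b[0], a[1] - b[1]) for a, b in zip(aligned, scene))
```

In this example the model starts close enough to the scene that every nearest-neighbour match is the correct one, so the alignment succeeds. With a poor starting pose the matches would be wrong and the iteration could settle in a wrong local minimum, which is the dependence on initialization noted above.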