Robot hand discovery based on visuomotor coherence

Ryo Saegusa, Giorgio Metta, Giulio Sandini

Abstract— This paper proposes a plausible approach for a robot to discover its own body based on the coherence of two different sensory feedbacks: vision and proprioception. The image cues of a moving region are stored in an image base with a visuo-proprioceptive coherence label. Coherence between vision and proprioception suggests that the visually detected object is correlated with the robot's own motor functions. By building the image base autonomously, a humanoid robot discovers its own hand without any prior knowledge of the hand's appearance, such as a predefined visual marker. The robot also keeps tracking the hand while distinguishing it from other objects. All modules of visual and proprioceptive processing are distributed over the network, which allows online perception and interaction.

I. INTRODUCTION

How can a robot know its own hand? This is a fundamental question for embodied intelligence, and also for the early life of primates. We can recognize our body in general; however, it is hard to assume that we are inherently programmed to recognize every element of our body. Finding our own body and learning its sensorimotor functions appear to be cognitive and developmental processes. Our main interest in this work is to realize a human-like cognitive system that allows visuomotor coordination to perceive the self. Autonomous body discovery is considered essential for a robot to sense the boundary between its body and the environment, which is necessary for general object recognition and visuomotor imitation. An overview of our approach is depicted in Fig. 1. The principal idea is simply to move a part of the body (here we take the robot hand as the target body part) and monitor the coherence of the visual and proprioceptive feedbacks.
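As a concrete illustration, the coherence monitoring described above can be sketched as a correlation test between visual motion in the detected region and proprioceptive joint velocity. This is a minimal sketch, not the paper's actual measure: the function name, the windowed zero-lag correlation, and the threshold are all our assumptions for illustration.

```python
import numpy as np

def coherence_label(motion_energy, joint_velocity, window=30, threshold=0.5):
    """Label the current frame as motor-correlated when visual motion
    in the moving region co-varies with proprioceptive joint velocity.

    Illustrative sketch: uses the normalized correlation over the last
    `window` samples at lag 0; the paper's coherence measure may differ.
    """
    m = np.asarray(motion_energy[-window:], dtype=float)
    v = np.asarray(joint_velocity[-window:], dtype=float)
    # No variation in either signal means no evidence of coherence.
    if m.std() < 1e-9 or v.std() < 1e-9:
        return False
    corr = np.corrcoef(m, v)[0, 1]
    return bool(corr > threshold)
```

A label of True marks the moving region as part of the robot's own body (e.g. the hand driven by the current motor command); False marks it as an external object, such as a moving person.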
At every moment, image patches of the moving region are stored in an image base with a visuomotor coherence label, and are visually classified online into a certain number of clusters. The most motor-correlated image cluster should then be the image cluster of the hand. Using the image base, a region of interest in the view is recognized as the hand or as another object by comparing its distance to the representatives of the image clusters.

This paper is organized as follows: Section II describes related work. Section III describes the proposed framework and its details. Section IV describes the experimental results with the humanoid robot James [1].

Fig. 1. Visuomotor-coherence-based robot hand discovery. The robot generates arm movements and senses the visual and proprioceptive feedbacks. When the image of the moving region is coherent with the proprioceptive motor sensing, the image is recorded with a visuomotor coherence label. After short-term arm movements or interaction with people, the robot can visually recognize its body parts while distinguishing them from other objects. The vision system can then track and localize the detected hand visually online.

This work was partially supported by the RobotCub project (IST-2004-004370) funded by the European Commission through Unit E5 Cognitive Systems.

R. Saegusa is with the Robotics, Brain and Cognitive Sciences Dept., Italian Institute of Technology, Via Morego 30, 16163 Genoa, Italy. ryos@ieee.org, ryo.saegusa@iit.it

G. Metta and G. Sandini are with LIRA-Lab, University of Genoa, Viale Causa 13, 16145 Genova, Italy, and with the Robotics, Brain and Cognitive Sciences Dept., Italian Institute of Technology, Via Morego 30, 16163 Genoa, Italy. pasa@liralab.it, giulio.sandini@iit.it
Section V gives the conclusion and outlines future work.

II. RELATED WORK

Sensorimotor coordination is well studied in robotics, and there are many excellent works relevant to the authors' interests: sensory prediction (Wolpert et al. [2], Kawato et al. [3]) and learning-based motor control (Atkeson et al. [4][5], Schaal et al. [6]). We have also been pursuing a constructive understanding of human cognition with humanoid robots, for example the mirror system (Metta et al. [7]) and the production-perception link (Fitzpatrick et al. [8]), involving neuroscientific aspects and developmental psychology. The literature on imitation learning (Schaal et al. [9], Calinon et al. [10][11]) is also related to this topic.

In studies on sensorimotor coordination, body detection is often hand-coded with predefined detection rules such as visual markers or knowledge of the body structure. Predefined rules give the system robustness in detection, but also impose certain limits. Consider a manipulation task using five robot fingers: we would probably need to model how the robot hand is visually projected onto the view, or set up five color markers to distinguish the fingers. Moreover, when we attempt tool manipulation, hand detection becomes even more difficult if the robot still depends on such predefined knowledge.

978-1-4244-4775-6/09/$25.00 © 2009 IEEE. Proceedings of the 2009 IEEE International Conference on Robotics and Biomimetics, December 19-23, 2009, Guilin, China.
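In contrast to the predefined-marker approaches above, the image base sketched in the Introduction needs no appearance priors: patches are clustered online and the cluster with the highest fraction of motor-correlated labels is taken as the hand. The following sketch is our own illustration under stated assumptions (a nearest-representative online clustering with a fixed merge radius, Euclidean distance on patch descriptors, and the class/method names are hypothetical); the paper's actual clustering scheme may differ.

```python
import numpy as np

class ImageBase:
    """Illustrative image base: stores patch descriptors with a visuomotor
    coherence label and clusters them online by nearest representative."""

    def __init__(self, max_clusters=5, radius=1.0):
        self.reps = []     # one representative (running-mean descriptor) per cluster
        self.counts = []   # number of patches absorbed per cluster
        self.coh = []      # number of motor-correlated patches per cluster
        self.max_clusters = max_clusters
        self.radius = radius

    def add(self, desc, coherent):
        """Store one patch descriptor with its coherence label; return its cluster."""
        desc = np.asarray(desc, dtype=float)
        if self.reps:
            dists = [np.linalg.norm(desc - r) for r in self.reps]
            k = int(np.argmin(dists))
            # Merge into the nearest cluster if close enough, or if full.
            if dists[k] < self.radius or len(self.reps) >= self.max_clusters:
                n = self.counts[k]
                self.reps[k] = (self.reps[k] * n + desc) / (n + 1)
                self.counts[k] = n + 1
                self.coh[k] += int(coherent)
                return k
        self.reps.append(desc)
        self.counts.append(1)
        self.coh.append(int(coherent))
        return len(self.reps) - 1

    def hand_cluster(self):
        """The most motor-correlated cluster is the putative hand."""
        ratios = [c / n for c, n in zip(self.coh, self.counts)]
        return int(np.argmax(ratios))

    def classify(self, desc):
        """Recognize a region of interest as hand (True) or other object (False)
        by the distance to the cluster representatives."""
        desc = np.asarray(desc, dtype=float)
        dists = [np.linalg.norm(desc - r) for r in self.reps]
        return int(np.argmin(dists)) == self.hand_cluster()
```

After a short period of arm movement (coherent patches) and interaction with people or objects (non-coherent patches), `classify` lets the robot keep tracking the hand while distinguishing it from other moving regions, with no predefined marker.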