Robot hand discovery based on visuomotor coherence
Ryo Saegusa, Giorgio Metta, Giulio Sandini
Abstract— This paper proposes a plausible approach for a robot to discover its own body based on the coherence of two different sensory feedbacks: vision and proprioception. The image cues of a moving region are stored in an image base with a visuo-proprioceptional coherence label. The existence of coherence between vision and proprioception suggests that the visually detected object is correlated with the robot's own motor functions. By building the image base autonomously, a humanoid robot discovers its own hand without any prior knowledge of the hand's appearance, such as a predefined visual marker. The robot also keeps tracking the hand while distinguishing it from other objects. All modules of visual and proprioceptional processing are distributed over the network, which allows online perception and interaction.
I. INTRODUCTION
How can a robot know its own hand? This is a fundamental question for embodied intelligence and also for the early life of primates. We can recognize our body in general; however, it would be hard to assume that we are inherently programmed to recognize all of our body elements. Finding our own body and knowing its sensorimotor functions seem to be cognitive and developmental processes. Our main interest in this work is to realize a human-like cognitive system allowing visuomotor coordination to perceive the self. Autonomous body discovery is considered essential for robots to sense the boundary between the body and the environment, which would be necessary for general object recognition and visuomotor imitation.
The overview of our approach is depicted in Fig. 1. The principal idea is simply to move a part of the body, here we take a robot hand as the target body part, and to monitor the coherence of the visual and proprioceptional feedbacks. At every moment, the image patches of the moving region are stored in an image base with a visuomotor coherence label and are visually classified online into a certain number of clusters. The most motor-correlated image cluster should then be the image cluster of the hand. Making use of the image base, a region of interest in the view is recognized as the hand or as another object by comparing it, under the visual metric, with the representatives of the image clusters.
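To make this procedure concrete, the following is a minimal Python sketch of the coherence labeling and online clustering described above; it is not the implementation running on the robot. The feature representation, the thresholds, the number of clusters, and the incremental k-means clusterer are assumptions introduced here purely for illustration.

import numpy as np

MOTION_THRESH = 0.05     # visual motion-energy threshold (assumed value)
JOINT_VEL_THRESH = 0.02  # proprioceptional "arm is moving" threshold (assumed value)

def coherence_label(motion_energy, joint_velocities):
    # Visual motion is coherent with proprioception when both channels report movement.
    moving_visually = motion_energy > MOTION_THRESH
    moving_motor = np.max(np.abs(joint_velocities)) > JOINT_VEL_THRESH
    return moving_visually and moving_motor

class ImageBase:
    # Stores moving-region patch features with a coherence label and clusters them
    # online with incremental k-means (a stand-in for the paper's image clustering).
    def __init__(self, k=4, dim=64):
        self.centroids = np.random.randn(k, dim) * 0.01
        self.counts = np.zeros(k)
        self.coherent_hits = np.zeros(k)  # co-occurrences of a cluster with arm motion

    def add(self, feature, coherent):
        j = int(np.argmin(np.linalg.norm(self.centroids - feature, axis=1)))
        self.counts[j] += 1
        self.centroids[j] += (feature - self.centroids[j]) / self.counts[j]
        self.coherent_hits[j] += float(coherent)
        return j

    def hand_cluster(self):
        # The most motor-correlated cluster is taken as the hand.
        return int(np.argmax(self.coherent_hits / np.maximum(self.counts, 1)))

    def classify(self, feature):
        j = int(np.argmin(np.linalg.norm(self.centroids - feature, axis=1)))
        return "hand" if j == self.hand_cluster() else "other"

# Tiny synthetic demo: "hand" patches co-occur with joint motion, object patches do not.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = ImageBase()
    hand_proto, obj_proto = rng.normal(0, 1, 64), rng.normal(3, 1, 64)
    for t in range(500):
        if t % 2 == 0:  # the arm is moving and its hand is seen moving
            feat, vel = hand_proto + rng.normal(0, 0.1, 64), rng.uniform(0.05, 0.2, 7)
        else:           # an external object moves while the arm is still
            feat, vel = obj_proto + rng.normal(0, 0.1, 64), np.zeros(7)
        base.add(feat, coherence_label(motion_energy=0.1, joint_velocities=vel))
    print(base.classify(hand_proto))  # expected: hand
    print(base.classify(obj_proto))   # expected: other

In this sketch the cluster whose members most frequently co-occur with proprioceptional motion is declared the hand; the corresponding components of the proposed framework are detailed in Section III.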
This paper is organized as follows: Section II describes related work. Section III describes the proposed framework and its details. Section IV describes the experimental results with the humanoid robot James [1]. Section V gives the conclusion and outlines some future tasks.
This work was partially supported by the RobotCub project (IST-2004-004370) funded by the European Commission through the Unit E5 Cognitive Systems.
R. Saegusa is with the Robotics, Brain and Cognitive Sciences Dept., Italian Institute of Technology, Via Morego 30, 16163 Genoa, Italy. ryos@ieee.org, ryo.saegusa@iit.it
G. Metta and G. Sandini are with LIRA-Lab, University of Genoa, Viale Causa 13, 16145 Genova, Italy, and with the Robotics, Brain and Cognitive Sciences Dept., Italian Institute of Technology, Via Morego 30, 16163 Genoa, Italy. pasa@liralab.it, giulio.sandini@iit.it
Fig. 1. Visuomotor-coherence-based robot hand discovery. The robot generates arm movements and senses the visual and proprioceptional feedbacks. When the image of the moving region is coherent with the proprioceptional motor sensing, the image is recorded with a visuomotor coherence label. After short-term arm movements or interaction with people, the robot can visually recognize its body parts while distinguishing them from other objects. The vision system can then track and localize the detected hand online.
II. RELATED WORK
Sensorimotor coordination is well studied in robotics, and there are many excellent works relevant to the authors' interests: sensory prediction (Wolpert et al. [2], Kawato et al. [3]) and learning-based motor control (Atkeson et al. [4][5], Schaal et al. [6]). We have also been studying the constructive understanding of human cognition with humanoid robots, such as the mirror system (Metta et al. [7]) and the production-perception link (Fitzpatrick et al. [8]), involving neuroscientific aspects and developmental psychology. The literature on imitation learning (Schaal et al. [9], Calinon et al. [10][11]) is also related to this topic.
In studies on sensorimotor coordination, body detection is often hand-coded with predefined detection rules such as visual markers or knowledge of the body structure. The predefined rules give the system robustness in detection, but they also impose certain limits. Consider a manipulation task using five robot fingers. We would probably need to model how the robot hand is visually projected onto the view, or set up five color markers to distinguish each finger. Moreover, when we attempt tool manipulation, hand detection becomes even more difficult if the robot still depends on the predefined knowledge.