Eye-in-Hand/Eye-to-Hand Multi-Camera Visual Servoing
Vincenzo Lippiello, Bruno Siciliano, Luigi Villani
Abstract— A position-based visual servoing algorithm using
a hybrid eye-in-hand/eye-to-hand multi-camera configuration
is presented in this paper. Based on an extended Kalman filter,
this approach exploits the data provided by all the cameras
without “a priori” discrimination, allowing real-time object
pose estimation. A suitable algorithm is in charge of selecting
an optimal subset of image features on the basis of the desired
task and of the current configuration of the workspace. Only
this subset is considered for feature extraction, thus ensuring
a computational cost independent of the number of cameras.
Experimental results are reported to demonstrate the feasibility
and the effectiveness of the proposed technique.
I. INTRODUCTION
The adoption of visual feedback for closed-loop control
of robot manipulators is becoming a common practice both
in research and in industrial areas. This approach is known
as visual servoing. Moreover, the increase in the perfor-
mance/cost ratio of machine vision is opening new scenarios
where multi-camera systems are employed (see [1] and [2]).
The two most adopted camera configurations are known as
eye-in-hand, where one or more cameras are rigidly attached
to the robot end effector, and eye-to-hand, where the cameras
are fixed in the workspace [3]. The former guarantees good
accuracy and the ability to explore the workspace, although
with a limited field of view; the latter ensures a panoramic
view of the workspace, but lower accuracy. Hence, using
both configurations at the same time makes the execution
of complex tasks easier and offers greater flexibility in a
dynamic scenario.
Recently, some effort has been made to design visual
servoing systems based on hybrid eye-in-hand/eye-to-hand
camera configurations. In [4] an eye-to-hand camera is in
charge of the robot tool positioning while an eye-in-hand
camera is in charge of the robot tool orientation. A similar
approach is used in [5], where an eye-to-hand camera is
employed to estimate the robot tool pose with respect to the
workspace and an eye-in-hand camera is employed as data
source for object pose estimation. Further, in [6], a camera
mounted on the end effector of one robot has been adopted
as an eye-to-hand camera for another robot, so as to benefit
from the advantages of a mobile camera.
None of the above approaches fully exploits the potential
of hybrid camera configurations: the information provided
by the different types of cameras (fixed or mobile) is
employed for different goals, so that a complete integration
is not really achieved. Moreover, the possibility of adopting
a multi-camera visual system for both camera configurations
is not considered.
The authors are with PRISMA Lab, Dipartimento di Informatica e
Sistemistica, Università degli Studi di Napoli Federico II, Via Claudio 21,
80125 Napoli, Italy. {lippiell,siciliano,lvillani}@unina.it
In this work, a new approach based on the Extended
Kalman Filter (EKF) is proposed to achieve a complete
data fusion in a multi-camera eye-in-hand/eye-to-hand visual
system. This approach allows the data provided by all the
cameras to be used at the same time, without any kind of
“a priori” discrimination. A suitable image-feature selection
algorithm is in charge of dynamically selecting the data
required for the execution of a specific task depending
on the current configuration of the workspace. Only the
selected features are acquired and processed to obtain the
measurements; thus, the computational time spent on image
processing is independent of the number of cameras.
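The selection idea described above can be sketched as follows: from all candidate features across every camera, only a fixed-size, best-scoring subset is retained for extraction, so the image-processing cost does not grow with the number of cameras. The `Feature` class, the `quality` score, and the function names below are illustrative assumptions, not the paper's actual criterion.

```python
# Hypothetical sketch of dynamic feature selection: keep only the k
# best-scoring features, whatever the number of cameras, so that the
# per-cycle image-processing cost stays constant.
from dataclasses import dataclass

@dataclass
class Feature:
    camera_id: int      # which camera observes this feature
    point_id: int       # which object point it corresponds to
    quality: float      # assumed score (e.g. visibility/conditioning)

def select_features(candidates, k):
    """Return the k best-scoring features; only these are extracted."""
    ranked = sorted(candidates, key=lambda f: f.quality, reverse=True)
    return ranked[:k]

candidates = [Feature(0, 0, 0.9), Feature(1, 0, 0.4),
              Feature(0, 1, 0.7), Feature(1, 1, 0.8)]
best = select_features(candidates, 2)
print([(f.camera_id, f.point_id) for f in best])  # → [(0, 0), (1, 1)]
```

Note that the selected subset can mix eye-in-hand and eye-to-hand cameras freely, which is what allows the fusion to proceed without "a priori" discrimination between camera types.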
The Kalman filter computes the estimate of the pose of
an object in motion in the visible workspace, which is fed
back to a position-based visual servoing algorithm. Since
the frequency of the pose estimation algorithm is limited by
the camera frame rate (25–60 Hz), while a higher control
bandwidth (more than 100 Hz) is required to guarantee
stability and disturbance rejection for position control of a
robot manipulator, an “indirect” visual servoing algorithm
is implemented [3]. This scheme is based on an inner/outer
feedback loop where the inner position feedback loop runs
at a frequency higher than the outer visual feedback loop.
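The inner/outer loop structure can be illustrated with a minimal dual-rate sketch. The rates and function names below are assumptions chosen for illustration; the paper only requires the inner position loop to run faster than the camera-limited visual loop.

```python
# Sketch of the "indirect" visual servoing scheme: the outer visual
# loop refreshes the pose reference at camera rate, while the inner
# position loop tracks the latest reference at a higher rate.
VISION_HZ = 50       # outer loop: limited by the camera frame rate
CONTROL_HZ = 500     # inner loop: position-control rate (assumed)

def run(steps, estimate_pose, position_control):
    ratio = CONTROL_HZ // VISION_HZ   # inner iterations per vision sample
    reference = None
    for t in range(steps):
        if t % ratio == 0:            # new image available: update reference
            reference = estimate_pose()
        position_control(reference)   # inner loop runs every tick

calls = {"vision": 0, "control": 0}
run(100,
    estimate_pose=lambda: calls.__setitem__("vision", calls["vision"] + 1) or "pose",
    position_control=lambda ref: calls.__setitem__("control", calls["control"] + 1))
print(calls)  # → {'vision': 10, 'control': 100}
```

The inner loop thus always acts on the most recent pose estimate, decoupling the control bandwidth from the camera frame rate.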
This paper is organized as follows. In Section II the model
of the visual system and of the workspace is presented.
The formulation of the EKF is illustrated in Section III. In
Section IV the pose estimation algorithm is described, and
the position-based visual servoing control scheme is briefly
outlined. Experimental results for the case of two robots
performing a vision-guided master/slave trajectory following
task are presented in Section V.
II. MODELING
Consider a system of n_f video cameras fixed in the
workspace (eye-to-hand cameras) and n_m video cameras
mounted on the end effector of one or more robots (eye-in-
hand cameras), with n = n_f + n_m. The geometry of the sys-
tem with respect to a generic camera can be described using
the classical pinhole model (see Fig. 1). In the following, the
symbols F and M will denote the set of eye-to-hand cameras
and the set of eye-in-hand cameras, respectively; moreover,
the index ci will be used to denote the quantities referred to
the camera frame ci. For each camera, a frame O_ci–x_ci y_ci z_ci
attached to the camera ci is considered, with the z_ci-axis
aligned with the optical axis and the origin in the optical
center. The sensor plane is parallel to the x_ci y_ci-plane at a
distance −λ_ci^e along the z_ci-axis, where λ_ci^e is the
effective focal length of the camera.
Proceedings of the 44th IEEE Conference on Decision and Control, and the European Control Conference 2005, Seville, Spain, December 12-15, 2005