Visual servoing from robust direct color image registration
Geraldo Silveira and Ezio Malis
Abstract—To date, there exist only a few works on the use
of color images for visual servoing. Perhaps this is due to
the difficulties usually encountered in coping with illumination
changes in such images. This paper presents new parametric models
and optimization methods for robustly and directly registering
color images. Direct methods refer to those that exploit the
pixel intensities, without resorting to image features. We then
show how a robust and generic visual servoing scheme can
be constructed using the obtained optimal parameters. The
proposed models ensure robustness to arbitrary illumination
changes in color images, do not require prior knowledge of the
object, illuminants or camera (including their spectral
characteristics), and naturally encompass gray-level images.
Furthermore, the exploitation of all information within the images,
even from areas where no features exist, allows the algorithm to
achieve high levels of accuracy. Various results are reported to show
that visual servoing can indeed be highly accurate and robust
despite unknown objects and unknown imaging conditions.
I. INTRODUCTION
Visual tracking of an object of interest can be formulated
as an image registration problem. Image registration consists
of estimating the transformations that best align a reference
image to a second one. Registration methods can generally
be classified into feature-based methods or direct methods [1].
Feature-based methods require extracting and matching a set of
features (e.g., points, lines) from the two images. Since they
may afford relatively larger displacements of the object in
the field-of-view, feature-based methods are suitable when
the two images are taken under disparate viewpoints. In
turn, direct methods exploit the pixel intensities without
having to rely on image features. They can then be highly
accurate, mainly owing to the exploitation of all possible
image information, even from areas where no features exist.
On the other hand, direct methods assume that the two
images of the object have sufficient overlap [2]. Since
this paper considers real-time vision-based robot control [3],
we can suppose that the frame rate is sufficiently high such
that only relatively small inter-frame displacements of the
object are observed. Moreover, high accuracy is often needed
for robot positioning applications. Thus, in this article we
focus on direct methods for registering color images and
their integration in visual servoing schemes, e.g., [4]. Note,
however, that the parameters estimated by image registration
methods can in fact be used in a variety of visual servoing
techniques, e.g., [5].
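To fix ideas, the direct approach described above can be sketched as an iterative minimization of the photometric error between the two images. The following is an illustration only, restricted to a 2-dof translation with a nearest-pixel warp; the function name and setup are ours, not this paper's method, which handles richer geometric and photometric models:

```python
import numpy as np

def register_translation(I_ref, I_cur, iters=50):
    """Gauss-Newton minimization of the sum-of-squared photometric
    error, restricted to a 2-dof image translation for brevity."""
    p = np.zeros(2)  # translation estimate (rows, cols)
    for _ in range(iters):
        # Warp: sample I_cur at x + p (nearest pixel, circular shift)
        r = np.rint(p).astype(int)
        shifted = np.roll(I_cur, shift=(-r[0], -r[1]), axis=(0, 1))
        err = (shifted - I_ref).ravel()                 # photometric error
        gy, gx = np.gradient(shifted)                   # image gradients
        J = np.stack([gy.ravel(), gx.ravel()], axis=1)  # Jacobian wrt p
        dp, *_ = np.linalg.lstsq(J, -err, rcond=None)   # Gauss-Newton step
        p = p + dp
        if np.linalg.norm(dp) < 1e-3:
            break
    return p
```

On a smooth synthetic image pair related by a small pure translation, this recovers the displacement to sub-pixel accuracy in a few iterations, illustrating how every pixel (not just feature points) contributes to the estimate.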
Geraldo Silveira is with CTI Renato Archer – Division DRVC, Rod.
Dom Pedro I, km 143,6, Amarais, CEP 13069-901, Campinas/SP, Brazil,
Geraldo.Silveira@cti.gov.br
Ezio Malis is with INRIA Sophia-Antipolis – Project ARobAS, 2004
Route des Lucioles, BP 93, 06902 Sophia-Antipolis Cedex, France,
Ezio.Malis@sophia.inria.fr
Fig. 1. (a) Original color image and (b) after its conversion to gray-scale.
Almost all information has been lost in this example. This illustrates the
need to work with the color image directly. Please print in color so as to
see how rich the original image is!
To our knowledge, only a few techniques on the use of color
images in a visual servoing scheme have been proposed to
date. Perhaps this is due to the difficulties usually encountered
in adequately coping with illumination changes in color images.
Another possible reason is that one may think that the use of
color images does not contribute much to the final precision
of the servoing. This is not always true, and extreme cases
exist where all visual information is lost when gray-scale
cameras are used (see Fig. 1). Even if this is an unlikely
situation in practice, we can conjecture that in many cases
color cameras provide much richer information than their
gray-scale counterparts. Therefore, their application should
be studied in more depth.
Color cameras, like the human eye, are generally (but
not always) trichromatic. In this case, each pixel of a color
image is a three-vector, with one component per sensor channel.
An active research topic concerns color constancy, which
seeks illuminant-invariant color descriptors. A closely related
problem is to find illuminant-invariant relationships between
color vectors. Given two images of a Mondrian world¹ under
specific conditions,² it is claimed in [6] that a
multiplication of each tristimulus value (in an appropriate
basis) by a scale factor is sufficient to support color constancy
in practice. This framework has been exploited in color-based
point tracking, e.g., [7], and has also been applied
in [8] to the control of a pan-tilt unit (i.e., 2 dofs) by finding
the centroid of a red object. Another effective technique for
finding the centroid of an object in color images is
mean-shift [9]. However, these methods are not sufficient for our
purposes, since we are interested in accurately and robustly
controlling all 6 dofs of a robot end-effector.
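Under the per-channel scaling model of [6] just mentioned, two views of the same Mondrian surface under different illuminants are related channel-wise by constant gains. A minimal sketch of how such gains could be estimated by least squares (the function name is ours; this illustrates the model of [6], not this paper's contribution):

```python
import numpy as np

def diagonal_gains(img_a, img_b):
    """Estimate per-channel scale factors k such that
    img_b ~= img_a * k (diagonal illumination-change model).
    Closed-form least squares per channel: k_c = <a_c, b_c> / <a_c, a_c>."""
    a = img_a.reshape(-1, 3).astype(float)
    b = img_b.reshape(-1, 3).astype(float)
    return (a * b).sum(axis=0) / (a * a).sum(axis=0)
```

When the two images are exactly related by per-channel gains, this recovers them; the conditions under which the diagonal model holds are precisely those restrictive Mondrian-world assumptions that the present paper seeks to relax.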
In this paper, we propose new models and methods to
overcome the limitations of both the Mondrian world¹ and
¹ A Mondrian is a planar surface composed of only Lambertian patches,
and is named after Piet Mondrian (1872-1944), whose paintings are similar.
² For example, the light that strikes the surface has to be of uniform
intensity and spectrally unchanging, with no inter-reflections, etc.
The 2009 IEEE/RSJ International Conference on
Intelligent Robots and Systems
October 11-15, 2009 St. Louis, USA
978-1-4244-3804-4/09/$25.00 ©2009 IEEE