IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 20, NO. 4, APRIL 1998 401
A Pixel Dissimilarity Measure That Is
Insensitive to Image Sampling
Stan Birchfield and Carlo Tomasi
Abstract—Because of image sampling, traditional measures of pixel
dissimilarity can assign a large value to two corresponding pixels in a
stereo pair, even in the absence of noise and other degrading effects.
We propose a measure of dissimilarity that is provably insensitive to
sampling because it uses the linearly interpolated intensity functions
surrounding the pixels. Experiments on real images show that our
measure alleviates the problem of sampling with little additional
computational overhead.
Index Terms—Dissimilarity, stereo matching, correspondence.
1 INTRODUCTION
When a point in the world is imaged by a stereo pair of cameras,
the intensity values of the corresponding pixels are in general
different. Many factors contribute to this difference, such as the fact
that the light reflected off the point is not the same in the two
directions, the two cameras have different gains and biases, the
intensities of the pixels are quantized, and noise exists in the camera
and framegrabber electronics. Moreover, a pixel value is actually
not the image of a point but of a surface patch, and two pixels that
contain corresponding world points integrate light reflected off
two different surface patches due to foreshortening, depth
discontinuities, lens blur, and image sampling.
Although some researchers have proposed measures of pixel
dissimilarity that are insensitive to gain, bias, noise, and depth
discontinuities [6], [9], [10], [11], [13], there seems to be no work on
explicitly achieving insensitivity to image sampling. Yet this latter
phenomenon can significantly change the intensity value of a pixel
where the intensity function is changing rapidly and where the
disparity is not an integral number of pixels (see Fig. 1). Although
this may not be a problem if one is only interested in finding the
best match for a given pixel, it is a problem if a threshold is used to
determine matching failure or if the dissimilarities between the
pixels are added to other quantities.
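As a small numerical illustration of this effect, the sketch below samples a steep intensity ramp twice, once at integer positions and once shifted by half a pixel. The ramp `intensity` and the six-pixel scanlines are hypothetical values chosen only for illustration; they are not taken from the experiments.

```python
def intensity(x):
    """Hypothetical continuous intensity: a one-pixel-wide ramp from 0 to 100."""
    return 100.0 * min(max(x - 2.0, 0.0), 1.0)

# Left scanline: the ramp sampled at integer positions.
left = [intensity(k) for k in range(6)]
# Right scanline: the same noise-free signal, displaced by half a pixel.
right = [intensity(k + 0.5) for k in range(6)]

# Absolute intensity difference between pixels with the same index.
abs_diff = [abs(l - r) for l, r in zip(left, right)]
print(left)
print(right)
print(abs_diff)
```

Although the two scanlines image exactly the same signal without noise, the pixel that falls on the ramp differs by half of the full intensity range, which a fixed threshold could easily mistake for a matching failure.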
For example, there has recently emerged a class of stereo
algorithms [1], [2], [5], [7], [8] in which epipolar scanlines are matched by
minimizing a cost function that sums the absolute or squared
differences of pixel intensities with penalties for occlusions. With the
exception of [1] and [2], all of these algorithms work at pixel
resolution, and therefore a measure of pixel dissimilarity that is insensitive
to sampling would eliminate the errors that they experience due to
sampling effects [5]. Moreover, because these algorithms explicitly
search over all possible disparities using dynamic programming,
working at subpixel resolution is often infeasible because it results in
an unacceptable increase in the computational burden.
In this paper, we propose a measure of pixel dissimilarity that
compares two pixels using the linearly interpolated intensity
functions surrounding them. However, because it does not
explicitly reconstruct those functions, the computation required is only
slightly more than that of taking the absolute difference in
intensity. Our measure is provably insensitive to sampling and is shown
to improve the results of a stereo algorithm on real images.
The paper is organized as follows. We define the dissimilarity
measure and describe its computation in Section 2. In Section 3, we
present two theorems that guarantee that our measure will exhibit
the desired behavior under certain general conditions, and we
show that it behaves reasonably even when those conditions are
not met. The measure is incorporated into a stereo algorithm to
demonstrate the improved results in Section 4, followed by a
discussion in Section 5 comparing our dissimilarity measure with
working at subpixel resolution.
2 DEFINITION AND COMPUTATION OF DISSIMILARITY
Assume that we have a rectified stereo pair of cameras, so that the
scanlines are the epipolar lines. Along two corresponding scanlines,
let $i_L$ and $i_R$ be the one-dimensional continuous intensity
functions that result from convolving the amount of light incident
upon the two image sensors with a box function whose support is
equal to the width of one pixel. This convolution is due to the fact
that a real image sensor can be modeled as an integration of
intensity over each pixel followed by an ideal sampler; thus, to allow
us to concentrate on ideal sampling, we remove the integration at
the outset. The functions $i_L$ and $i_R$ are sampled at discrete points by
the ideal sampler of the image sensor, resulting in two discrete
one-dimensional arrays of intensity values, $I_L$ and $I_R$, as shown in
Fig. 2. Our goal is to compute the dissimilarity between a pixel at
position $x_L$ in the left scanline and a pixel at position $x_R$ in the right
scanline; the other pixels shown in the figure are adjacent to these
two. First, we define $\hat{I}_R$ as the linearly interpolated function
between the sample points of the right scanline, then we measure
how well the intensity at $x_L$ fits into the linearly interpolated
region surrounding $x_R$. That is, we define the following quantity:
$$\bar{d}(x_L, x_R, I_L, I_R) = \min_{x_R - \frac{1}{2} \le x \le x_R + \frac{1}{2}} \left| I_L(x_L) - \hat{I}_R(x) \right|.$$
Defining $\hat{I}_L$ similarly, we obtain a symmetric quantity:
$$\bar{d}(x_R, x_L, I_R, I_L) = \min_{x_L - \frac{1}{2} \le x \le x_L + \frac{1}{2}} \left| \hat{I}_L(x) - I_R(x_R) \right|.$$
The dissimilarity $d$ between the pixels is defined symmetrically as
the minimum of the two quantities:
$$d(x_L, x_R) = \min\left\{ \bar{d}(x_L, x_R, I_L, I_R),\; \bar{d}(x_R, x_L, I_R, I_L) \right\}. \qquad (1)$$
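To make the definition concrete, the following sketch evaluates (1) directly, by sampling the linearly interpolated function densely over the half-pixel interval around each pixel. It is an illustration under stated assumptions rather than a recommended implementation: the function names, the clamping at the scanline borders, and the `steps` parameter are introduced here and are not part of the definition.

```python
def interp(samples, x):
    """Linearly interpolate `samples` at real position x.

    Assumption: positions outside the scanline are clamped to its ends,
    and the scanline has at least two pixels.
    """
    x = min(max(x, 0.0), len(samples) - 1.0)
    k = min(int(x), len(samples) - 2)   # index of the left breakpoint
    t = x - k                           # fractional offset within [k, k + 1]
    return (1.0 - t) * samples[k] + t * samples[k + 1]


def half_dissimilarity_dense(i_a, x_a, i_b, x_b, steps=100):
    """One-sided quantity: min over x in [x_b - 1/2, x_b + 1/2] of
    |i_a[x_a] - interpolated i_b(x)|, approximated on a regular grid."""
    best = float("inf")
    for s in range(steps + 1):
        x = x_b - 0.5 + s / steps
        best = min(best, abs(i_a[x_a] - interp(i_b, x)))
    return best


def dissimilarity_dense(i_l, x_l, i_r, x_r, steps=100):
    """Symmetric dissimilarity of (1): the minimum of the two one-sided values."""
    return min(half_dissimilarity_dense(i_l, x_l, i_r, x_r, steps),
               half_dissimilarity_dense(i_r, x_r, i_l, x_l, steps))


# A linear ramp sampled at integer and at half-pixel-shifted positions.
ramp_left = [0, 10, 20, 30]
ramp_right = [5, 15, 25, 35]
print(abs(ramp_left[1] - ramp_right[1]))                  # plain absolute difference
print(dissimilarity_dense(ramp_left, 1, ramp_right, 1))   # measure of (1)
```

For this ramp the absolute difference between the two pixels is nonzero, whereas the interpolated right scanline passes through the left intensity within half a pixel, so the measure vanishes. The grid search is deliberately naive; it only mirrors the definition.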
Since the extreme points of a piecewise linear function must be
its breakpoints, the computation of $d$ is straightforward. First, we
compute
$$I_R^- \equiv \hat{I}_R\left(x_R - \tfrac{1}{2}\right) = \tfrac{1}{2}\left( I_R(x_R) + I_R(x_R - 1) \right),$$
the linearly interpolated intensity halfway between $x_R$ and its
neighboring pixel to the left, and the analogous quantity
$$I_R^+ \equiv \hat{I}_R\left(x_R + \tfrac{1}{2}\right) = \tfrac{1}{2}\left( I_R(x_R) + I_R(x_R + 1) \right).$$
Then, we let $I_{\min} = \min\{ I_R^-, I_R^+, I_R(x_R) \}$ and $I_{\max} = \max\{ I_R^-, I_R^+, I_R(x_R) \}$.
With these quantities defined,
$$\bar{d}(x_L, x_R, I_L, I_R) = \max\left\{ 0,\; I_L(x_L) - I_{\max},\; I_{\min} - I_L(x_L) \right\}.$$
This computation, along with its symmetric counterpart
$\bar{d}(x_R, x_L, I_R, I_L)$, takes only a small, constant amount of time more
than the absolute difference in intensity. In practice, we have found
the total computing time of our stereo algorithm to increase by less
than 10 percent.
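The constant-time computation just described can be sketched as follows. This is a minimal illustration, not the code used in the experiments: the function names are hypothetical, and replicating the edge pixel at the scanline borders is an assumption, since the text does not specify border handling.

```python
def half_dissimilarity(i_a, x_a, i_b, x_b):
    """One-sided quantity: how far i_a[x_a] lies outside the range spanned by
    the interpolated i_b on [x_b - 1/2, x_b + 1/2]."""
    center = i_b[x_b]
    # Assumption: replicate the edge pixel where a neighbor does not exist.
    left = i_b[x_b - 1] if x_b > 0 else center
    right = i_b[x_b + 1] if x_b < len(i_b) - 1 else center

    i_minus = 0.5 * (center + left)    # interpolated value at x_b - 1/2
    i_plus = 0.5 * (center + right)    # interpolated value at x_b + 1/2
    i_min = min(i_minus, i_plus, center)
    i_max = max(i_minus, i_plus, center)

    value = i_a[x_a]
    return max(0.0, value - i_max, i_min - value)


def dissimilarity(i_l, x_l, i_r, x_r):
    """Symmetric dissimilarity: the minimum of the two one-sided quantities."""
    return min(half_dissimilarity(i_l, x_l, i_r, x_r),
               half_dissimilarity(i_r, x_r, i_l, x_l))


# A linear ramp sampled at integer and at half-pixel-shifted positions.
ramp_left = [0, 10, 20, 30]
ramp_right = [5, 15, 25, 35]
print(dissimilarity(ramp_left, 1, ramp_right, 1))
```

Each call needs only two averages and a handful of comparisons beyond the single subtraction of the absolute difference, which is consistent with the small, constant overhead noted above.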
0162-8828/98/$10.00 © 1998 IEEE
• The authors are with the Computer Science Department, Stanford University,
Stanford, CA 94305. E-mail: {birchfield; tomasi}@cs.stanford.edu.
Manuscript received 16 May 1997; revised 6 Mar. 1998. Recommended for
acceptance by R. Szeliski.
For information on obtaining reprints of this article, please send e-mail to:
tpami@computer.org, and reference IEEECS Log Number 106566.