IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 20, NO. 4, APRIL 1998, P. 401

A Pixel Dissimilarity Measure That Is Insensitive to Image Sampling

Stan Birchfield and Carlo Tomasi

Abstract: Because of image sampling, traditional measures of pixel dissimilarity can assign a large value to two corresponding pixels in a stereo pair, even in the absence of noise and other degrading effects. We propose a measure of dissimilarity that is provably insensitive to sampling because it uses the linearly interpolated intensity functions surrounding the pixels. Experiments on real images show that our measure alleviates the problem of sampling with little additional computational overhead.

Index Terms: Dissimilarity, stereo matching, correspondence.

1 INTRODUCTION

When a point in the world is imaged by a stereo pair of cameras, the intensity values of the corresponding pixels are in general different. Many factors contribute to this difference, such as the fact that the light reflected off the point is not the same in the two directions, the two cameras have different gains and biases, the intensities of the pixels are quantized, and noise exists in the camera and framegrabber electronics. Moreover, a pixel value is actually not the image of a point but of a surface patch, and two pixels that contain corresponding world points integrate light reflected off two different surface patches due to foreshortening, depth discontinuities, lens blur, and image sampling.

Although some researchers have proposed measures of pixel dissimilarity that are insensitive to gain, bias, noise, and depth discontinuities [6], [9], [10], [11], [13], there seems to be no work on explicitly achieving insensitivity to image sampling. Yet this latter phenomenon can significantly change the intensity value of a pixel where the intensity function is changing rapidly and where the disparity is not an integral number of pixels (see Fig. 1).
Although this may not be a problem if one is only interested in finding the best match for a given pixel, it is a problem if a threshold is used to determine matching failure or if the dissimilarities between the pixels are added to other quantities. For example, there has recently emerged a class of stereo algorithms [1], [2], [5], [7], [8] in which epipolar scanlines are matched by minimizing a cost function that sums the absolute or squared differences of pixel intensities with penalties for occlusions. With the exception of [1] and [2], all of these algorithms work at pixel resolution, and therefore a measure of pixel dissimilarity that is insensitive to sampling would eliminate the errors that they experience due to sampling effects [5]. Moreover, because these algorithms explicitly search over all possible disparities using dynamic programming, working at subpixel resolution is often infeasible because it results in an unacceptable increase in the computational burden.

In this paper, we propose a measure of pixel dissimilarity that compares two pixels using the linearly interpolated intensity functions surrounding them. However, because it does not explicitly reconstruct those functions, the computation required is only slightly more than that of taking the absolute difference in intensity. Our measure is provably insensitive to sampling and is shown to improve the results of a stereo algorithm on real images.

The paper is organized as follows. We define the dissimilarity measure and describe its computation in Section 2. In Section 3, we present two theorems that guarantee that our measure will exhibit the desired behavior under certain general conditions, and we show that it behaves reasonably even when those conditions are not met. The measure is incorporated into a stereo algorithm to demonstrate the improved results in Section 4, followed by a discussion in Section 5 comparing our dissimilarity measure with working at subpixel resolution.
2 DEFINITION AND COMPUTATION OF DISSIMILARITY

Assume that we have a rectified stereo pair of cameras, so that the scanlines are the epipolar lines. Along two corresponding scanlines, let $i_L$ and $i_R$ be the one-dimensional continuous intensity functions that result from convolving the amount of light incident upon the two image sensors with a box function whose support is equal to the width of one pixel. This convolution is due to the fact that a real image sensor can be modeled as an integration of intensity over each pixel followed by an ideal sampler; thus, to allow us to concentrate on ideal sampling, we remove the integration at the outset. The functions $i_L$ and $i_R$ are sampled at discrete points by the ideal sampler of the image sensor, resulting in two discrete one-dimensional arrays of intensity values, $I_L$ and $I_R$, as shown in Fig. 2. Our goal is to compute the dissimilarity between a pixel at position $x_L$ in the left scanline and a pixel at position $x_R$ in the right scanline; the other pixels shown in the figure are adjacent to these two.

First, we define $\hat{I}_R$ as the linearly interpolated function between the sample points of the right scanline, then we measure how well the intensity at $x_L$ fits into the linearly interpolated region surrounding $x_R$. That is, we define the following quantity:

$$\bar{d}(x_L, x_R, I_L, I_R) = \min_{x_R - \frac{1}{2} \le x \le x_R + \frac{1}{2}} \left| I_L(x_L) - \hat{I}_R(x) \right|.$$

Defining $\hat{I}_L$ similarly, we obtain a symmetric quantity:

$$\bar{d}(x_R, x_L, I_R, I_L) = \min_{x_L - \frac{1}{2} \le x \le x_L + \frac{1}{2}} \left| \hat{I}_L(x) - I_R(x_R) \right|.$$

The dissimilarity $d$ between the pixels is defined symmetrically as the minimum of the two quantities:

$$d(x_L, x_R) = \min\left\{ \bar{d}(x_L, x_R, I_L, I_R),\ \bar{d}(x_R, x_L, I_R, I_L) \right\}. \qquad (1)$$

Since the extreme points of a piecewise linear function must be its breakpoints, the computation of $d$ is straightforward.
First, we compute

$$I_R^- \equiv \hat{I}_R\!\left(x_R - \tfrac{1}{2}\right) = \tfrac{1}{2}\left( I_R(x_R) + I_R(x_R - 1) \right),$$

the linearly interpolated intensity halfway between $x_R$ and its neighboring pixel to the left, and the analogous quantity

$$I_R^+ \equiv \hat{I}_R\!\left(x_R + \tfrac{1}{2}\right) = \tfrac{1}{2}\left( I_R(x_R) + I_R(x_R + 1) \right).$$

Then, we let

$$I_{\min} = \min\left\{ I_R^-,\ I_R^+,\ I_R(x_R) \right\} \quad \text{and} \quad I_{\max} = \max\left\{ I_R^-,\ I_R^+,\ I_R(x_R) \right\}.$$

With these quantities defined,

$$\bar{d}(x_L, x_R, I_L, I_R) = \max\left\{ 0,\ I_L(x_L) - I_{\max},\ I_{\min} - I_L(x_L) \right\}.$$

This computation, along with its symmetric counterpart $\bar{d}(x_R, x_L, I_R, I_L)$, takes only a small, constant amount of time more than the absolute difference in intensity. In practice, we have found the total computing time of our stereo algorithm to increase by less than 10 percent.

0162-8828/98/$10.00 © 1998 IEEE

The authors are with the Computer Science Department, Stanford University, Stanford, CA 94305. E-mail: {birchfield; tomasi}@cs.stanford.edu.

Manuscript received 16 May 1997; revised 6 Mar. 1998. Recommended for acceptance by R. Szeliski.

For information on obtaining reprints of this article, please send e-mail to: tpami@computer.org, and reference IEEECS Log Number 106566.
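The closed-form computation of Section 2 can be illustrated with a minimal Python sketch. The function names `half_sample_dissimilarity` and `dissimilarity`, the representation of scanlines as Python lists, and the clamping of neighbors at the scanline ends are assumptions made for illustration; the text does not specify border handling or any particular implementation.

```python
def half_sample_dissimilarity(i_ref, x_ref, i_other, x_other):
    """d-bar: distance from i_ref[x_ref] to the range of the linearly
    interpolated i_other over [x_other - 1/2, x_other + 1/2]."""
    center = i_other[x_other]
    # Assumption: neighbors are clamped at the ends of the scanline.
    left = i_other[max(x_other - 1, 0)]
    right = i_other[min(x_other + 1, len(i_other) - 1)]
    i_minus = 0.5 * (center + left)    # interpolated value at x_other - 1/2
    i_plus = 0.5 * (center + right)    # interpolated value at x_other + 1/2
    i_min = min(i_minus, i_plus, center)
    i_max = max(i_minus, i_plus, center)
    value = i_ref[x_ref]
    return max(0.0, value - i_max, i_min - value)


def dissimilarity(i_left, x_left, i_right, x_right):
    """Symmetric dissimilarity d of Eq. (1)."""
    return min(
        half_sample_dissimilarity(i_left, x_left, i_right, x_right),
        half_sample_dissimilarity(i_right, x_right, i_left, x_left),
    )


# Hypothetical scanlines: the same intensity ramp sampled with a
# half-pixel offset between the left and right images.
left = [10, 10, 50, 90, 90]
right = [10, 30, 70, 90, 90]

print(abs(left[2] - right[2]))            # 20: absolute difference is large
print(dissimilarity(left, 2, right, 2))   # 0.0: the measure absorbs the shift
print(dissimilarity(left, 0, right, 3))   # 70.0: a true mismatch stays large
```

In this toy example the two scanlines sample one ramp at positions offset by half a pixel, so the absolute difference between corresponding pixels is 20 although no noise is present, while $d$ is zero. Because $I_R(x_R)$ always lies between $I_{\min}$ and $I_{\max}$, the value returned never exceeds the absolute difference in intensity.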