An Intensity-augmented Ordinal Measure for Visual Correspondence

Anurag Mittal*
Real Time Vision and Modeling
Siemens Corporate Research
Princeton, NJ 08540

Visvanathan Ramesh
Real Time Vision and Modeling
Siemens Corporate Research
Princeton, NJ 08540
visvanathan.ramesh@siemens.com

Abstract

Determining the correspondence of image patches is one of the most important problems in Computer Vision. When the intensity space varies due to factors such as camera gain or gamma correction, one needs methods that are robust to such transformations. While the most common assumption is that of a linear transformation, a more general assumption is that the change is monotonic. Methods have therefore been developed that work on the rankings between different pixels rather than on the intensities themselves. In this paper, we develop a new matching method that improves upon existing methods by using a combination of intensity and rank information. The method considers the difference in the intensities of the changed pixels in order to achieve greater robustness to Gaussian noise. Furthermore, only uncorrelated order changes are considered, which makes the method robust to changes in a single pixel or a few pixels. These properties make the algorithm quite robust to different types of noise and to other artifacts such as camera shake or image compression. Experiments illustrate the potential of the approach in several different applications, such as change detection and feature matching.

1. Introduction

Determining the correspondence of image patches is one of the most important problems in Computer Vision, with applications to stereo matching, change detection, optical flow, image registration, etc. In many of these applications, such matching has to be performed under many possible intensity changes occurring due to changes in camera gain and offset, gamma correction, illumination, etc.
* The author is currently with the Department of Computer Science and Engg., Indian Institute of Technology Madras, Chennai, INDIA - 600036. He may be reached at amittal@cse.iitm.ernet.in.

In order to achieve invariance to such factors, most methods assume a linear transformation model, such that the same change in the illumination at different nearby pixels or color components creates the same proportion of change in the intensity observed at these pixels. Normalized cross-correlation is one such commonly considered distance measure between two image patches that is invariant to a linear change in intensity. More complex methods utilize a variety of different filters and techniques in order to achieve invariance to a linear intensity change [15, 12, 2, 5, 7, 8, 16, 18, 19, 17]. The normalization can also be performed in the spectral space by methods such as normalized color, including variations that utilize a robust matching technique [10].

However, it is well known [1, 21, 6] that many image transformations are non-linear in nature: gamma correction causes non-linearity, the camera response function is not linear near saturation and in low light, small specular reflections or dust/rain/snow speckles can change some pixels, and different parts of an object may be illuminated differently. One method that has been considered to handle such changes in the visual space is mutual information [20], which can handle a complete change in the image intensities. While such an approach can be used [14], the most restrictive assumption that can still handle the actual image transformation should be used for best performance. An assumption that is more appropriate in many circumstances is that the changes are monotonic. Many methods have utilized this assumption in order to achieve techniques that are more robust under these more general changes.
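As a concrete illustration of the linear-invariance property discussed above, the following minimal sketch (our own, not from the paper) computes normalized cross-correlation between two flattened patches. The score is exactly 1 for a gain/offset change of the same patch, but drops below 1 under a non-linear (gamma-like) monotonic change:

```python
def ncc(patch_a, patch_b):
    """Normalized cross-correlation between two equally sized patches,
    given as flat lists of intensities. Mean subtraction cancels an
    additive offset; dividing by the norms cancels a multiplicative gain,
    so the score is invariant to any linear change a*I + b (a > 0)."""
    n = len(patch_a)
    mean_a = sum(patch_a) / n
    mean_b = sum(patch_b) / n
    da = [x - mean_a for x in patch_a]
    db = [x - mean_b for x in patch_b]
    num = sum(x * y for x, y in zip(da, db))
    den = (sum(x * x for x in da) * sum(y * y for y in db)) ** 0.5
    return num / den if den else 0.0

patch = [10, 20, 30, 40]
linear = [2 * x + 5 for x in patch]      # gain/offset change: NCC stays 1
gamma = [x ** 1.5 for x in patch]        # monotonic but non-linear change
```

Running `ncc(patch, linear)` returns exactly 1.0, while `ncc(patch, gamma)` falls slightly below 1 even though the rank order of the pixels is unchanged, which is precisely the gap that ordinal measures are designed to close.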
Most of these methods transform the feature space such that only the "order" of a particular pixel in relation to its neighbors is considered. The census transform [22] looks at all the neighbors of a given pixel and creates a vector from the order of this pixel with respect to the neighbors. Image matching can then be performed by correlation in this transformed space. Bhat and Nayar [4] improved upon this measure with a carefully designed distance between two rank permutations. While a single pixel error can cause disproportionate error in the census algorithm, this method counts such changes

0-7695-2646-2/06 $20.00 (c) 2006 IEEE
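The census transform described above can be sketched as follows (a minimal illustration of our own, assuming a 3x3 neighborhood and a "neighbor darker than center" convention; function names are hypothetical). Because a monotonic intensity change preserves the rank order of pixels, the census bit vector, and therefore the Hamming distance used to compare transformed patches, is unchanged:

```python
def census(img, r, c):
    """3x3 census transform at pixel (r, c): a bit vector recording,
    for each of the 8 neighbors, whether it is darker than the center.
    Only rank order matters, so any monotonic intensity change
    leaves the result unchanged."""
    center = img[r][c]
    bits = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the center pixel itself
            bits.append(1 if img[r + dr][c + dc] < center else 0)
    return bits

def hamming(a, b):
    """Census vectors are compared by Hamming distance (count of differing bits)."""
    return sum(x != y for x, y in zip(a, b))

img = [[10, 20, 30],
       [40, 50, 60],
       [70, 80, 90]]
# A gamma-like monotonic change: intensities differ, ranks do not,
# so the census vector at the center pixel is identical.
img2 = [[(v / 90) ** 0.5 * 255 for v in row] for row in img]
```

Here `hamming(census(img, 1, 1), census(img2, 1, 1))` is 0, despite the large intensity change, while a single flipped neighbor would change exactly one bit, illustrating both the strength of the transform and the single-pixel sensitivity that Bhat and Nayar's distance addresses.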