STEREO REGISTRATION USING KERNEL DENSITY CORRELATION

Maneesh Singh, Ashish Jagmohan and Narendra Ahuja
Department of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign
Urbana, Illinois, USA
email: {msingh,jagmohan,ahuja}@vision.ai.uiuc.edu

ABSTRACT
A common approach to solving the stereo registration problem is to model the disparity function as a discrete-valued Markov Random Field. The key problems with this approach are its combinatorial computational complexity and the discretization of the obtained disparity estimates. In this paper, we propose a framework that addresses the requirements of a robust continuous-domain formulation for stereo registration. The proposed formulation is based on a new measure, derived from the correlation of empirical probability density distributions estimated using kernel estimators. We term this the kernel density correlation (KDC) measure. The proposed framework takes the form of an energy minimization formulation which is efficiently solved using the technique of variational optimization. We prove the convergence properties of the resultant iterative algorithm, and compare the performance of the proposed formulation to that of a state-of-the-art stereo registration approach.

KEY WORDS
stereo, MRF, Parzen windows, density estimation, variational optimization, dense disparity field.

1 Introduction

Let $I = \{I_l, I_r\}$ denote a pair of stereoscopic images, where $I_l \in S^{n_l}$ and $I_r \in S^{n_r}$ are the left and right images, of sizes $n_l$ and $n_r$ respectively, taking values from the set $S$. The stereo registration problem can be defined as the identification of a map $M : N_l \times N_r \to \{0, 1\}$, where $N_l$ and $N_r$ are the sets representing the lattice indices of the two images. For each ordered pair of pixels $(i_l, j_r)$, $M(i_l, j_r) = 1$ implies a match between the two pixels. The sets $O_l \doteq \{i_l : M(i_l, j_r) = 0 \ \forall j_r \in N_r\}$ and $O_r \doteq \{j_r : M(i_l, j_r) = 0 \ \forall i_l \in N_l\}$ represent the collections of occluded pixels in each image. A desirable criterion, arising from the notion that the real-world objects being imaged are opaque, can be represented as the conditions $\sum_{j_r \in N_r} M(i_l, j_r) \le 1$ and $\sum_{i_l \in N_l} M(i_l, j_r) \le 1$ for all pixels.

Seeking the map $M$ over a large space, $N_l \times N_r$, under the aforementioned constraints is a complex optimization problem. To make the problem of stereo registration more manageable, a common approach is to estimate the disparity function, $D : N_l \to N_r$, as an intermediate solution to the stereo problem (instead of estimating the map $M$). Stereo matching algorithms which use this approach include [1–5]. The disparity function, $D$, is usually modelled using a pairwise Markov Random Field (MRF) to enforce spatial smoothness of the disparity map [3, 5]. While performing exact inference on MRFs is computationally intractable, approximate inference algorithms based on graph cuts [4, 5] and belief propagation [3] have been shown to yield good performance for the two-view stereo matching problem [6].

The above approaches reduce the stereo matching problem to finding a disparity function with discrete domain and range. There are several problems with this model. First, solving the above problem has combinatorial complexity, and the problem often cannot be solved exactly. Second, the disparities produced are necessarily discrete and induce quantization effects in the reconstructed scene geometry. Third, the problem thus defined is not well-posed. Specifically, the presence of occlusions implies that not every pixel in the left image can be associated with a pixel in the right image (i.e., $O_l \neq \emptyset$).

In the present paper, we propose to alleviate the first two problems by seeking a continuous-valued disparity function.
The continuous-domain formulation allows the use of efficient continuous-domain optimization techniques. Further, it does not induce the aforementioned quantization artifacts. To alleviate the third problem, we propose a robust energy formulation that is relatively impervious to the restriction of the disparity function to the set of occluded pixels in the left image, i.e., $D|_{O_l}$.

In Section 2, we propose a registration formulation that is robust to occluded pixels in the reference image. This is achieved by correlating the reference image and the target image in the pdf-domain. In Section 3, the stereo registration problem is formulated as an energy minimization framework with an MRF smoothness constraint on $D$. Then, in Section 4, we present an optimization framework that uses variational bounds on the energy functional. We also prove convergence of the proposed algorithm under mild conditions that are easily met. In Section 5, we present results on standard test data sets and compare them with the state of the art in the literature. Conclusions are presented in Section 6.
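
To make the definitions at the start of this section concrete, the following is a minimal sketch of the match map $M$, the occlusion sets $O_l$ and $O_r$, and the two uniqueness conditions for a single one-dimensional scanline. It is an illustration only and is not part of the proposed method. The function names, the list-of-lists representation of $M$, and the convention that a right index of -1 marks an unmatched left pixel are all assumptions introduced here.

```python
def match_map(right_index, n_r):
    """Build a binary match map M with M[i_l][j_r] = 1 for a match.

    right_index[i_l] is the right-image pixel matched to left pixel i_l.
    Assumed convention: any index outside [0, n_r) means "no match".
    """
    M = [[0] * n_r for _ in right_index]
    for i_l, j_r in enumerate(right_index):
        if 0 <= j_r < n_r:
            M[i_l][j_r] = 1
    return M


def occluded_sets(M):
    """Return (O_l, O_r): left and right pixels that have no match at all."""
    n_r = len(M[0])
    O_l = [i for i, row in enumerate(M) if sum(row) == 0]
    O_r = [j for j in range(n_r) if sum(row[j] for row in M) == 0]
    return O_l, O_r


def satisfies_uniqueness(M):
    """Check that every row sum and every column sum of M is at most 1."""
    n_r = len(M[0])
    rows_ok = all(sum(row) <= 1 for row in M)
    cols_ok = all(sum(row[j] for row in M) <= 1 for j in range(n_r))
    return rows_ok and cols_ok


# Toy scanline: left pixel 2 has no counterpart in the right image.
M = match_map([0, 1, -1, 2, 3], n_r=5)
O_l, O_r = occluded_sets(M)
```

In this toy example the left pixel with index 2 and the right pixel with index 4 are occluded, so $O_l \neq \emptyset$ even though both uniqueness conditions hold. A map in which two left pixels claim the same right pixel, such as `match_map([0, 0], n_r=2)`, violates the column condition.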