A Global Solution to Sparse Correspondence Problems

João Maciel and João P. Costeira

Abstract: We propose a new methodology for reliably solving the correspondence problem between sparse sets of points of two or more images. This is a key step in most problems of computer vision and, so far, no general method exists to solve it. Our methodology is able to handle most of the commonly used assumptions in a unique formulation, independent of the domain of application and type of features. It performs correspondence and outlier rejection in a single step and achieves global optimality with feasible computation. Feature selection and correspondence are first formulated as an integer optimization problem. This is a blunt formulation, which considers the whole combinatorial space of possible point selections and correspondences. To find its global optimal solution, we build a concave objective function and relax the search domain into its convex-hull. The special structure of this extended problem assures its equivalence to the original one, but it can be optimally solved by efficient algorithms that avoid combinatorial search. This methodology can use any criterion provided it can be translated into cost functions with continuous second derivatives.

Index Terms: Correspondence problem, linear and concave programming, sparse stereo.

1 INTRODUCTION

Estimating feature correspondences between two or more images is a long standing fundamental problem in computer vision. Most methods for 3D reconstruction, object recognition, and camera self-calibration start by assuming that image feature-points were extracted and put to correspondence. This is a key problem and, so far, no general reliable method exists to solve it.

There are three main difficulties associated with this problem. First, there are no general constraints to reduce its ambiguity. Second, it suffers from high complexity due to the huge dimensionality of the combinatorial search space.
Finally, the existence of outliers must be considered, since features can be missing or added through a sequence of images due to occlusions and errors of the feature extraction procedure.

1.1 Overview of Correspondence Methods

Correspondence can be interpreted as an optimization problem. Each method translates its assumptions into an objective function (the criterion) and a set of constraints.

Constraints are conditions that must be strictly met. Examples are order [19], [22], the epipolar constraint [19], [22] (rigidity expressed as a constraint), uniqueness [7], visibility [25], and proximity. Tracking-like algorithms [13] impose strict proximity constraints, so they should be considered continuous-time methods.

The objective function reflects a condition that can be relaxed, but whose value should be optimized. The most commonly used objective function is image correlation [28], [13], [16] (the image similarity assumption). Other usual choices are point proximity [13], [30] or smoothness of disparity fields [19].

Finally, correspondence algorithms also differ in the computational framework used to solve the optimization problem. Dynamic programming [19], graph search [22], bipartite graph matching [6], and convex minimization [13] guarantee optimality. Nonoptimal approaches include greedy algorithms [29], simulated annealing [26], relaxation [7], alternating optimization and constraint projection [1], and randomized search [28].

Vision systems often have to deal with the existence of spurious features and occlusions. Algorithms that explicitly handle these situations are more likely to behave robustly. The work in [28] presents a pruning mechanism that performs outlier rejection in sets of previously matched features.

2 CORRESPONDENCE AS AN OPTIMIZATION PROBLEM

We formulate the correspondence problem as an integer optimization problem in a generic sense. In other words, the formulation can handle most of the commonly used assumptions using one single formalism.
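As a point of reference for the integer formulation, the following minimal sketch enumerates every one-to-one assignment between two small point sets and keeps the cheapest one. It is not the method proposed in this paper: the squared-distance (proximity) cost, the equal set sizes, the absence of outlier handling, and the names `match_points`, `first`, and `second` are all assumptions made for illustration. It only shows the kind of combinatorial search space that an exhaustive approach would have to cover.

```python
from itertools import permutations


def match_points(points_a, points_b):
    """Exhaustively search all one-to-one assignments from points_a to points_b.

    Returns (best_assignment, best_cost), where best_assignment[i] is the
    index in points_b matched to points_a[i].
    Assumption: both sets have the same size and contain no outliers.
    """
    if len(points_a) != len(points_b):
        raise ValueError("this sketch assumes equally sized point sets")

    def cost(i, j):
        # Assumed criterion: squared Euclidean distance (point proximity).
        (xa, ya), (xb, yb) = points_a[i], points_b[j]
        return (xa - xb) ** 2 + (ya - yb) ** 2

    best_assignment, best_cost = None, float("inf")
    # Each permutation is one candidate integer solution (a permutation matrix).
    for perm in permutations(range(len(points_b))):
        total = sum(cost(i, j) for i, j in enumerate(perm))
        if total < best_cost:
            best_assignment, best_cost = perm, total
    return best_assignment, best_cost


# Hypothetical feature points in two images (illustrative values only).
first = [(0.0, 0.0), (5.0, 1.0), (2.0, 4.0)]
second = [(2.1, 4.2), (0.1, -0.1), (5.2, 0.9)]
assignment, total_cost = match_points(first, second)
print(assignment, total_cost)
```

For n points per image the loop visits n! candidate assignments, so this kind of exhaustive search is practical only for very small n.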
Feature selection and correspondence are posed as one single optimization problem, so both tasks are performed in an integrated way. Furthermore, its global solution can be found without combinatorial search and without imposing additional assumptions. We do so by relaxing the discrete search domain into its convex-hull. The special structure of the constraints and objective function ensures that the relaxation is exact, so the result is an equivalent problem that can be optimally solved by efficient algorithms. For the sake of simplicity, we start with the two-image case; the extension to sequences is discussed in Section 2.10.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, VOL. 25, NO. 2, FEBRUARY 2003

The authors are with the Instituto de Sistemas e Robótica, Instituto Superior Técnico, Av. Rovisco Pais, 1049-001 Lisboa, Portugal. E-mail: {maciel, jpc}@isr.ist.utl.pt.

Manuscript received 29 Nov. 2000; revised 30 Nov. 2001; accepted 10 May 2002. Recommended for acceptance by D. Fleet. For information on obtaining reprints of this article, please send e-mail to: tpami@computer.org, and reference IEEECS Log Number 113212.

1. Data was provided by the Modeling by Video group in the Robotics Institute at CMU.

0162-8828/03/$17.00 © 2003 IEEE

2.1 Problem Formulation

Consider the images of a static scene shown in Fig. 1.¹ Segment $p_1$ represents feature-points on the first image and $p_2$ on the second (the white dots). Some of these are projections of the same 3D points. Arrange their representations in two matrices X and Y as