Efficient Data Association in Images using Active Matching

Margarita Chli and Andrew J. Davison
{mchli, ajd}@doc.ic.ac.uk
Department of Computing, Imperial College London, London SW7 2AZ, UK

Abstract— In the feature matching tasks which form an integral part of visual tracking or SLAM, there are invariably priors available on the absolute and/or relative image locations of features of interest. Usually, these priors are used post-hoc in the process of resolving feature matches and obtaining final scene estimates, via ‘first get candidate matches, then resolve’ consensus algorithms such as RANSAC or JCBB. In this paper we show that the dramatically different approach of using priors dynamically to guide a feature-by-feature matching search can achieve global matching with far fewer image processing operations and lower overall computational cost. Essentially, we put image processing into the loop of the search for global consensus. In particular, our approach is able to cope with significant image ambiguity thanks to a dynamic mixture of Gaussians treatment. In our fully Bayesian algorithm, the choice of the most efficient search action at each step is guided intuitively and rigorously by expected Shannon information gain. We demonstrate the algorithm in feature matching as part of a sequential SLAM system for 3D camera tracking. Robust, real-time matching can be achieved even in the previously unmanageable case of jerky, rapid motion necessitating weak motion modelling.

I. INTRODUCTION

It is well known that the key to obtaining correct feature associations in potentially ambiguous matching (data association) tasks is to search for a set of correspondences which are in consensus: they are all consistent with a believable global hypothesis.
The usual approach taken to search for matching consensus is as follows: first, candidate matches are generated, for instance by detecting all features in both images and pairing features which are nearby in image space and have similar appearance. Then, incorrect ‘outlier’ matches are pruned by proposing and testing hypotheses of global parameters which describe the world state of interest — the 3D position of an object or the camera itself, for instance. The sampling and voting algorithm RANSAC [6] has been widely used to achieve this in geometrical vision problems. The idea that inevitable outlier matches must be ‘rejected’ from a large number of candidates produced by some blanket initial image processing is deeply entrenched in computer vision and robotics.

Our active matching paradigm takes a very different approach — to cut outliers out at source wherever possible by searching only the parts of the image where true positive matches are most probable. Instead of searching for all features and then resolving, feature searches occur one by one. The results of each search, via an exhaustive but concentrated template checking scan within a region, affect the regions within which it is likely that each of the other features will lie.

Fig. 1. Active matching dramatically reduces image processing operations while still achieving global matching consensus. Sequence results: superposition of the green individual gating ellipses searched in order to generate candidates for outlier rejection by JCBB and the yellow ellipses searched for our Active Matching [1] method. (a) Fast camera motion at 15Hz; (b) slow camera motion at 15Hz. In these frames, joint compatibility needed to search a factor of 8.4 more image area than active matching in (a) and a factor of 4.8 in (b). JCBB must resolve all the matches shown (blobs), whereas Active Matching only finds the yellow blobs.
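One step of this feature-by-feature search can be sketched in the single-Gaussian case as follows (the full method extends this to a mixture of Gaussians to handle ambiguity). The sketch below is illustrative only, not the authors' implementation: the function names, the toy three-feature prior, and the simulated measurement are all assumptions. A joint Gaussian over feature image locations is conditioned on one measured feature, which shrinks every other feature's search ellipse; the feature to search is chosen by its expected Shannon information gain.

```python
# Illustrative sketch of one active-matching step under a single joint
# Gaussian over the image locations of three features (hypothetical setup;
# not the authors' code). Measuring one feature conditions the joint and
# shrinks the search ellipses of the unmeasured features.
import numpy as np

def condition(mu, Sigma, idx, z, R):
    """Condition the joint Gaussian on observing feature `idx` at image
    position z, with measurement noise covariance R (Kalman-style update)."""
    a = slice(2 * idx, 2 * idx + 2)
    S_aa = Sigma[a, a] + R                     # innovation covariance
    K = Sigma[:, a] @ np.linalg.inv(S_aa)      # gain
    mu_new = mu + K @ (z - mu[a])
    Sigma_new = Sigma - K @ Sigma[a, :]
    return mu_new, Sigma_new

def expected_info_gain_bits(Sigma, idx, R):
    """Expected Shannon information gain (bits) over the whole joint state
    from measuring feature `idx`: 0.5 * log2(det(S_aa) / det(R))."""
    a = slice(2 * idx, 2 * idx + 2)
    S_aa = Sigma[a, a] + R
    return 0.5 * np.log2(np.linalg.det(S_aa) / np.linalg.det(R))

# Toy correlated prior over three features (coupling from shared camera motion).
mu = np.array([10.0, 10.0, 40.0, 12.0, 70.0, 9.0])
base = 25.0 * np.eye(6)                                # per-feature uncertainty
shared = 15.0 * np.kron(np.ones((3, 3)), np.eye(2))    # common-motion coupling
Sigma = base + shared
R = 1.0 * np.eye(2)                                    # template-match noise

# Pick the feature whose search promises the most information...
gains = [expected_info_gain_bits(Sigma, i, R) for i in range(3)]
best = int(np.argmax(gains))
# ...scan only its ellipse, then update every other feature's region.
mu, Sigma = condition(mu, Sigma, best, z=np.array([11.0, 9.0]), R=R)
```

After the update, the marginal covariances of the two unmeasured features are smaller than before, so their subsequent template scans cover less image area — the mechanism by which low-probability regions are never examined.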
This narrowing is thanks to the same inter-feature correlations of which standard consensus algorithms take advantage — but our algorithm’s dynamic updating of these regions within the matching search itself means that low probability parts of the image are never examined at all (see Figure 1), and the number of image processing operations required to achieve global matching is reduced by a large factor. Information theory intelligently guides the step-by-step search process from one search region to the next and can even indicate when matching should be terminated at a point of diminishing returns.

Davison [4] presented a theoretical analysis of information gain in sequential image search. However, Davison’s work had the serious limitation of representing the current estimate of the state of the search at all times with a single multi-variate Gaussian distribution. This meant that while theoretically and intuitively satisfying active search procedures were demonstrated in simulated problems, the technique was not applicable to real image search because it could not represent the discrete multiple hypotheses which arise due to matching ambiguity — only simulation results were given. Here we use a dynamic mixture of Gaussians (MOG) representation which grows as necessary to represent the discrete multiple hypotheses arising during active search. We show that this representation can now be applied to achieve highly efficient image search in real, ambiguous tracking problems. Matching constraints are obtained by projecting an uncertain