Stereo Tracking using ICP and Normal Flow Constraint

Louis-Philippe Morency & Trevor Darrell
Artificial Intelligence Laboratory
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139
http://www.ai.mit.edu

Motivation: The problem of estimating 3D rigid body motion has been studied extensively in the computer vision and graphics fields. The well-known Iterative Closest Point (ICP) algorithm, introduced by Chen and Medioni [4] and Besl and McKay [2], has been used extensively in the graphics literature to merge 3D laser range scans. In the vision literature, much progress has been made on gradient-based parametric motion estimation techniques which aggregate pointwise normal flow constraints [3, 8]. To date, most ICP algorithms have been tested on very precise 3D data sets from laser scanners [9] or other range scanning methods. We are interested in tracking with relatively noisy optical stereo range data at modest frame rates.

Previous work: ICP finds corresponding points between two 3D point clouds and tries to minimize the error (usually the Euclidean distance) between the matched points. Chen and Medioni minimize this error based on a point-to-plane distance, while Besl and McKay minimize the direct Euclidean distance between the matched points (point-to-point). Rusinkiewicz and Levoy [10] present an extensive survey of many variants of ICP. Godin et al. [6] first used color to filter matched points during ICP. While other methods [5, 11] have incorporated color information in the distance function of the matching process, no solution has been suggested that uses color/brightness during the error minimization process. The normal flow is a 3D vector field which can be defined as the component of the 2D optical flow that is in the direction of the image gradient [13]. When 3D observations are directly available, such as from optical stereo or laser range finders, a normal flow constraint can be expressed directly to estimate rigid body motion [12].
Harville et al. [7] combined the normal flow constraint with a depth gradient constraint to track rigid motion. Gradient-based approaches use color/brightness information during the minimization process and have proved to be accurate for sub-pixel movements [1].

Approach: We developed an integrated tracking approach which jointly aligns images using a normal flow gradient constraint and an ICP algorithm. This new technique is more precise for small movements and noisy depth than ICP alone, and more robust for large movements than the normal flow constraint alone. In our new framework, we integrate an ICP 3D Euclidean error function with a normal flow constraint, creating a hybrid registration error metric that yields a tracker which is both robust and precise. The ICP approach matches points in 4 dimensions (3D + brightness) and minimizes the Euclidean distance between corresponding points. Empirically, we have found that ICP robustly handles coarse motion. The NFC (Normal Flow Constraint) approach matches points based on the inverse calibration parameters and finds the transformation between corresponding points based on their appearance and their 3D position. This method is more precise for small movements since it searches the pose parameter space using a gradient method which can give sub-pixel accuracy. Figure 1 shows how our hybrid tracker iterates a joint error minimization process until convergence. At each iteration, the two error functions are minimized in the same linear system.

Figure 1: Hybrid tracker structure. (Diagram labels: Closest Point, Point-to-Plane, Inverse Calibration, Normal Flow, Minimize, Warp, Check; ICP, NFC, ICP + NFC.)
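
As an illustration of what "minimized in the same linear system" could look like, the following is a minimal sketch and not the authors' implementation. It assumes a small-motion linearization with pose update delta = [w, t] (rotation vector and translation), a point-to-plane ICP residual, a pinhole camera with focal length `f` for the normal flow constraint, and a scalar weight `lam` that blends the two error terms as `lam` and `1 - lam`. All function names, the weighting scheme, and the synthetic data are hypothetical.

```python
import random

def cross(a, b):
    return [a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def icp_row(p, q, n):
    # Point-to-plane residual n . (p + w x p + t - q), linear in delta = [w, t].
    return cross(p, n) + list(n), dot(n, [qi - pi for pi, qi in zip(p, q)])

def nfc_row(X, grad, I_t, f):
    # Brightness constancy grad . u + I_t = 0, where u = J (w x X + t) and
    # J is the Jacobian of the assumed pinhole projection (f*x/z, f*y/z).
    x, y, z = X
    gx, gy = grad
    a = [f * gx / z, f * gy / z, -f * (gx * x + gy * y) / (z * z)]
    return cross(X, a) + a, -I_t

def gauss_solve(M, v):
    # Gaussian elimination with partial pivoting on a copy of [M | v].
    n = len(v)
    A = [row[:] + [v[i]] for i, row in enumerate(M)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(c + 1, n):
            k = A[r][c] / A[c][c]
            for j in range(c, n + 1):
                A[r][j] -= k * A[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (A[i][n] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

def solve_hybrid(icp_rows, nfc_rows, lam):
    # Accumulate the weighted normal equations of both constraint sets,
    # then solve one 6x6 system for the pose update.
    N = [[0.0] * 6 for _ in range(6)]
    r = [0.0] * 6
    for rows, w in ((icp_rows, lam), (nfc_rows, 1.0 - lam)):
        for a, b in rows:
            for i in range(6):
                r[i] += w * a[i] * b
                for j in range(6):
                    N[i][j] += w * a[i] * a[j]
    return gauss_solve(N, r)

def demo(seed=0):
    # Synthetic, noise-free constraints generated from a known small motion.
    rng = random.Random(seed)
    truth = [0.01, -0.02, 0.015, 0.03, -0.01, 0.02]
    w, t = truth[:3], truth[3:]
    icp_rows, nfc_rows = [], []
    for _ in range(30):
        p = [rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(1, 2)]
        n = [rng.uniform(-1, 1) for _ in range(3)]
        v = [c + ti for c, ti in zip(cross(w, p), t)]  # linearized displacement
        q = [pi + vi for pi, vi in zip(p, v)]
        icp_rows.append(icp_row(p, q, n))
        g = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        a, _ = nfc_row(p, g, 0.0, 500.0)
        nfc_rows.append(nfc_row(p, g, -dot(a, truth), 500.0))
    return solve_hybrid(icp_rows, nfc_rows, 0.5), truth
```

Calling `demo()` returns the estimated and the true six-parameter motion for noise-free synthetic constraints. In a full tracker, one would presumably recompute correspondences and gradients after warping with each estimated update and repeat until convergence, as the iteration in Figure 1 suggests.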