Fast and Robust 2D Parametric Image Registration

Eun-Young Elaine Kang, Isaac Cohen and Gérard Medioni
Institute for Robotics and Intelligent Systems
School of Engineering
University of Southern California
{elkang, icohen, medioni}@usc.edu

Abstract

Camera stabilization is an important task for video analysis. In this report, we describe a method for estimating and stabilizing camera motion between two images via 2D image registration. Our approach is based on pair-wise registration of images, parameterized by a 2D affine or projective transformation. Pair-wise image registration estimates the transformation parameters for each pair of consecutive frames, and the transformation between arbitrary frames is computed by concatenating the pair-wise transformations. Pair-wise registration is based on 1) hierarchical parameter estimation and refinement, 2) feature matching, 3) FFT (Fast Fourier Transform)-based global matching, and 4) RANSAC-based parameter estimation.

1. Introduction

Stabilizing camera motion is a necessary task for video processing applications such as video coding and video surveillance. It enables us to focus on other scene analysis tasks (e.g., detecting and tracking moving objects). Here, we describe a fast and robust method to stabilize camera motion between two images. In general, motion stabilization for an arbitrary camera swath involves two technical issues: 1) robust and fast computation and 2) global registration. The first issue arises because many video processing applications demand real-time throughput. The second issue arises because many video analysis tasks need to process multiple frames, which may not be consecutive in time, and global registration is required to process them seamlessly. We proposed a novel approach to solve these issues in [Kang00]. At the request of many readers, this report describes the technical details of our approach to the first issue, which were missing from the previous papers.
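As a minimal sketch of the concatenation step mentioned in the abstract, the Python fragment below composes pair-wise 3x3 transformations into a single transformation between two arbitrary frames. The function names (matmul3, concatenate, apply_transform), the list-of-matrices layout, and the convention that the k-th matrix maps points of frame k+1 into frame k are assumptions made for illustration; they are not taken from the report.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def concatenate(pairwise, i, j):
    """Return the transform that maps frame j into frame i, for i <= j.

    Assumption: pairwise[k] maps homogeneous points of frame k+1 into
    frame k, so the result is pairwise[i] * pairwise[i+1] * ... * pairwise[j-1].
    """
    result = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for k in range(i, j):
        result = matmul3(result, pairwise[k])
    return result


def apply_transform(h, x, y):
    """Apply a 3x3 affine or projective transform to the 2D point (x, y)."""
    xh = h[0][0] * x + h[0][1] * y + h[0][2]
    yh = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return xh / w, yh / w


def translation(tx, ty):
    """Hypothetical helper: a pure translation as a 3x3 matrix."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]
```

For example, with pairwise = [translation(2.0, 3.0), translation(5.0, -1.0)], calling concatenate(pairwise, 0, 2) yields the transform that maps frame 2 into frame 0. An affine transformation fits the same 3x3 form with a last row of (0, 0, 1), so one routine covers both parameterizations named in the abstract.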
We achieve camera motion stabilization by a 2D image registration technique. The registration process is performed pair-wise: it recovers the transformation for each pair of consecutive video frames. The method is based on feature-based hierarchical parameter estimation. It also uses an initial estimate obtained from global matching in the frequency domain, together with RANSAC outlier removal.

2D image registration methods have been intensively studied for the past several years in academia and industry. Most methods use parametric approaches, which recover affine, projective, or higher-order (such as quadratic) parametric models by direct or feature-based error measurement. In the parametric approach, the initial parameters are estimated in various ways, and the parameters are then iteratively refined by minimizing the errors between images. These errors are measured either over all image intensities or over features only. The former is called the direct method and the latter the feature-based method.

McMillan et al. generate a panoramic view for a rotating fixed camera and represent the images with a plenoptic function with respect to the camera center [McMillan95]. Moreover, they generate a view for a virtual viewpoint from a set of constructed panoramic views and simulate the depth by re-sampling the plenoptic function. Szeliski et al. also focus on the mono-centric mosaic, which has a fixed camera rotation axis, and show some extensions for 8-parameter recovery [Szeliski94] [Szeliski97]. The method recovers either the rotation parameters with focal length or the 8 parameters by using the Levenberg-Marquardt method. The limitation of McMillan's work is that it can only create mono-centric panoramic views and requires a well-controlled camera system. Szeliski's work suffers from the local minima problem in the case of large interframe motion. Irani et al.
recover the 2D quadratic transformation based on direct image matching [Irani93] [Irani95] [Kumar95] [Irani96]. They also exploit the parallax information to process the residual information after image alignment, resulting in 3D-corrected mosaic images. Morimoto et al. address how to extract the dominant motion between two frames based on a 2D parametric model [Morimoto97] [Morimoto98]. In [Morimoto97], they employ 3D model-based stabilization, using a Kalman filter to estimate the rotation between frames, and they de-rotate the camera sequence to compose a mosaic. In