OPTIMIZATION BASED IMAGE REGISTRATION IN THE PRESENCE OF MOVING OBJECTS

F. Karimi Nejadasl 1, B. G. H. Gorte 1, Serge P. Hoogendoorn 2, and M. Snellen 1

1 Institute of Earth Observation and Space System, Delft University of Technology
Kluyverweg 1, 2629 HS, Delft, The Netherlands
Tel: +31 15 27 88337, Fax: +31 15 27 82348
F.KarimiNejadasl@tudelft.nl, B.G.H.Gorte@tudelft.nl

2 Transport and planning section, Delft University of Technology
Stevinweg 1, 2628 CN, Delft, The Netherlands
S.P.Hoogendoorn@tudelft.nl

KEY WORDS: Registration, Optimization, Differential Evolution, Nelder-Mead, 3D Euclidean

ABSTRACT:

To increase robustness in the registration of image sequences, we investigate a featureless method. This paper formulates the registration problem as the optimization of an energy function between a reference image and a transformed target image. The parameters are estimated using a global optimizer, Differential Evolution, followed by a local optimizer, Nelder-Mead. Our experiments show that the proposed algorithm performs robustly on a large variety of image content, from roads with almost empty surroundings to more cluttered ones, and from simple road shapes to more complex ones.

1 INTRODUCTION

This paper describes an algorithm to co-register the images in an image sequence that is recorded from a non-stable platform, in this case a helicopter hovering above a highway. Such image sequences are used to collect statistics on driver behavior in busy (nearly congested) traffic. Typically, we record highway stretches with a length of 300-500 m for one hour or more. We use a b/w camera with 1300 × 1030 pixels, which gives a ground resolution of approximately 25-40 cm, at a frame rate of 10 fps.

Because of the large data volumes, we aim at fully automatic image analysis, which means that for each vehicle in the scene we record the position on the highway as a function of time (in 0.1 s increments).
The accuracy and precision of the recordings have to be such that reliable estimates can be derived for the speeds, accelerations/decelerations and reaction times (when does a driver start to brake after the predecessor does?).

An important step is the co-registration of all images in the sequence. Turbulence generated by a helicopter hovering at one position causes random movements that severely degrade platform stability. The pilot has the difficult task of preventing the helicopter from (slowly) drifting away from the intended position, and she/he cannot control the random movements, which cause misalignments between the images in the sequence. From a typical flying height in the order of 400 m and with a viewing angle of about 55 degrees, consecutive images (recorded at 0.1 s intervals) show misalignments of up to 3 m. Misalignment is caused by the combined effects of helicopter translations (in the x, y and z directions) and rotations (around the x, y and z axes).

A gyro stabilizer on the camera dampens the rotor or motor vibrations and the turbulence caused by the helicopter while it is hovering. However, it cannot provide a stabilized image sequence over a long period, for example half an hour, because movements accumulate over time. Moreover, the gyro stabilizer cannot prevent the effects of slower rotations or of (any) translation. Therefore we opted for a software post-processing solution to obtain fully automatic co-registration.

In this solution the image sequence is processed frame by frame, starting from a reference frame, which (for simplicity) we will assume to be frame 1 in the sequence. Assuming that frames 2, ..., i are already registered to frame 1, which means that the transformation $T_{i,1}$ between frame i and frame 1 is known, we process frame i+1 in three steps as follows:

1. compute $T_{i+1,i}$, the transformation between image intensity $I_{i+1}$ and image intensity $I_i$
2. compute $E_{i+1,1} = T_{i,1} T_{i+1,i}$, the estimated transformation between $I_{i+1}$ and $I_1$
3. use $E_{i+1,1}$ as the initial value for computing $T_{i+1,1}$

This strategy prevents registration errors from accumulating. Matching consecutive images (step 1) is easier (i.e. less error-prone) than matching arbitrary images, since the misalignment is limited. In step 3, this problem is avoided by providing an accurate approximate value to the matching process. The first step is still the most demanding one and is the main focus of the paper: how to register consecutive images from a sequence.

Usually, the transformation between two images is calculated from common features, which first have to be identified by using, for example, an interest operator such as the Förstner operator or SIFT (Lowe, 2004). In our case there are moving objects in the scene that confuse this process; common features should be selected only from fixed objects. Automatic techniques require a good distribution of features and, moreover, a classification of moving and fixed objects (Kang et al., 2005; Pless et al., 2000). Having a good feature distribution is highly dependent on image content.
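The three-step chaining strategy described earlier in this section can be illustrated with a minimal sketch. It assumes, purely for illustration, that each transformation is a 3x3 homogeneous matrix acting on column vectors, so that the product $T_{i,1} T_{i+1,i}$ maps frame i+1 coordinates into frame 1. The names `matmul3`, `translation`, `chain_registration` and the `refine` callback are hypothetical and do not come from the paper; `refine` merely stands in for the direct match against frame 1 in step 3.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def translation(tx, ty):
    """Homogeneous 3x3 matrix of a pure 2D shift (toy stand-in for T)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


IDENTITY = translation(0.0, 0.0)


def chain_registration(pairwise, refine):
    """Return [T_{1,1}, T_{2,1}, ...] given the pairwise transformations.

    pairwise[k] is T_{k+2,k+1}: it maps frame k+2 coordinates into frame k+1
    (the result of step 1). refine(estimate, frame_number) is a hypothetical
    hook for step 3; it should match the frame directly against frame 1,
    starting from the estimate.
    """
    to_reference = [IDENTITY]
    for k, t_pair in enumerate(pairwise):
        # Step 2: E_{i+1,1} = T_{i,1} T_{i+1,i}
        estimate = matmul3(to_reference[-1], t_pair)
        # Step 3: refine the estimate against the reference frame
        to_reference.append(refine(estimate, k + 2))
    return to_reference


# Toy run: three pairwise shifts, with a refinement that accepts the estimate as is.
shifts = [translation(1.0, 0.0), translation(0.0, 2.0), translation(-3.0, 1.0)]
registered = chain_registration(shifts, refine=lambda estimate, frame: estimate)
```

With a pass-through `refine`, as in the toy run, the scheme reduces to plain concatenation of pairwise transformations, where errors would accumulate. The benefit described in the text comes only when `refine` actually re-matches each frame against frame 1.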