ROBUST VEHICLE TRACKING IN VIDEO IMAGES BEING TAKEN FROM A HELICOPTER

Fatemeh Karimi Nejadasl, Ben G.H. Gorte, and Serge P. Hoogendoorn

Institute of Earth Observation and Space Systems, Delft University of Technology, Kluyverweg 1, 2629 HS, Delft, The Netherlands
f.KarimiNejadasl, b.g.h.gorte@tudelft.nl

Transport and Planning Section, Delft University of Technology, Stevinweg 1, 2628 CN, Delft, The Netherlands
S.P.Hoogendoorn@tudelft.nl

Commission VII

KEY WORDS: Optical Flow, Tracking, Feature Detection, Matching, Region Based, Feature Based

ABSTRACT:

Measuring positions, velocities and accelerations/decelerations of individual vehicles in congested traffic with standard traffic monitoring equipment, such as inductive loops, is not feasible. The behavior of drivers in different traffic situations, as required for microscopic traffic flow models, is still not sufficiently known. Remote sensing and computer vision technology have recently been used to address this problem. In this study we use video images taken from a helicopter above a fixed point of the highway. We address the problem of tracking the movement of previously detected vehicles through a stabilized video sequence. We combine two approaches, optical flow and matching-based tracking, and improve them by adding constraints and using scale space. Feature elements, i.e. the corners, lines, regions and outlines of each car, are extracted first. Then, optical flow is used to find, for each pixel in the interior of a car, the corresponding pixel in the next image, by inserting the brightness model. Normalized cross-correlation matching is used at the corners of the car. Different pixels are used for solving the aperture problem of optical flow and for the template matching area: neighboring pixels and feature pixels. The image boundary, road line boundaries, the maximum speed of the car, and the positions of surrounding cars are used as constraints.
Ideally, every pixel of a car should yield the same displacement, because cars are rigid objects.

1. INTRODUCTION

Traffic congestion is an important problem in modern society. A lot of money and time is wasted in traffic jams, and car crashes and accidents are more frequent during busy traffic conditions. Several efforts are made to tackle this problem: better facilities and regulations should improve the situation on existing roads, while the road network is extended as well.

Traffic congestion is highly dependent on the behavior of individual drivers. For example, reaction times and lane-changing techniques vary from driver to driver. Therefore it is useful to model the behavior of individual drivers, as well as the interaction between drivers, before new decisions and regulations for traffic congestion control are initiated. Current traffic theories are not yet able to correctly model the behavior of drivers during congested or nearly congested traffic flow while taking individual drivers' behavior into account. For this, so-called microscopic traffic models are needed. Vast amounts of data are required to set up those models and determine their parameters.

Traffic parameter extraction from airborne video data has recently been gaining popularity. Automatic extraction of traffic parameters is a computer vision task. For traffic parameter extraction, information about each vehicle is needed during the period of time the vehicle is present in the scene. A possible solution is to detect a vehicle in a video frame when it enters the scene and then track it in successive frames.

The video is recorded by a camera mounted on a helicopter. Since we want to model the behavior of as many vehicles (drivers) as possible, we attempt to cover a large highway section, leading to the lowest spatial resolution that accuracy requirements allow. Typically we use a spatial resolution (pixel size) between 25 and 50 cm.
Helicopter movement introduces camera motion in addition to object (i.e. vehicle) motion. We have removed camera motion with the methods described in (Hoogendoorn et al., 2003) and (Hoogendoorn et al., 2003). Unwanted areas outside the road boundary are eliminated with the method of (Gorte et al., 2005).

In earlier work, vehicles were detected by a difference method (Hoogendoorn et al., 2003), which requires the involvement of an operator when automatic detection fails. This is often the case for cars having low contrast against the background (dark cars on a dark road surface). We used cross-correlation matching for tracking. This works well in the case of distinct features with homogeneous movements, and it is less sensitive to illumination changes. However, it is too sensitive to similarities in texture or brightness.

To improve the performance of tracking, we investigate the use of optical flow methods in this paper. An improvement with respect to least-squares matching (Atkinson, 1996) is expected because of the additional time element in the optical flow equation. The optical flow method is sensitive to small (even sub-pixel) movements. This sensitivity may be helpful for tracking cars that are similar to the background.

The paper is organized as follows. In Section 2 we present related work. Section 3 discusses the cross-correlation matching method; in Section 4 a gradient-based optical flow method, assuming a constant or linear model of brightness, is discussed. Feature selection and constraints are described in the result redundancy exploitation section. We give results in Section 6 and conclusions in Section 7.

2. RELATED WORK

Automatic object tracking receives attention in computer vision for a very diverse range of applications. Matching methods are widely used in video tracking. As mentioned earlier, they are quite good for distinctive objects.
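To make the cross-correlation tracking idea concrete, the following sketch (a minimal illustration assuming NumPy; the function names and window sizes are ours, not the authors' implementation) matches a small template around a car corner against a search window in the next frame using zero-mean normalized cross correlation:

```python
import numpy as np

def ncc(template, patch):
    """Zero-mean normalized cross correlation between two equal-sized patches."""
    t = template - template.mean()
    p = patch - patch.mean()
    denom = np.sqrt((t ** 2).sum() * (p ** 2).sum())
    return (t * p).sum() / denom if denom > 0 else 0.0

def track_corner(frame0, frame1, corner, tsize=5, search=7):
    """Find the (row, col) displacement of `corner` from frame0 to frame1.

    A (2*tsize+1)^2 template around the corner is compared against every
    candidate position within +/- `search` pixels in the next frame; the
    position with the highest NCC score wins.
    """
    r, c = corner
    tpl = frame0[r - tsize:r + tsize + 1, c - tsize:c + tsize + 1]
    best_score, best_dr, best_dc = -1.0, 0, 0
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            rr, cc = r + dr, c + dc
            patch = frame1[rr - tsize:rr + tsize + 1, cc - tsize:cc + tsize + 1]
            if patch.shape != tpl.shape:
                continue  # candidate window falls outside the image boundary
            score = ncc(tpl, patch)
            if score > best_score:
                best_score, best_dr, best_dc = score, dr, dc
    return (best_dr, best_dc), best_score
```

The zero-mean normalization is what makes the match relatively insensitive to illumination changes, while the exhaustive search over similar-looking patches is exactly where the texture/brightness-similarity sensitivity noted above comes from. Constraints such as the road boundary or a vehicle's maximum speed would shrink the search window.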
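The aperture problem mentioned in the abstract (one brightness-constancy equation per pixel, two unknown velocity components) is classically resolved by pooling the equations of neighboring pixels and solving them in a least-squares sense, as in the Lucas-Kanade method. A minimal sketch under the constant-brightness assumption, again with NumPy and our own hypothetical function names rather than the paper's implementation:

```python
import numpy as np

def local_flow(frame0, frame1, center, half=4):
    """Estimate the (row, col) displacement at `center` from the
    brightness-constancy constraint Ix*u + Iy*v + It = 0, stacked
    over a (2*half+1)^2 neighborhood and solved by least squares."""
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    # central-difference spatial gradients and temporal difference
    Iy, Ix = np.gradient(f0)
    It = f1 - f0
    r, c = center
    win = (slice(r - half, r + half + 1), slice(c - half, c + half + 1))
    # one equation per pixel in the window: [Iy Ix] . [v u]^T = -It
    A = np.stack([Iy[win].ravel(), Ix[win].ravel()], axis=1)
    b = -It[win].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow  # (v, u): displacement in rows and columns
```

Because the constraint is a first-order linearization, this recovers sub-pixel displacements on smooth image patches, which is the sensitivity to small movements that motivates using optical flow for low-contrast cars; large motions would require the scale-space (coarse-to-fine) treatment mentioned in the abstract.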
ISPRS Commission VII Mid-term Symposium "Remote Sensing: From Pixels to Processes", Enschede, the Netherlands, 8-11 May 2006

However