TOPOLOGY-INDEPENDENT REGION TRACKING WITH LEVEL SETS

Abdol-Reza Mansouri, Antoine Olivier, and Janusz Konrad

INRS-Télécommunications, Institut National de la Recherche Scientifique
Place Bonaventure, P.O. Box 644, Montréal, Québec, Canada, H5A 1C6

In Proc. Int. Conf. on Image Processing, ICIP-2000, Sep. 10-13, 2000, Vancouver, BC, Canada

© 2000 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

ABSTRACT

This paper presents a new approach to the tracking of regions in an image sequence. Unlike most other methods, the proposed approach can handle topology changes, i.e., regions may split or merge. This flexibility is naturally embedded into a partial differential equation that solves a minimum description length (MDL) estimation problem. The basic estimation criterion consists of only two terms: the description length of the region shape mismatch and the description length of the region itself, but we show possible extensions to this basic formulation. We minimize the MDL criterion using the level set methodology that inherently accounts for topology changes. We show results for natural data with natural as well as synthetic motion.

1. PROBLEM STATEMENT

We address the problem of tracking an arbitrary region in a sequence of images in a topology-independent fashion. Such a problem is of importance in applications ranging from region-based video coding (e.g., MPEG-4) to video surveillance and video database search.

To date, tracking of image regions has been addressed using a variety of approaches. In the simplest one, the region to be tracked is assumed to undergo simple translational motion from every frame to the next in the sequence.
In this case, techniques based on cross-correlation are sufficient to track the region of interest. However, due to the motion model used, the region cannot change shape between frames. In a more sophisticated approach, the region to be tracked and its motion are both defined parametrically, and a number of feature points on the region are tracked from frame to frame using simple correlation-based measures. With enough feature points, the region of interest can be tracked reliably. Another approach views region tracking as a by-product of boundary detection; it is assumed that the region displacement is small, and that the region has a large contrast against the background. Region tracking is then performed using edge detection.

In all of these approaches, the region to be tracked retains its topology; if it is connected, it stays connected throughout the tracking process, and conversely, if it is made up of disjoint connected subsets, it will remain so throughout. Only recently has a topology-independent approach to tracking been proposed [1]. In this approach, the region to be tracked is represented as the positive part of some smooth function (and hence its boundary as the zero level set of that function), and the tracking problem reduces to solving a particular set of coupled partial differential equations, one of which computes motion flow. Although the algorithm is topology-independent, its major drawback is that it cannot use a pre-computed motion field for tracking and uses instead a simple partial differential equation to deform two images into one another. Such a procedure may work well for a very small range of motion, but is bound to fail as soon as the motion becomes significant.

This work was supported by the Natural Sciences and Engineering Research Council of Canada under Strategic Grant STR224122.
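The level set representation mentioned above (a region as the positive part of a function, its boundary as the zero level set) can be illustrated with a small sketch. This is not the authors' algorithm or the method of [1]; it only shows why the representation is indifferent to topology. The grid size, the disc radii, the helper names (`disc_level_set`, `two_disc_function`, `count_components`) and the use of a pointwise maximum to form a union are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical 64x64 pixel grid; yy holds row coordinates, xx column coordinates.
yy, xx = np.mgrid[0:64, 0:64].astype(float)

def disc_level_set(cx, cy, r):
    """Function that is positive strictly inside the disc of radius r centred at (cx, cy)."""
    return r - np.sqrt((xx - cx) ** 2 + (yy - cy) ** 2)

def two_disc_function(gap):
    """Level set function whose positive part is the union of two discs
    of radius 10 whose centres are `gap` pixels apart."""
    left = disc_level_set(32 - gap / 2, 32, 10)
    right = disc_level_set(32 + gap / 2, 32, 10)
    # The pointwise maximum is positive wherever either function is positive,
    # so its positive part is the union of the two discs.
    return np.maximum(left, right)

def count_components(mask):
    """Count 4-connected components of a boolean mask with an explicit flood fill."""
    mask = mask.copy()
    h, w = mask.shape
    n = 0
    for i in range(h):
        for j in range(w):
            if not mask[i, j]:
                continue
            n += 1
            mask[i, j] = False
            stack = [(i, j)]
            while stack:
                a, b = stack.pop()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    u, v = a + da, b + db
                    if 0 <= u < h and 0 <= v < w and mask[u, v]:
                        mask[u, v] = False
                        stack.append((u, v))
    return n

# The region is always read off the same way: the set where the function is positive.
far = count_components(two_disc_function(30) > 0)   # discs well separated
near = count_components(two_disc_function(10) > 0)  # discs overlapping
print(far, near)
```

With the values chosen here, the separated discs give two connected components and the overlapping discs give one, although the function is handled identically in both cases. A parametric contour would need explicit logic to detect and perform such a merge.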
In this paper, we first consider topology-independent region tracking based on a pre-computed motion field, and then present an algorithm for joint motion estimation and tracking in image sequences. The regions to be tracked need not have any distinguishing features, and furthermore can have arbitrary shapes and can freely vary their topology over the course of the sequence. We formulate the problem in terms of minimum description length (MDL) estimation. The cost functional is expressed as a precise description length measure and is minimized over the space of regions with smooth boundary. The necessary conditions for a minimum, i.e., for a solution to our tracking problem, are given by partial differential equations. We formulate the level set equivalent of these partial differential equations, in which each region to be tracked corresponds to the support of the positive part of a real-valued function. The level set representation provides the capability to handle variations in region topology and allows regions to split and merge during tracking.

2. PROPOSED APPROACH

Let $\{I_n, I_{n+1}\}$ be images at time instants $n$ and $n+1$, with common domain $\Omega$ (an open subset of $\mathbb{R}^2$). We assume that an estimate of the motion transformation $\phi^{+}$ relating $I_n$ to $I_{n+1}$ (forward motion), as well as that of $\phi^{-}$ relating $I_{n+1}$ to $I_n$ (backward motion), are both known. We thus assume $I_n(p) = I_{n+1}(\phi^{+}(p)),\ \forall p \in \Omega$ and $I_{n+1}(p) = I_n(\phi^{-}(p)),\ \forall p \in \Omega$. The assumption that $\phi^{+}$ and $\phi^{-}$ are invertible is not a strong assumption since we are assuming that our images are defined over the real plane or an open subset thereof. This will allow us to analytically derive partial differential equations that we will later discretize.

2.1. Formulation of tracking as MDL estimation

Let $R_0 \subset \Omega$ be a region in the image at time $n$ ($I_n$) and let $R \subset \Omega$ be the same (unknown) region that we seek in the image at time $n+1$ ($I_{n+1}$).
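To make the direction of the two motion transformations concrete, the following sketch checks both relations on a synthetic pair of images and forward-projects a region. It assumes a pure integer translation with periodic (wrap-around) boundaries so that the relations hold exactly on a discrete grid; the displacement `d`, the image size and the names `phi_plus`, `phi_minus` and `R0` are hypothetical choices for illustration, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 32
d = (3, 5)  # hypothetical displacement (rows, cols) between frames n and n+1

I_n = rng.random((N, N))
# Frame n+1 is frame n shifted by d, with wrap-around: I_next[q] = I_n[q - d].
I_next = np.roll(I_n, shift=d, axis=(0, 1))

rows, cols = np.mgrid[0:N, 0:N]

def phi_plus(r, c):
    """Forward motion: where a point of frame n ends up in frame n+1."""
    return (r + d[0]) % N, (c + d[1]) % N

def phi_minus(r, c):
    """Backward motion: where a point of frame n+1 came from in frame n."""
    return (r - d[0]) % N, (c - d[1]) % N

# I_n(p) = I_{n+1}(phi_plus(p)) for every p in the domain.
forward_ok = np.array_equal(I_n, I_next[phi_plus(rows, cols)])
# I_{n+1}(p) = I_n(phi_minus(p)) for every p in the domain.
backward_ok = np.array_equal(I_next, I_n[phi_minus(rows, cols)])

# A region R0 in frame n, stored as a boolean mask, and its forward projection.
R0 = np.zeros((N, N), dtype=bool)
R0[8:16, 10:20] = True
projected = np.zeros_like(R0)
projected[phi_plus(rows[R0], cols[R0])] = True  # the set {phi_plus(p) : p in R0}

print(forward_ok, backward_ok, int(projected.sum()) == int(R0.sum()))
```

For this idealized translation the projected region is simply a shifted copy of `R0`. With an estimated, spatially varying motion field the projected set need not be this well behaved.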
It is not sufficient, in general, to perform a forward projection of $R_0$, i.e., to compute $\phi^{+}(R_0)$, since the latter may be highly irregular. We thus wish to approximate $\phi^{+}(R_0)$ using as regular a region as possible. In what follows, the notion of region regularity will be described and quantified. Using $L$ to denote a description length function, the minimum description length estimate $\hat{R}$ of the region $R$ given the flow $\phi^{+}$ and the region $R_0$ is obtained by minimizing the description length function $L(R/\phi^{+}(R_0))$ with respect to $R$. We have

$$L(R/\phi^{+}(R_0)) = L(\phi^{+}(R_0)/R) + L(R), \qquad (1)$$