Adaptive Multiple Object Tracking Using Colour and Segmentation Cues

Pankaj Kumar, Michael J. Brooks, and Anthony Dick

University of Adelaide, School of Computer Science, South Australia 5005
pankaj.kumar@adelaide.edu.au, michael.brooks@adelaide.edu.au, anthony.dick@adelaide.edu.au

Abstract. We consider the problem of reliably tracking multiple objects in video, such as people moving through a shopping mall or airport. In order to mitigate difficulties arising as a result of object occlusions, mergers and changes in appearance, we adopt an integrative approach in which multiple cues are exploited. Object tracking is formulated as a Bayesian parameter estimation problem. The object model used in computing the likelihood function is incrementally updated. Key to the approach is the use of a background subtraction process to deliver foreground segmentations. This enables the object colour model to be constructed using weights derived from a distance transform operating over foreground regions. Results from foreground segmentation are also used to gain improved localisation of the object within a particle filter framework. We demonstrate the effectiveness of the approach by tracking multiple objects through videos obtained from the CAVIAR dataset.

1 Introduction

Reliably tracking multiple objects in video remains a highly challenging and unsolved problem. If, for example, we aim to track several people in an airport or shopping mall, we face difficulties associated with appearance and scale changes as each person moves around. Compounding this are occlusion problems that can arise when people meet or pass by each other. This paper is concerned with improving the reliability of multiple object tracking in surveillance video.

Visual tracking of multiple objects is formulated in this work as a parameter estimation problem.
Parameters describing the state of the object are estimated using a Bayesian technique where the constraints of Gaussianity and linearity do not apply. In Bayesian estimation, the posterior probability density function (pdf) $p(X_t \mid Z_T)$ of the state vector $X_t$ given a set of observations $Z_T$ obtained from the camera is computed at every step, as new observations become available. Many tracking algorithms with a fixed object model have already been designed [1], [2]. However, trackers with a fixed object model are typically unable to track objects for long because of changes in lighting conditions, pose, scale and viewpoint, and also because of camera noise.

Y. Yagi et al. (Eds.): ACCV 2007, Part I, LNCS 4843, pp. 853–863, 2007. © Springer-Verlag Berlin Heidelberg 2007
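
The recursive Bayesian estimation described in the introduction, where the posterior is recomputed as each new observation arrives without assuming Gaussianity or linearity, can be illustrated with a generic bootstrap particle filter. The sketch below is not the tracker proposed in this paper: it assumes a one-dimensional state, a random-walk motion model and a Gaussian likelihood purely for illustration, whereas the paper builds its likelihood from colour and segmentation cues. The names `gaussian_pdf`, `particle_filter_step`, `motion_std` and `obs_std` are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian, used here as a stand-in likelihood (assumption)."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def particle_filter_step(particles, observation, rng,
                         motion_std=1.0, obs_std=2.0):
    """One predict/weight/resample cycle of a bootstrap particle filter.

    particles: list of floats approximating the previous posterior.
    Returns (resampled_particles, weighted_mean_estimate).
    """
    # Predict: propagate each particle through a random-walk motion model.
    predicted = [p + rng.gauss(0.0, motion_std) for p in particles]

    # Weight: evaluate the likelihood of the observation for each particle.
    weights = [gaussian_pdf(observation, p, obs_std) for p in predicted]
    total = sum(weights)
    if total == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0 / len(predicted)] * len(predicted)
    else:
        weights = [w / total for w in weights]

    # Estimate: weighted mean of the predicted particles.
    estimate = sum(w * p for w, p in zip(weights, predicted))

    # Resample: draw particles in proportion to their weights.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))
    return resampled, estimate


if __name__ == "__main__":
    rng = random.Random(0)
    # Start with no knowledge of the position: particles spread uniformly.
    particles = [rng.uniform(0.0, 100.0) for _ in range(500)]
    true_position = 20.0
    for _ in range(30):
        true_position += 1.0  # the target drifts one unit per frame
        observation = true_position + rng.gauss(0.0, 2.0)
        particles, estimate = particle_filter_step(particles, observation, rng)
    print(f"true position {true_position:.1f}, estimate {estimate:.2f}")
```

Because the posterior is represented by samples rather than by a mean and covariance, the same loop would apply to a multi-modal or non-Gaussian likelihood; only the weighting step would need to change.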