Multi-Camera Person Tracking Using an Extended Kalman Filter

B. Merven, F. Nicolls and G. de Jager
Department of Electrical Engineering, University of Cape Town, Private Bag, Rondebosch, 7701, South Africa
Email: {bruno,nicolls,gdj}@dip.ee.uct.ac.za

Abstract

This paper presents some 'work in progress' towards a multi-camera person tracking solution. The tracking system combines observations obtained from one or more cameras with a simple motion model to optimally estimate the location of a person in the monitored scene using an extended Kalman filter. Observations are made in image space, but tracking takes place in world coordinates using the cameras' calibration information. The novelty of this implementation lies in the way the observations used by the Kalman filter are obtained: the observation in image space is made by finding the best match with an RGB-height histogram, assuming an elliptical shape for the person. At this stage, once the system is initialised, a person can be tracked using two cameras at 4-5 frames per second in a Matlab implementation that is robust to prolonged partial occlusions in either or both views simultaneously. Although further testing is required, this implementation looks promising.

Keywords: person tracking, Kalman filter

1 Introduction

Robust person tracking in real time is a difficult task that, if solved, would find applications in surveillance and monitoring. There are different approaches to solving the problem, each making different assumptions about the tracked objects, the scenes, and whether the cameras are static. No solution currently exists with the speed and robustness that a surveillance system requires. However, judging from the literature, treating tracking as a probabilistic estimation problem seems the most promising approach.
Particle filter implementations such as those by [2] and [6], and Kalman filter implementations such as those by [1] and [10], appear to be the most successful.

In this implementation the tracker/estimator is an extended Kalman filter. The Kalman filter is often referred to as an optimal estimator [5]; it is optimal in the sense that it lends itself very well to the problem of combining multiple observations with a dynamic model. The extended Kalman filter allows estimation when the observations and the system dynamics are related non-linearly. This is done at relatively low computational cost compared with the particle filter, which requires many more samples to be evaluated.

The other main divide between approaches is whether tracking takes place in 2-D image space (e.g. [1]) or in a 3-D world view (e.g. [10]). The second approach is suitable when the cameras are static and calibration information is available. Since this is the case for the tracking problem tackled here, we can track in 3-D. This offers several advantages:

(i) motion models with various constraints are easier to construct in world coordinates;
(ii) occlusions are easier to resolve;
(iii) the definition of a common coordinate system for multi-camera tracking configurations is simpler.

A tracked person is modelled as a 3-D ellipsoid of known size with the feet on the ground plane. The ellipsoid is chosen because it always projects onto the image plane as an ellipse, which simplifies matching. Observations in the image plane of each camera are made by comparing ellipse-shaped image samples with the available model for each tracked subject. The two most common approaches are: colour
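The claim that an ellipsoid always projects to an image ellipse can be checked with the standard dual-quadric result from projective geometry: the dual of the image conic is C* = P Q* P^T, where P is the camera matrix and Q* the dual quadric of the ellipsoid. The sketch below is not from the paper; the camera intrinsics and ellipsoid dimensions are made-up illustrative values.

```python
import numpy as np

def dual_ellipsoid(a, b, c, t):
    """Dual quadric of an axis-aligned ellipsoid with semi-axes (a, b, c),
    centred at world point t (translation applied via a 4x4 homography)."""
    Qd0 = np.diag([a**2, b**2, c**2, -1.0])
    T = np.eye(4)
    T[:3, 3] = t
    return T @ Qd0 @ T.T

# Simple calibrated camera at the origin looking down +z (illustrative values)
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0,   0.0,   1.0]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])

# Person-sized ellipsoid, 5 m in front of the camera
Qd = dual_ellipsoid(0.3, 0.3, 0.9, [0.0, 0.0, 5.0])

C_dual = P @ Qd @ P.T        # dual conic of the image outline (3x3)
C = np.linalg.inv(C_dual)    # conic matrix of the outline, up to scale

# A conic x^T C x = 0 is an ellipse iff the top-left 2x2 minor of C is
# positive definite in determinant (sign-invariant under rescaling of C)
is_ellipse = np.linalg.det(C[:2, :2]) > 0
```

Because the whole ellipsoid lies in front of the camera's principal plane here, the outline conic comes out as an ellipse, which is what makes the elliptical image-sampling shape in the observation step well defined.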
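The RGB-height histogram named in the abstract can be sketched as a joint histogram over coarse colour bins and the normalised height of each pixel within the person ellipse. The bin counts and the use of the Bhattacharyya coefficient as the match score below are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def rgb_height_histogram(pixels, heights, n_rgb=8, n_h=4):
    """Joint histogram over coarse RGB bins and normalised height.

    pixels:  (N, 3) uint8 RGB values sampled inside the ellipse
    heights: (N,) height of each sample within the ellipse, in [0, 1]
    The bin counts n_rgb and n_h are illustrative choices.
    """
    data = np.column_stack([pixels.astype(float), heights])
    edges = [np.linspace(0, 256, n_rgb + 1)] * 3 + [np.linspace(0, 1, n_h + 1)]
    hist, _ = np.histogramdd(data, bins=edges)
    return hist / max(hist.sum(), 1)

def bhattacharyya(h1, h2):
    """Similarity between two normalised histograms, in [0, 1];
    1 means identical distributions."""
    return float(np.sum(np.sqrt(h1 * h2)))
```

Under this reading, the candidate ellipse whose histogram scores highest against the stored model for a subject provides the image-space observation that is then fed to the filter.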
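The fusion of a world-coordinate motion model with image-space observations described above can be sketched as one extended Kalman filter cycle. This is a minimal illustration, not the paper's implementation: the state (ground-plane position and velocity), the constant-velocity model, the noise levels, and the 3x4 calibration matrix are all assumed values; the non-linear measurement function is the perspective projection of the person's ground position into one camera, linearised numerically.

```python
import numpy as np

dt = 0.2  # ~5 frames per second, matching the rate reported in the abstract
F = np.array([[1, 0, dt, 0],      # constant-velocity motion model on the
              [0, 1, 0, dt],      # ground plane: state = [x, y, vx, vy]
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)
Q = 0.05 * np.eye(4)              # process noise covariance (assumed)
R = 4.0 * np.eye(2)               # pixel measurement noise (assumed)

P_cam = np.array([[800, 0, 320, 100],   # illustrative 3x4 calibration matrix
                  [0, 800, 240,  50],
                  [0,   0,   1,   2]], dtype=float)

def h(state):
    """Project the ground-plane position (x, y, z=0) into pixel coordinates."""
    X = np.array([state[0], state[1], 0.0, 1.0])
    u = P_cam @ X
    return u[:2] / u[2]

def jacobian_h(state, eps=1e-5):
    """Numerical 2x4 Jacobian of the projection, for the EKF linearisation."""
    H = np.zeros((2, 4))
    for i in range(4):
        d = np.zeros(4)
        d[i] = eps
        H[:, i] = (h(state + d) - h(state - d)) / (2 * eps)
    return H

def ekf_step(x, Pcov, z):
    """One predict/update cycle given a pixel observation z from one camera."""
    # Predict with the motion model
    x_pred = F @ x
    P_pred = F @ Pcov @ F.T + Q
    # Update with the linearised image observation
    H = jacobian_h(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (z - h(x_pred))
    P_new = (np.eye(4) - K @ H) @ P_pred
    return x_new, P_new
```

With several calibrated cameras, the same update can be applied once per available view in each frame, which is one natural way to read the multi-camera fusion described above.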