Self-Calibration of Traffic Surveillance Camera using Motion Tracking

Tuan Hue Thi, Sijun Lu and Jian Zhang

All authors are with National ICT Australia (NICTA), 223 Anzac Parade, Kensington NSW 2052, Australia (Jian Zhang; phone: +61 2 8306 0780; fax: +61 2 8306 0403; email: jian.zhang@nicta.com.au).

Abstract— A statistical and computer vision approach to auto-calibrating traffic surveillance cameras using the shapes of tracked moving vehicles is presented. The vanishing point of the traffic direction is obtained by Linear Regression over all tracked vehicle points. A preliminary straightening model is then built to help collect statistics on the typical vehicle class traveling in each particular scene. Analysis of this class eventually yields the complete set of calibration parameters. Validation against traditional methods at different traffic locations demonstrates comparable accuracy with considerably more flexibility and reliability.

I. INTRODUCTION

A. Motivation

In every computer vision problem, especially in traffic surveillance, knowledge about the camera is essential for the purpose of content retrieval. Many different techniques have been used for obtaining these camera parameters, a process known as calibration. A common classification divides these techniques into manual [10][8] and self-calibration [5][3] methods. The need for flexible and cost-effective self-calibration has brought many interesting techniques to the computer vision community, most of them positioned along the trade-off between flexibility and accuracy. However, even the currently popular methods [6][3] for calibrating a single camera in traffic surveillance still rely, in one way or another, on certain assumptions about the scene location. To break through these flexibility constraints, we propose a highly flexible self-calibration technique that can be used at most traffic locations and has been shown to produce results as accurate as existing state-of-the-art self-calibration techniques.

B. Related Works

A representative approach to manual camera calibration uses square patterns in 3D (either on different orthogonal planes [8] or on one plane rotated about the three coordinate axes [10]). These are the most accurate approaches, since all the degrees of freedom necessary for the transformation between image and world are recovered. However, the high cost of manually setting up the calibration model makes these approaches hardly applicable in practical traffic surveillance.

Self-calibration techniques often use vanishing points available in the scene, together with some image-world analysis, to calculate the camera's intrinsic and extrinsic parameters [2]. These methods normally take different views of an architectural object to generate sufficient vanishing points in all dimensions. Such approaches, however, have very limited application in traffic surveillance, where the camera is fixed and only limited knowledge about the traffic scene is available.

The technique described by Cathey and Dailey [3] addresses these constraints of an uncalibrated camera by using the vanishing point obtained from lane straightness on a highway, together with the lane-stripe frequency and lane width. The interesting idea in their method is the use of a rough value of the panning angle to obtain a straightened model of the road, which in fact proves good enough to generate an accurate estimate of the camera parameters. The main disadvantage of this technique, however, is its dependence on lane markings and existing lane stripes for the calculation. This makes it inapplicable at surveillance locations where the travel lanes are unknown or lane stripes are not present.
Bose and Grimson [1] introduced an interesting idea of using object tracking and prior knowledge about object dimensions to rectify images of the ground plane. Although this work is not a complete calibration process, it points to another way of solving the image-world transformation: object detection and tracking in the image, combined with sufficient knowledge of the objects' physical characteristics.

A complete camera calibration using moving objects in the scene was introduced by Fengjun et al. [4], who detect and track humans walking in different directions to derive three vanishing points of the scene, together with image measurements of humans at the leg-crossing phase, to derive the camera's intrinsic and extrinsic parameters. This method, however, cannot be applied to our problem of traffic surveillance, where vehicles normally travel in only one direction, and knowledge about the vehicles on the road is much vaguer than that about humans, since typically many kinds of vehicles with different sizes run on the same road.

C. Our Approach

The description and experimental results of our proposed method are structured in this paper as follows. We first detect motion blobs in the traffic scene using the motion detection algorithm described by Wang et al. [9]. These blobs are then passed through several filters to eliminate noise and smooth the measurements, before being fed into a Linear Kalman Filter for tracking. The history of all these tracking records then lets us produce one trajectory for each blob using Linear Regression. A

Proceedings of the 11th International IEEE Conference on Intelligent Transportation Systems, Beijing, China, October 12-15, 2008. 1-4244-2112-1/08/$20.00 ©2008 IEEE
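The trajectory step above — fitting one line per tracked blob and intersecting the lines to locate the traffic-direction vanishing point — can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: `fit_trajectory` and `vanishing_point` are hypothetical names, tracked centroids are assumed to be already available, and total least squares stands in for the paper's Linear Regression.

```python
# Sketch: vanishing point of the traffic direction from tracked blob
# trajectories. Assumes each blob yields a list of 2D centroid positions.
import numpy as np

def fit_trajectory(points):
    """Fit a 2D line a*x + b*y + c = 0 to one blob's tracked centroids
    via total least squares (principal axis of the centered points)."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # First right-singular vector = dominant direction of the point cloud.
    _, _, vt = np.linalg.svd(pts - centroid)
    direction = vt[0]
    a, b = -direction[1], direction[0]          # normal to the direction
    c = -(a * centroid[0] + b * centroid[1])    # line passes through centroid
    return a, b, c

def vanishing_point(lines):
    """Least-squares intersection of lines a*x + b*y + c = 0:
    solve the overdetermined system [a b] @ [x, y] = -c."""
    A = np.array([[a, b] for a, b, _ in lines])
    rhs = -np.array([c for _, _, c in lines])
    vp, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return vp

# Toy check with two synthetic trajectories that converge at (100, 50).
tracks = [
    [(0, 0), (50, 25), (80, 40)],      # on the line y = x/2
    [(0, 100), (50, 75), (80, 60)],    # on the line y = 100 - x/2
]
lines = [fit_trajectory(t) for t in tracks]
print(vanishing_point(lines))  # ≈ [100. 50.]
```

With real tracks the per-trajectory fit absorbs measurement noise, and the least-squares intersection pools evidence from every vehicle rather than relying on painted lane markings.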