Self-Calibration of Traffic Surveillance Camera using Motion Tracking
Tuan Hue Thi, Sijun Lu and Jian Zhang
Abstract— A statistical computer vision approach that uses tracked moving vehicle shapes to auto-calibrate traffic surveillance cameras is presented. The vanishing point of the traffic direction is obtained by linear regression over all tracked vehicle points. A preliminary straightening model is then built to help collect statistics on the typical vehicle class traveling in each particular scene. Analysis of this class eventually yields the complete calibration parameters. Validation against traditional methods at different traffic locations demonstrates desirable accuracy with much greater flexibility and reliability.
I. INTRODUCTION
A. Motivation
In every computer vision problem, and especially in traffic
surveillance, knowledge about the camera is essential for
content retrieval. Many different techniques have been used
to obtain these camera parameters, a process known as
calibration. A common classification divides these techniques
into manual [10][8] and self-calibration [5][3]. The need for
flexible and cost-effective self-calibration has brought many
interesting techniques into computer vision, most of them
positioned along the trade-off between flexibility and
accuracy. However, even the currently popular methods [6][3]
for calibrating a single camera in traffic surveillance still
rely, in one way or another, on certain assumptions about the
scene. To break through these flexibility constraints, we
propose a highly flexible self-calibration technique that can
be used at most traffic locations and has been shown to
produce results as desirable as those of existing
state-of-the-art self-calibration techniques.
B. Related Work
Representative approaches to manual camera calibration use
square patterns in 3D (either on different orthogonal planes
[8] or on one plane rotated about the three coordinate axes
[10]). These are the most accurate approaches, since all the
degrees of freedom necessary for the transformation between
image and world are obtained. However, the high cost of
manually setting up the calibration model makes these
approaches hardly applicable in practical traffic surveillance.
Self-calibration techniques often use vanishing points
available in the scene, together with some image-world
analysis, to calculate the camera's intrinsic and extrinsic
parameters [2]. These methods normally take different views
of an architectural object to generate sufficient vanishing
points in all dimensions. Such approaches, however, have very
limited application in traffic surveillance, where the camera
is fixed and only limited knowledge about the traffic scene
is available.
All authors are with the National ICT of Australia, 223 Anzac Parade,
Kensington NSW 2052, Australia (Jian Zhang; phone: +61 2 8306 0780;
fax: +61 2 8306 0403; email: jian.zhang@nicta.com.au).
The technique described by Cathey and Dailey [3] addresses
these constraints of an uncalibrated camera by using the
vanishing point obtained from lane straightness on a highway,
together with the lane-stripe frequency and the lane width.
The interesting idea in their method is the use of a rough
value of the panning angle to obtain a straightened model of
the road, which in fact proves good enough for generating an
accurate estimate of the camera parameters. The main
disadvantage of this technique, however, is its dependence on
lane markings and existing lane stripes for the calculation.
This makes the technique inapplicable at surveillance
locations where knowledge of the travel lanes is unavailable
or lane stripes are not present.
In [1], Bose and Grimson introduced an interesting idea:
using object tracking and prior knowledge of object
dimensions to rectify images of the ground plane. Although
their work is not a complete calibration process, it points to
another way of solving the image-world transformation,
combining object detection and tracking in the image with
sufficient knowledge of the objects' physical characteristics.
A complete camera calibration using moving objects in the
scene was introduced by Fengjun et al. in [4]. They detect
and track humans walking in different directions to derive
three vanishing points of the scene, and take measurements
from images of humans at leg-crossing phases to derive the
intrinsic and extrinsic parameters of the camera. This
method, however, cannot be applied to our traffic
surveillance problem, where vehicles normally travel along
only one line, and knowledge about the vehicles on the road
is much vaguer than for humans: typically, many kinds of
vehicles of different sizes travel on the same road.
C. Our Approach
The description and experimental results of our proposed
method are structured in this paper as follows. We first
detect motion blobs in the traffic scene using the motion
detection algorithm described by Wang et al. [9]. These blobs
are then passed through several filters to eliminate noise
and smooth the measurements, before being fed into a linear
Kalman filter for tracking. The history of all these tracking
records then allows us to produce one trajectory for each
blob using linear regression. A
Proceedings of the 11th International IEEE Conference on
Intelligent Transportation Systems, Beijing, China,
October 12-15, 2008. 1-4244-2112-1/08/$20.00 ©2008 IEEE
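The trajectory step above, fitting one line per tracked blob by linear regression and intersecting the lines to locate the vanishing point of the traffic direction, can be sketched as follows. This is a minimal illustration under our own assumptions, not the paper's implementation: each track is assumed to be a list of (x, y) blob centroids, lines are fitted by total least squares, and the vanishing point is taken as the least-squares intersection of all fitted lines.

```python
import numpy as np

def fit_track_line(points):
    """Fit a 2D line a*x + b*y + c = 0 to one vehicle's tracked
    centroids via total least squares (PCA on the point cloud)."""
    pts = np.asarray(points, dtype=float)
    mean = pts.mean(axis=0)
    # Principal direction of the centered points = travel direction.
    _, _, vt = np.linalg.svd(pts - mean)
    direction = vt[0]
    normal = np.array([-direction[1], direction[0]])  # (a, b)
    c = -normal.dot(mean)
    return normal[0], normal[1], c

def vanishing_point(tracks):
    """Least-squares intersection of all fitted trajectory lines:
    stack a*x + b*y = -c over tracks and solve for (x, y)."""
    lines = [fit_track_line(t) for t in tracks]
    A = np.array([[a, b] for a, b, _ in lines])
    rhs = np.array([-c for _, _, c in lines])
    vp, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return vp

# Synthetic example: three straight tracks radiating from (100, 50),
# so their fitted lines should intersect at that point.
vp_true = np.array([100.0, 50.0])
tracks = []
for angle in (0.3, 0.5, 0.8):
    d = np.array([np.cos(angle), np.sin(angle)])
    ts = np.linspace(5, 60, 20)
    tracks.append(vp_true + ts[:, None] * d)
print(np.round(vanishing_point(tracks), 1))  # → [100.  50.]
```

With noisy real tracks the intersection is overdetermined, which is why the sketch solves it in the least-squares sense rather than intersecting two lines exactly; a robust variant would weight or RANSAC-filter the tracks first.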