Vehicle Detection and Tracking by Collaborative Fusion Between Laser
Scanner and Camera
Dominique Gruyer, Aurélien Cord and Rachid Belaroussi
Abstract— This paper presents a new approach to fuse 3D
and 2D information in a driver assistance setup, in particular
to perform obstacle detection and tracking. We propose a new
cooperative fusion method between two exteroceptive sensors:
it is able to address highly nonlinear dynamic configurations
without any assumption on the driving maneuver. Information
is provided by a mono-layer laser scanner and a monocular
camera, which are unsynchronized. The initial detection stage
processes the 1D laser data to compute clusters of points
that might correspond to vehicles present on the road. These
clusters are projected onto the image to define targets, which
are then tracked using image registration techniques.
This multi-object association and tracking scheme is implemented
using belief theory, integrating temporal and spatial
information: it allows estimating the dynamic state of the
tracks and monitoring the appearance and disappearance of
obstacles. The accuracy of the method is evaluated on a
publicly available database; the focus is on the relative
localization of the vehicle ahead: estimates of its longitudinal
and lateral distances are analysed.
I. INTRODUCTION
For many on-board automotive driver assistance systems
(DAS), such as collision avoidance, blind spot monitoring,
adaptive cruise control, or parking assistance, robust and
reliable vehicle detection is a critical step. On-road vehicle
detection concerns systems where sensors are mounted on
the vehicle rather than fixed on the infrastructure, such as
the cameras used in traffic monitoring systems [1].
The most common vehicle detection systems use active
sensors: laser, radar, or sonar. Such sensors measure the
distance to objects by timing the round trip of a signal they
emit and that is reflected by the object. Laser scanners are
popular sensors for this purpose [2], [3]: they are usually
mounted on the front bumper and perform a horizontal scan,
so objects are detected on a given horizontal plane (mono-layer).
Data from a laser scanner are easier to cluster than
radar data and are more accurate. Moreover, it is easier to
quantify the reliability and to model the uncertainties of such
data. However, laser sensors fail in some situations, such as a
non-planar road configuration or a varying pitch angle induced
by ego-vehicle maneuvers, accelerations, or road shape
variations (turns, road bumps . . . ). Radars are less subject
to such issues, but their radio waves reverberate off the walls
of a tunnel (waveguide effect); they can also be reflected by
objects that can be safely overridden (a metal plate, a guardrail
or a Botts' dot).
Authors are with IFSTTAR, COSYS, LIVIC, 77 rue des chantiers, F-
78000, Versailles, France, e-mail: dominique.gruyer@ifsttar.fr
Passive sensors such as cameras provide a refined and
more complete view of the environment at a lower cost.
Visual information is also interesting, since recognition of
different kinds of shapes can be performed on video (lane
detection, traffic sign recognition, visual odometry, pedestrian
detection), so an increasing number of DAS already
include one or several on-board cameras. An extensive survey
on visual-based approaches for on-road vehicle detection
and tracking can be found in [4]. Detection methods are
classified into three categories: knowledge-based [5] (edges,
corners, colors, texture), stereo-based [6], [7] (disparity,
inverse perspective mapping) and motion-based [8] (optical
flow).
Systems based solely on computer vision are not powerful
enough to handle complex traffic situations: multiple sensors,
active and passive, are required. They can be used in a
collaborative way as in [7]: a stereoscopic camera rig is
used to validate the targets provided by a laser scanner;
the outputs of the two filtered sensors are then merged by
checking redundancy. In [9], lidar and camera data are
processed to provide a set of targets; the sum rule is used
to combine the classifier outputs. A more elaborate way of
combining a laser rangefinder and a camera is proposed in
[1] for a traffic surveillance application (sensors are fixed on
the infrastructure): the telemetric data are incorporated in
the likelihood function of a particle filter tracking vehicle
motion in the image. In track-to-track fusion systems [10],
each local sensor's data are filtered to provide a list of objects
sent to a central fusion module that fuses all the local object
lists into a single global object list. Local sensor-level tracks
are fused asynchronously using the information matrix fusion
algorithm. In these works, the issue of data association
(identifying which objects of two sensors correspond to the
same target) is not raised.
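As a hypothetical illustration of the information matrix fusion step mentioned above (not the implementation of [10]): in information form, the fused estimate sums the information contributions of the two local tracks and subtracts the common prior, so shared information is not counted twice. A minimal NumPy sketch, with all variable names assumed:

```python
import numpy as np

def information_matrix_fusion(x1, P1, x2, P2, x_prior, P_prior):
    """Fuse two local track estimates (x1, P1) and (x2, P2) that share
    a common prior (x_prior, P_prior). The fused information matrix is
    the sum of the local information matrices minus that of the common
    prior; the same combination applies to the information vectors."""
    Y1 = np.linalg.inv(P1)
    Y2 = np.linalg.inv(P2)
    Yp = np.linalg.inv(P_prior)
    Y_fused = Y1 + Y2 - Yp                       # fused information matrix
    y_fused = Y1 @ x1 + Y2 @ x2 - Yp @ x_prior   # fused information vector
    P_fused = np.linalg.inv(Y_fused)             # back to covariance form
    return P_fused @ y_fused, P_fused
```

When two equally reliable tracks bracket the prior, the fused state lands between them with a covariance smaller than either local one, which is the expected benefit of combining both sensors.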
In this paper, we present a new approach to efficiently
detect and track on-road vehicles using multiple sensors,
namely a laser scanner and a camera: the focus is on
the issue of data association of simultaneous measurements
from multiple sensors. In our approach, detection and
tracking are addressed in a unified framework: targets
coming from laser data processing are used to build and
manage tracks (tracking stage). This tracking step improves
target knowledge by using temporal and spatial
information. With a propagation module, a confidence
index is computed for each track. This index quantifies the
accumulation of temporal evidence about target existence.
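The confidence index can be pictured as a simple temporal accumulator: evidence grows when a track is matched to a detection and decays otherwise, with thresholds governing track confirmation and deletion. The sketch below is an illustrative toy, not the paper's propagation module; the update rule, gains, and thresholds are assumptions:

```python
class TrackConfidence:
    """Toy confidence index for one track: accumulates evidence when
    the track is matched to a detection, decays when it is not, and
    declares the track confirmed or deleted against fixed thresholds.
    All parameter values are illustrative, not the paper's."""

    def __init__(self, gain=0.3, decay=0.2, confirm=0.8, delete=0.1):
        self.conf = 0.5  # start undecided
        self.gain, self.decay = gain, decay
        self.confirm, self.delete = confirm, delete

    def update(self, matched: bool) -> str:
        if matched:
            # move toward 1.0 proportionally to the remaining margin
            self.conf = min(1.0, self.conf + self.gain * (1.0 - self.conf))
        else:
            # geometric decay toward 0.0 when no detection supports the track
            self.conf = max(0.0, self.conf - self.decay * self.conf)
        if self.conf >= self.confirm:
            return "confirmed"
        if self.conf <= self.delete:
            return "deleted"
        return "tentative"
```

With these example parameters, a few consecutive matches confirm a track, while a sustained absence of detections eventually triggers its deletion, mirroring how temporal evidence monitors the appearance and disappearance of obstacles.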
Another issue in the field of vehicle detection and tracking
is the lack of representative benchmarks and evaluation
2013 IEEE/RSJ International Conference on
Intelligent Robots and Systems (IROS)
November 3-7, 2013. Tokyo, Japan
978-1-4673-6357-0/13/$31.00 ©2013 IEEE 5207