Obstacle Detection and Classification fusing Radar and Vision

M. Bertozzi, L. Bombini, P. Cerri and P. Medici
VisLab – Dipartimento di Ingegneria dell'Informazione
Università degli Studi di Parma, ITALY
http://vislab.it
{bertozzi,bombini,cerri,medici}@vislab.it

P. C. Antonello and M. Miglietta
CRF – Centro Ricerche Fiat
I-10043 Orbassano (TO), ITALY
maurizio.miglietta@crf.it
pierclaudio.antonello@crf.it

Abstract— This paper presents a system whose aim is to detect and classify road obstacles, such as pedestrians and vehicles, by fusing data coming from different sensors: a camera, a radar, and an inertial sensor. The camera is mainly used to refine the boundaries of the vehicles detected by the radar and to discard those that might be false positives; at the same time, a symmetry-based pedestrian detection algorithm is executed, and its results are merged with a set of regions of interest provided by a motion stereo technique. Tests have been performed in several environments and traffic situations; the results show that the vision-based filtering provides an effective reduction of the radar's false positives, and that the regions of interest detected by the motion stereo algorithm markedly improve the pedestrian detector's performance while keeping the number of detection errors low. The system was shown during the APALACI-PReVENT European IP final demonstration¹ in September 2007 in Versailles (France).

I. INTRODUCTION

This paper describes an obstacle detection and classification system that uses different methods to detect regions of interest. It exploits a vehicle detection algorithm, based on the fusion of camera images and radar data [1], [2], to detect vehicles, while a pedestrian detection algorithm [3], [4] is exploited to detect the presence of potential pedestrians; finally, a motion stereo technique is also used to find obstacles and to refine the pedestrian detection results.
Radar is robust against bad weather, rain, and fog; it can measure the speed and distance of an object, but it does not provide enough data points to detect obstacle boundaries, and experimental results show that radar is not reliable for detecting small obstacles like pedestrians. Vision-based systems can compensate for this lack of localization and, moreover, other tasks can be performed using the same sensor.

Some vision-based systems [5] for obstacle avoidance exploit stereo sensors. They perform a 3D reconstruction of the scene through the triangulation of homologous points. With a specially calibrated setup, or using image rectification, it is possible to search for the same feature along the same row of the image pair with good performance and low computational cost.

¹ The work described in this paper has been developed in the framework of the Integrated Project APALACI-PReVENT, a research activity funded by the European Commission to contribute to road safety by developing and demonstrating preventive safety technologies and applications.

However, the engineering of stereo-based systems on vehicles is complex, due to the high cost, the connections between the cameras and the computation engine, and miscalibration issues. A potential solution to this problem is motion stereo: a technique that recovers three-dimensional information from motion, as binocular stereo vision does from two viewpoints. Two different approaches can be used to perform motion stereo: 3D reconstruction and warped image comparison. In the first approach, points of interest are tracked and matched over the frames: they can be chosen and tracked using the Kanade-Lucas-Tomasi technique [6] or, simply, using optical flow on strong edges, for example corners [7], [8]. Under the assumption of a static world, it is possible to extract the vehicle's ego-motion and to obtain a 3D scene reconstruction.
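The row-wise search on a rectified pair yields a disparity, from which depth follows by triangulation. A minimal sketch of this relation, with illustrative parameters (focal length f in pixels, baseline B in meters) that are not taken from the paper:

```python
# Minimal sketch of depth recovery from a rectified image pair.
# Illustrative parameters: focal length f_px in pixels, baseline_m in meters;
# u_left/u_right are the columns of a matched feature on the same image row.

def depth_from_disparity(u_left, u_right, f_px, baseline_m):
    """Triangulate depth Z = f * B / d for a rectified pair.

    Rectification guarantees homologous points share the same image row,
    so the correspondence search is 1-D and the disparity is a simple
    column difference.
    """
    d = u_left - u_right          # disparity in pixels
    if d <= 0:
        raise ValueError("non-positive disparity: point at or beyond infinity")
    return f_px * baseline_m / d  # depth in meters

# Example: f = 700 px, B = 0.5 m, disparity of 10 px -> Z = 35 m
z = depth_from_disparity(u_left=410.0, u_right=400.0, f_px=700.0, baseline_m=0.5)
print(z)  # 35.0
```

In motion stereo the same geometry applies, except that the baseline is the ego-vehicle displacement between two frames rather than a fixed inter-camera distance, which is why reliable odometry matters.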
The difficulty of tracking reliable features, and the subsequent error propagation, decrease the performance of this method. This approach performs well for recovering Structure from Motion (SFM), as in park-assist systems, but is generally not used in more dynamic scenes such as motorways. An improvement of this approach is described in [6], where a vision-radar fusion is developed and the radar is used to classify features as associated with static or dynamic obstacles.

In the second approach, the ego-motion is instead computed from the rotation and translation parameters provided by the inertial sensors, so feature tracking is no longer needed; however, this approach is quite fragile, since it relies heavily on accurate camera calibration and good odometric data to provide reliable results. Aubert et al. [9] proposed a motion stereo approach for obstacle detection using warped images, each of which is compared at every cycle against the previous one. Since the warped images are computed under the flat-world assumption and the ego-motion compensation is applied using the data provided by odometry and a gyroscope, the differences between the two images are attributable to vertical objects that do not satisfy the initial assumption; in this way it is possible to compute a V-disparity-like image in order to detect obstacles. In Batavia et al. [10], only edges are warped and the predicted position is compared with the current one. Pitch fluctuations produced by vehicle vibration are rejected using edge tracking, called 1-dimensional optic flow.

This paper presents a different approach for the motion

2008 IEEE Intelligent Vehicles Symposium, Eindhoven University of Technology, Eindhoven, The Netherlands, June 4-6, 2008. 978-1-4244-2569-3/08/$20.00 ©2008 IEEE.
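The flat-world warping behind the approach of Aubert et al. [9] can be sketched in a simplified setting. The assumptions below (pinhole camera at height h over a flat road, zero pitch and roll, pure forward translation dz between frames, horizon at row v0, focal length f in pixels) are illustrative and not the paper's actual configuration:

```python
# Sketch of flat-world ego-motion compensation for warped image comparison.
# Illustrative assumptions: pinhole camera at height h over a flat road,
# zero pitch/roll, pure forward translation dz, horizon at row v0, focal
# length f_px in pixels. Names are hypothetical.

def ground_depth_from_row(v, v0, f_px, h):
    """Invert the ground-plane projection v = v0 + f*h/Z to recover depth Z."""
    if v <= v0:
        raise ValueError("row at or above the horizon is not on the ground")
    return f_px * h / (v - v0)

def predict_row_after_motion(v, v0, f_px, h, dz):
    """Predict where a ground pixel at row v reappears after moving dz forward.

    Pixels that really lie on the road plane follow this prediction; pixels
    on a vertical obstacle do not, so the difference between the warped
    previous frame and the current frame highlights obstacles.
    """
    z = ground_depth_from_row(v, v0, f_px, h)
    z_new = z - dz                # ego-vehicle moved dz meters closer
    if z_new <= 0:
        raise ValueError("point passed behind the camera")
    return v0 + f_px * h / z_new

# Example: f = 700 px, camera height h = 1.2 m, horizon at row 240.
# A ground pixel at row 270 lies at Z = 28 m; after dz = 2 m it should
# reappear near row 272.3.
v_pred = predict_row_after_motion(v=270.0, v0=240.0, f_px=700.0, h=1.2, dz=2.0)
print(round(v_pred, 1))  # 272.3
```

A pixel belonging to a vertical obstacle imaged at row 270 sits above the road, so its actual row after the motion deviates from this prediction, leaving a residual in the difference image that the V-disparity-like accumulation can then pick up.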