(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 10, No. 12, 2019

Vulnerable Road User Detection using YOLO v3

Saranya.K.C (1)
School of Electronics Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu

Arunkumar Thangavelu (2)
School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, Tamil Nadu

Abstract—Detection and classification of vulnerable road users (VRUs) is one of the most crucial blocks in the vision-based navigation systems used in Advanced Driver Assistance Systems. This paper evaluates the performance of an object classification algorithm, You Only Look Once (YOLO) v3, in detecting a major subclass of VRUs, namely cyclists and pedestrians, using the Tsinghua-Daimler dataset. The YOLO v3 algorithm used here requires fewer computational resources and hence promises real-time performance when compared with its predecessors. The model has been trained on the training images of this benchmark and tested on its test images. The average IoU over all ground-truth objects is calculated, and the precision-recall graph is plotted for different thresholds.

Keywords—YOLO v3; Tsinghua-Daimler cyclist benchmark; cyclist detection; pedestrian detection; IoU

I. INTRODUCTION

The past decade has witnessed a significant acceleration in the development of automotive technologies that aim to make driving and commuting safe and easy. The deployment of autonomous vehicles and the building of Advanced Driver Assistance Systems (ADAS) for use in hybrid vehicles are major steps towards realizing this. Among the many related fields, systems that improve driving safety, such as pre-collision systems and crash-imminent braking systems, play a crucial role. In particular, extensive research has been undertaken over the past few years to protect vulnerable road users (VRUs), including pedestrians, cyclists, and motorcyclists.
Nearly half of the world's traffic deaths occur among vulnerable road users, and road traffic injuries are the eighth leading cause of death for all age groups, according to statistical data provided by WHO [1]. Among the many VRU categories, cyclists and pedestrians are the weakest and fall prey to most accidents because they lack protective devices. Hence, the development of systems for the detection and identification of VRUs is an essential need of the hour, both to make their commutes safer and to make ADAS practically and widely deployable.

Many approaches based on different sensors are employed in vehicle environment perception systems. Vision-based sensors, especially monocular cameras, are the most preferred for VRU detection, either standalone or in combination with other sensors, because they provide high-resolution views of the scene. Vision-based cyclist and pedestrian detection faces several challenges due to the diversity in shape, posture, and viewpoint, crowded backgrounds, etc., and several algorithms and methodologies have been implemented that take these considerations into account.

II. BACKGROUND

Algorithms used for feature extraction and classification are predominantly either hand-crafted or Deep Learning based. Common hand-crafted methods for pedestrian detection include the Haar-like feature detector, which uses variations in intensity to detect the object [2], [3]; the Viola and Jones (VJ) detector designed by Viola et al. [4], which uses a detection approach based on cascaded Haar-like features and also considers rapid changes in pixel intensity; and the Histogram of Oriented Gradients (HOG) detector suggested by Dalal and Triggs, which uses a linear Support Vector Machine for classification [5]-[8] and finds an object's characteristics from the intensities of the local gradients [2], [6].
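To illustrate the intensity-difference idea behind Haar-like features, the following is a minimal sketch in Python. It is not the implementation used in the cited works: it assumes a grayscale image stored as a list of rows, and the function names (integral_image, rect_sum, haar_two_rect_horizontal) are hypothetical names chosen for this example. The feature is the difference between the pixel sums of two adjacent rectangles, and the integral image (summed-area table) lets each rectangle sum be read with four lookups.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of all pixels img[r][c] with r < y and c < x.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def haar_two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle feature: left-half sum minus right-half sum (w must be even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Toy 4x4 patch (illustrative data): dark left half, bright right half.
patch = [[0, 0, 9, 9] for _ in range(4)]
ii = integral_image(patch)
edge_response = haar_two_rect_horizontal(ii, 0, 0, 4, 4)

# A uniform patch has no left/right intensity difference.
flat = [[5, 5, 5, 5] for _ in range(4)]
flat_response = haar_two_rect_horizontal(integral_image(flat), 0, 0, 4, 4)
```

In this toy case the feature responds strongly to the vertical edge in the first patch and gives zero for the uniform patch. A cascade detector such as VJ evaluates many such features at different positions and scales and thresholds them in stages.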
However, hand-crafted methods, which rely on manually designed low-level features to find the ROIs [9], are not very efficient, because complex features are arduous to hand-craft. In contrast, Deep Learning (DL) based techniques are highly autonomous, since they allow the network itself to determine the features.

Since the advent of DL, several approaches have been designed for pedestrian or cyclist detection. In the method described by Wei Tian [10], cyclists at different views and angles are located using cascade detectors. Together with trajectory planning, this model employs a geometry-based ROI extraction, but it achieves only 11 fps when employed in real time. Ren [11] realized an accuracy of 76.47% at an IoU threshold of 0.7 using a Recurrent Rolling Convolution (RRC) architecture applied to multiscale feature maps. Saleh [12] uses a Faster RCNN based network on synthetic image datasets and outperforms the HOG-SVM classifier by 21% in average precision. Felzenszwalb [25] designed the Deformable Part Model (DPM) on the basis of the HOG detector to counter the distortions caused by non-rigid objects. To ensure swift and accurate detection, Yang [13] used a convolutional object detector with scale-dependent pooling and cascaded rejection classifiers (CRCs). The scale-dependent pooling improves the identification of tiny objects, and the CRCs enhance detection speed by rapidly removing false detections.

While all the previously cited works concentrate on the detection of either pedestrians or cyclists, very little literature is available on the simultaneous detection of pedestrians and cyclists [19]. In [5], X. Li proposes a unified framework for both cyclist and pedestrian detection using UB-MPR based detection combined with Fast RCNN, and Fu [26] proposes a system based on the symmetry of objects to recognize the features of cyclists and pedestrians that appear in an image.
However, this method still does not reach the real time speed requirements due to the complex isolated stages that