International Journal of Science and Research (IJSR)
ISSN (Online): 2319-7064
Index Copernicus Value (2015): 78.96 | Impact Factor (2015): 6.391
Volume 6 Issue 4, April 2017
www.ijsr.net
Licensed Under Creative Commons Attribution CC BY

Traffic Surveillance Using Image Recognition on Distributed Platform

Tapan Kumar Hazra 1, Shaif Choudhury 2, Soummyo Priyo Chattopadhyay 3

1 Institute of Engineering & Management, Department of Information Technology, Affiliated to Maulana Abul Kalam Azad University of Technology, West Bengal (Formerly Known as West Bengal University of Technology), Y-12, Salt Lake Electronics Complex, Sector – V, Kolkata-700091, India

2, 3 Institute of Engineering & Management, Kolkata, Department of Information Technology, Kolkata, India

Abstract: With the spread of low-priced, lightweight video cameras, the amount of video data generated has increased tremendously. Surveillance cameras are now installed everywhere, from enterprises to public places such as traffic signals. These recordings are generally stored as video files spanning several hours. There is therefore a need to process such huge amounts of data, mostly to detect and recognize certain objects within a scene. The major challenge with video data is that it is unstructured and high in volume. Distributed computing is one technique that can be used to analyze such large amounts of data efficiently. In this paper we propose a Hadoop MapReduce based model for video surveillance: an architecture that creates image frames from videos and then applies an image recognition algorithm using MapReduce. As a use case, we discuss the application of the proposed surveillance system to video data captured from real-time traffic scenes and CCTV footage, where there is a demand for sophisticated software systems.

Keywords: Distributed Computing, Hadoop, MapReduce, HIPI, Video Surveillance, Traffic Surveillance, FFmpeg.

1.
Introduction

The amount of data in the world is exploding. In fact, 90% of the world's data was created in the last two years, and 80% of it is unstructured. Currently there are a handful of websites, such as YouTube and Vimeo, where thousands of people upload videos. Beyond these, surveillance cameras can now be found on every corner. A typical enterprise has cameras working 24x7 and generating gigabytes of data, so the need for a robust method to store and analyze video data is increasing.

For example, police and security staff depend on surveillance cameras, mainly in areas such as traffic junctions and metro stations. Thousands of cars pass through traffic each day. One function of a surveillance system would be to detect a particular car passing through traffic using a database of different car models. Existing systems mostly deal with counting the number of cars in road traffic [4] or detecting particular car models from CCTV footage [11]. Proposed models for distributed image processing are generally based on face detection on massive datasets [2] [3].

Here we propose a system that uses Hadoop HIPI along with MapReduce to produce the desired output. The first challenge for this system is to store large video files and then run image processing algorithms to detect objects. The second challenge is to create an efficient application that can take advantage of parallel computing. We therefore propose a surveillance system that is scalable as well as efficient.

Our idea is simple. Given a large set of video files, we would like to match an input image file against the video data and track the time of each match. The Hadoop file system can be used for storage. Hadoop is an open source software framework that provides tools for large-scale data analytics. Hadoop can store binary files, so anything that can be converted to binary files can be stored in HDFS.
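The idea of matching an input image against video data and tracking the time can be illustrated with a minimal sketch. It is written in plain Python and only mimics the map and reduce phases; it does not use the Hadoop or HIPI APIs. The names FPS, THRESHOLD, map_frame, reduce_matches and find_matches are hypothetical, the mean absolute pixel difference is a simple stand-in for an actual image recognition algorithm, and the sketch assumes that frames are extracted at a fixed rate so that a frame index can be converted to a time offset.

```python
from collections import defaultdict

FPS = 1.0         # assumed sampling rate: one extracted frame per second
THRESHOLD = 10.0  # assumed maximum mean absolute pixel difference for a match


def mean_abs_diff(a, b):
    """Mean absolute difference between two equally sized grayscale images."""
    total = sum(abs(x - y) for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))
    return total / (len(a) * len(a[0]))


def map_frame(record, query):
    """Map step: emit (video_id, time_in_seconds) when a frame matches the query."""
    video_id, frame_index, pixels = record
    if mean_abs_diff(pixels, query) <= THRESHOLD:
        yield video_id, frame_index / FPS


def reduce_matches(pairs):
    """Reduce step: group the emitted times by video and sort them."""
    grouped = defaultdict(list)
    for video_id, timestamp in pairs:
        grouped[video_id].append(timestamp)
    return {video_id: sorted(times) for video_id, times in grouped.items()}


def find_matches(records, query):
    """Run the map step over every frame record, then reduce the emitted pairs."""
    emitted = [pair for record in records for pair in map_frame(record, query)]
    return reduce_matches(emitted)


# Toy 2x2 grayscale frames standing in for frames extracted from two videos.
QUERY = [[200, 200], [50, 50]]
RECORDS = [
    ("cam1.mp4", 0, [[0, 0], [0, 0]]),
    ("cam1.mp4", 1, [[198, 203], [52, 49]]),
    ("cam2.mp4", 0, [[10, 10], [10, 10]]),
    ("cam2.mp4", 3, [[200, 200], [50, 50]]),
    ("cam1.mp4", 4, [[205, 195], [45, 55]]),
]

if __name__ == "__main__":
    print(find_matches(RECORDS, QUERY))
```

Because each frame record is handled independently in the map step, the records could in principle be split across many workers, which is the property a MapReduce deployment would rely on.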
Moreover, many easy-to-use commercial Hadoop platforms are available on the market.

The best way to process a large amount of video files is to create image frames and store them across multiple clusters. Different video processing tools can be used for this purpose, such as VirtualDub, ImageGrab, and FFmpeg. HIPI can be used to run image processing algorithms. HIPI is an image processing library designed to be used with the Apache Hadoop MapReduce parallel programming framework. It facilitates efficient and high-throughput image processing with MapReduce-style parallel programs typically executed on a cluster.

The rest of this paper is organized as follows. Section 2 presents related work. In Section 3 we discuss the different components of the proposed system. In Section 4 we discuss the architecture of the proposed system and the problems that arise when processing video databases using MapReduce on Hadoop in a distributed environment. Finally, in Section 5 we propose future work and conclude the paper.

2. Related Work

Previous research on this topic generally deals with image processing on distributed frameworks such as Hadoop [1]. Other surveillance systems proposed in the past generally deal with face detection using OpenCV [3] or road traffic analysis using a blob tracking method [4]. Recently, with the shift towards Big Data, many techniques have been proposed for this problem.

Paper ID: ART20172256 DOI: 10.21275/ART20172256