DOI: https://doi.org/10.51846/vol5iss4pp36-43
Pakistan Journal of Engineering and Technology, PakJET
ISSN (p): 2664-2042, ISSN (e): 2664-2050
Volume: 5, Number: 4, Pages: 36-43, Year: 2022

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Smart Surveillance and Detection Framework Using YOLOv3 Algorithm

Hassan Zaki 1, Muhammad Kashif Shaikh 1, Muhammad Tahir 1, Muhammad Naseem 1, Muzammil Ahmed Khan 2

1 Department of Software Engineering, Sir Syed University of Engineering & Technology, Karachi, Pakistan.
2 Department of Computer Engineering, Sir Syed University of Engineering & Technology, Karachi, Pakistan.

Corresponding author: Muhammad Kashif Shaikh (e-mail: mkshaikh@gmail.com).

Received: 13-07-2022 Revised: 21-11-2022 Accepted: 13-12-2022

Abstract- The analysis of video frames captured by surveillance cameras is attracting growing interest, and much research remains to be done. The objective of this research is to develop methods that can localize, detect, and recognize activities without delay, using the frames of a video obtained from a surveillance camera. The main focus of this article is the recognition of such activities in real time from surveillance camera video. Activity is followed over a given time span, and a single activity label is reported for that span. YOLO offers an effective strategy with a fast response for localization within the customized dataset. New results are reported and analyzed to give insight into the proposed work. The method supports activity-based analysis and demonstrates its suitability for practical application.
The results show good agreement with previous work and a faster response time. The proposed work is applicable in many settings, including shopping malls, automatic teller machines, corporate offices, and residential societies. The work is also useful in detecting ideal human actions.

Index Terms-- Video Analytics, Human Action Recognition, action label, deep learning, Custom Dataset, You Only Look Once (YOLO), Convolutional Neural Network.

I. INTRODUCTION

Despite recent advances in deep learning, very few deep learning-based methods have been proposed to address the problem of violence detection in videos. Convolutional Neural Networks (CNNs) are widely adopted by researchers around the world for image classification problems. Owing to the great success of CNNs in analyzing an image and its content, researchers have begun using CNNs for video analysis to a greater extent. Unlike hand-crafted feature-based methods, deep learning techniques are not application-specific, since a deep neural network model can readily be applied to a different task without significant changes to its architecture. As a result, many techniques with improved performance have been developed for problems such as object detection, tracking, recognition, action recognition, and caption generation. In this paper, we propose an approach that can automatically monitor surveillance videos and recognize violent human behaviour, which will be of significant assistance to law-and-order institutions. In recent years, rates of violent crime have risen sharply; such crimes include terror attacks involving one or more people armed with guns or knives, as well as fights and kidnappings.
This has resulted in the widespread use of surveillance cameras, which help the authorities identify violent attacks and take the necessary steps to minimize their harmful effects. In our approach, the model can detect and localize actions from a small number of video frames (often even from isolated frames). The proposed model assigns an action label and a confidence score to each frame, and a global action label for the video sequence is obtained by finding the most frequent action label [1]. Frames sampled periodically from the video sequence are processed, rather than the entire video. Moreover, in most cases a single frame or a few frames are sufficient to recognize the action of the people shown in the video [2]. If the confidence score drops quickly, more frames from the video are included for recognition. In the present work, we use the minimum number of frames, thereby reducing the computational time complexity [1], as shown in Fig. 1.
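
The frame-selection and labelling scheme described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the `stride` and `max_drop` parameters, the choice of the midpoint frame as the extra frame, and the `predict` callable (a stand-in for a per-frame detector such as YOLOv3 that returns an action label and a confidence score) are all assumptions introduced for illustration.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple

Prediction = Tuple[str, float]  # (action label, confidence score)


def sample_indices(n_frames: int, stride: int) -> List[int]:
    """Pick every `stride`-th frame instead of processing the whole video."""
    return list(range(0, n_frames, stride))


def video_label(predictions: List[Prediction]) -> str:
    """Global action label = most frequent per-frame label."""
    counts = Counter(label for label, _ in predictions)
    return counts.most_common(1)[0][0]


def classify_video(
    predict: Callable[[int], Prediction],  # hypothetical per-frame detector
    n_frames: int,
    stride: int = 10,       # assumed sampling period
    max_drop: float = 0.2,  # assumed threshold for a "quick" confidence drop
) -> Tuple[str, List[int]]:
    """Label a video from sparse frames, adding a frame where confidence falls."""
    used: List[int] = []
    predictions: List[Prediction] = []
    prev_conf: Optional[float] = None
    for idx in sample_indices(n_frames, stride):
        label, conf = predict(idx)
        if stride > 1 and prev_conf is not None and prev_conf - conf > max_drop:
            # Confidence fell sharply since the last sampled frame:
            # also process the frame halfway between the two samples.
            mid = idx - stride // 2
            used.append(mid)
            predictions.append(predict(mid))
        used.append(idx)
        predictions.append((label, conf))
        prev_conf = conf
    return video_label(predictions), used
```

With a real detector, `predict` would decode the requested frame and run inference on it. Because only the sampled frames (plus the occasional extra frame) are processed, the cost grows with the number of selected frames rather than with the length of the video.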