A computer vision-based object detection and counting for COVID-19 protocol compliance: a case study of Jakarta

Muhammad Lanang Afkaar Ar 1, Sulthan Muzakki Adytia S. 1, Yudhistira Nugraha 2,3,*, Farizah Rizka R. 2, Andy Ernesto 2, Juan Intan Kanggrawan 2, Alex L. Suherman 4

1 Computer Engineering, School of Electrical Engineering, Telkom University, Bandung 40257, Indonesia
2 Jakarta Smart City, Department of Communication, Informatics, and Statistics, Jakarta 10110, Indonesia
3 School of Computing, Telkom University, Bandung 40257, Indonesia
4 Directorate of Research and Community Service, Telkom University, Bandung 40257, Indonesia
*Corresponding author: yudhistira.nugraha@jakarta.go.id, yudhistiranugraha@telkomuniversity.ac.id

Abstract—The chaotic world situation caused by the SARS-CoV-2 virus (COVID-19 pandemic) has hampered many sectors of human activity, especially activities that require physical interaction. This has made social restrictions necessary in the affected sectors. This paper reports an analysis of a proposed system for monitoring and supporting public activities under social restrictions, specifically in the DKI Jakarta province. The proposed system uses YOLO and MobileNet SSD as its main detection weights, at 30% and 40% confidence, respectively. The object counting and physical distancing results are expected to serve as a guideline for handling public complaints in the future, using several CCTV locations with better image quality and better angles.

Keywords—COVID-19, smart mobility, computer vision, social distancing, large-scale social restriction, YOLO, MobileNet SSD, Artificial Intelligence, Jakarta

I. INTRODUCTION

The human eye is one of God's gifts, allowing humans to see and monitor what exists on earth. With this organ, we can easily recognize an object without touching it or being given a specific description.
Therefore, one of its functions is to monitor conditions. However, because the organ is part of the human body, it needs rest, unlike machines, which do not tire. Thus, in this era of informatics, human monitoring tasks can be supported by a camera and a computer.

Currently, vehicle detection and counting systems play an essential role in public monitoring activities. Almost every area in Jakarta is covered by installed Closed-Circuit Television (CCTV) cameras. The footage obtained from CCTV serves many functions: it helps identify congestion points on each road, gives everyone a sense of security when travelling by day or night, supports the monitoring of strike activities, and reveals problems that exist in Jakarta [9-15].

The problem facing Jakarta and other cities around the world is minimizing or stopping the spread of the COVID-19 virus. The Jakarta provincial government has made several efforts on this issue. One such effort was issuing Governor Regulation Number 51 of 2020, concerning the implementation of large-scale social restrictions during the transition period [1]. Under this regulation, Jakarta moved from the large-scale social restrictions phase to the new normal phase, in which COVID-19 still lingers in society and reported cases continue to increase. Data from CCTV were analyzed to assess people's compliance with the large-scale social restriction and mask-wearing policies.

© IEEE 2021. This article is free to access and download, along with rights for full text and data mining, re-use and analysis.

This paper aims to analyze public monitoring activities using existing CCTVs integrated with artificial intelligence, in order to optimize the surveillance and monitoring system for COVID-19 protocol compliance. The CCTV systems can estimate the density of people and vehicles in certain areas (e.g. public areas).
Further, this analysis can detect whether people are wearing masks and check whether social distancing is maintained between them. This approach can help the government implement large-scale social restrictions (PSBB) effectively.

The remainder of this paper is structured as follows. Section II describes the condition of the sensor and CCTV data. In Section III, we describe the methods used to monitor public mobility. Section IV presents the findings and the innovation of this paper. Finally, Section V presents the conclusion and future development.

II. DATA CONDITION

The sensor and CCTV data used for automated object counting and for monitoring physical distancing and mask wearing were collected with the support of PT. Bali Tower. The analysis was conducted on recorded footage from PT. Bali Tower CCTV that is integrated into the website smartcity.jakarta.go.id. We use a CCTV camera located in the Pintu Gelora area near FX, covering a period of 3 hours from 7:00 a.m. to 10:00 a.m. during the car-free day (CFD).

Figure 1. An image taken from CCTV for data analysis

As Figure 1 shows, the CCTV camera has a low viewing angle, which makes the video and images rather tricky to process for detection. The footage was recorded in the morning, so the lighting conditions at that time also affect the video. This video was selected for further analysis because it contains sufficient information to represent conditions under large-scale social restrictions.

978-1-6654-0422-8/20/$31.00 ©2020 IEEE
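As a rough illustration of how the 30% and 40% confidence figures stated in the abstract could be applied to object counting and a physical distancing check, the following minimal Python sketch filters detections by a per-model confidence threshold, counts the remaining objects per class, and flags pairs of people whose bounding-box centers are close together. It is not the authors' implementation: the detection dictionaries, the function names, the sample values, and the pixel-based distance limit `min_pixels` are all assumptions made for this example, and a real deployment with a low camera angle would likely need a calibrated distance measure rather than raw pixels.

```python
from collections import Counter
from itertools import combinations
from math import hypot

# Assumed mapping of the abstract's 30% / 40% confidence figures to thresholds.
CONF_THRESHOLD = {"yolo": 0.30, "mobilenet_ssd": 0.40}


def filter_detections(detections, model):
    """Keep only detections whose score meets the model's threshold."""
    threshold = CONF_THRESHOLD[model]
    return [d for d in detections if d["score"] >= threshold]


def count_by_label(detections):
    """Count detected objects per class label."""
    return Counter(d["label"] for d in detections)


def box_center(box):
    """Center point of a box given as (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def close_pairs(detections, min_pixels):
    """Index pairs of people whose box centers are closer than min_pixels."""
    people = [
        (i, box_center(d["box"]))
        for i, d in enumerate(detections)
        if d["label"] == "person"
    ]
    pairs = []
    for (i, a), (j, b) in combinations(people, 2):
        if hypot(a[0] - b[0], a[1] - b[1]) < min_pixels:
            pairs.append((i, j))
    return pairs


# Made-up detections standing in for one CCTV frame (hypothetical values).
raw = [
    {"label": "person", "score": 0.82, "box": (100, 200, 140, 300)},
    {"label": "person", "score": 0.35, "box": (150, 205, 190, 305)},
    {"label": "person", "score": 0.25, "box": (400, 210, 440, 310)},
    {"label": "car", "score": 0.55, "box": (500, 100, 700, 220)},
]

kept = filter_detections(raw, "yolo")
counts = count_by_label(kept)
violations = close_pairs(kept, min_pixels=75)
print(counts, violations)
```

With these sample values, the lower YOLO threshold keeps the person scored 0.35 that the MobileNet SSD threshold would discard, which shows how the choice of threshold directly changes both the object count and the number of flagged pairs.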