A computer vision-based object detection and counting for COVID-19
protocol compliance: a case study of Jakarta
Muhammad Lanang Afkaar Ar¹, Sulthan Muzakki Adytia S.¹, Yudhistira Nugraha²,³,*, Farizah Rizka R.², Andy Ernesto², Juan Intan Kanggrawan², Alex L. Suherman⁴
¹ Computer Engineering, School of Electrical Engineering, Telkom University, Bandung 40257, Indonesia
² Jakarta Smart City, Department of Communication, Informatics, and Statistics, Jakarta 10110, Indonesia
³ School of Computing, Telkom University, Bandung 40257, Indonesia
⁴ Directorate of Research and Community Service, Telkom University, Bandung 40257, Indonesia
*Corresponding author: yudhistira.nugraha@jakarta.go.id, yudhistiranugraha@telkomuniversity.ac.id
Abstract—The disruption caused by the SARS-CoV-2 virus (the
COVID-19 pandemic) has hampered many sectors of human activity,
especially those that require physical interaction, and has made
social restrictions necessary in the affected sectors. This paper
reports an analysis of a proposed system for monitoring public
activities in support of social restrictions, specifically in DKI
Jakarta province. The proposed detection system uses YOLO and
MobileNet SSD as its main weights, with 30% and 40% confidence,
respectively. The results of object counting and physical
distancing detection are expected to serve as a guideline for
handling public complaints in the future, using several CCTV
location points with better image quality and better camera angles.
Keywords—COVID-19, smart mobility, computer vision, social
distancing, large-scale social restriction, YOLO, MobileNet SSD,
Artificial Intelligence, Jakarta
I. INTRODUCTION
The human eye is one of the gifts of God that allows humans to
see and monitor what exists on earth. With this organ, we can
easily recognize an object without touching it and without a
specific description. One of its functions is therefore to
monitor conditions. However, because the organ is part of the
human body, it needs rest; unlike a machine, it becomes
exhausted. Thus, in this era of informatics, monitoring tasks
can be handled with the help of a camera and a computer.
Currently, vehicle detection and counting systems play an
essential role in public monitoring activities. Almost all spots
in Jakarta are covered by installed Closed-Circuit Television
(CCTV) cameras. The footage obtained from CCTV serves many
functions: it helps identify congestion points on each road,
gives people a sense of security when travelling at night and
day, supports monitoring of strike activities, and reveals
problems that exist in Jakarta [9-15].
The problem in Jakarta and other cities around the world is to
minimize or stop the spread of the COVID-19 virus. The Jakarta
provincial government has made several efforts on this issue.
One such effort was to issue Governor Regulation Number 51 of
2020, concerning the implementation of large-scale social
restrictions during the transition period [1]. Under this
regulation, there was a transition from the large-scale social
restrictions phase to the new normal phase, in which COVID-19
still lingers in society and the number of reported cases is
increasing. Data from CCTV were analyzed to assess people's
compliance with the large-scale social restriction policy and
the mask-wearing policy.
© IEEE 2021. This article is free to access and download, along with rights
for full text and data mining, re-use and analysis.
This paper aims to analyze public monitoring activities by
using existing CCTVs with artificial intelligence integration to
optimize the surveillance and monitoring system for COVID-19
protocol compliance. The CCTV systems can estimate people
and vehicle density in certain areas (e.g. public areas). Further,
this analysis can detect whether people are wearing masks and
check whether social distancing is being maintained between
them. This approach can help the government implement
large-scale social restrictions (PSBB) effectively.
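The counting and distancing checks described above can be illustrated with a minimal Python sketch. It is not the implementation used in this paper. It assumes that a detector has already produced labelled bounding boxes with confidence scores, and that the 30% and 40% figures quoted in the abstract act as minimum-confidence cut-offs for YOLO and MobileNet SSD. The names `Detection`, `keep_confident`, `count_by_label`, `close_pairs`, the sample frame, and the 80-pixel distance limit are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from itertools import combinations
from math import hypot

# Assumption: the 30% / 40% confidence figures from the abstract are
# treated here as minimum-score cut-offs per detector.
CONF_THRESHOLD = {"yolo": 0.30, "mobilenet_ssd": 0.40}


@dataclass(frozen=True)
class Detection:
    """One detector output (hypothetical structure)."""
    label: str          # class name, e.g. "person" or "car"
    confidence: float   # detector score in [0, 1]
    box: tuple          # (x_min, y_min, x_max, y_max) in pixels


def keep_confident(detections, detector):
    """Drop detections scoring below the cut-off for the given detector."""
    threshold = CONF_THRESHOLD[detector]
    return [d for d in detections if d.confidence >= threshold]


def count_by_label(detections):
    """Count detected objects per class (people, vehicles, ...)."""
    return Counter(d.label for d in detections)


def centre(box):
    """Return the centre point of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2, (y_min + y_max) / 2)


def close_pairs(detections, min_pixels):
    """Return pairs of people whose box centres are closer than min_pixels.

    Assumption: distance is measured in image pixels. Converting pixels to
    metres needs camera calibration, which this sketch does not attempt.
    """
    people = [d for d in detections if d.label == "person"]
    pairs = []
    for a, b in combinations(people, 2):
        (ax, ay), (bx, by) = centre(a.box), centre(b.box)
        if hypot(ax - bx, ay - by) < min_pixels:
            pairs.append((a, b))
    return pairs


# Made-up detections for one frame, for illustration only.
SAMPLE_FRAME = [
    Detection("person", 0.82, (100, 200, 140, 300)),
    Detection("person", 0.45, (150, 210, 190, 310)),
    Detection("person", 0.35, (600, 220, 640, 320)),
    Detection("car", 0.25, (300, 100, 500, 220)),
]

confident = keep_confident(SAMPLE_FRAME, "yolo")
counts = count_by_label(confident)
violations = close_pairs(confident, min_pixels=80)
```

With this made-up frame, the YOLO cut-off keeps the three people and drops the low-scoring car, and one pair of people falls inside the 80-pixel limit. Under the MobileNet SSD cut-off, only two people are kept. A pixel-based limit is only a rough proxy for physical distance, particularly with a low camera angle.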
The remainder of this paper is structured as follows.
Section II describes the condition of the sensor and CCTV
data. Section III describes the methods used to monitor public
mobility. Section IV presents the findings and innovation of
this paper. Finally, Section V presents the conclusion and
future development.
II. DATA CONDITION
The sensor and CCTV data used for automated object counting and
for monitoring physical distancing and mask wearing were
collected with the support of PT. Bali Tower. The analysis was
conducted on recorded footage from PT. Bali Tower CCTV cameras
that are integrated into the website smartcity.jakarta.go.id.
• We used a CCTV camera located in the Pintu Gelora area near
FX, covering a period of 3 hours from 07.00 a.m. to 10.00 a.m.
during the car-free day (CFD).
Figure 1. An image taken from CCTV for data analysis
• As Figure 1 shows, the CCTV camera has a low viewing angle,
which makes the video or captured images rather difficult to
process for detection.
• The footage was recorded in the morning, which also affects
the lighting in the CCTV video.
• This video was selected for further analysis because it
contains sufficient information to represent the large-scale
social restrictions.
/20/$31.00 ©2020 IEEE 978-1-6654-0422-8