International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 06 Issue: 06 | June 2019 www.irjet.net p-ISSN: 2395-0072
© 2019, IRJET | Impact Factor value: 7.211 | ISO 9001:2008 Certified Journal | Page 2092
Estimation of Crowd Count in Heavily Occluded Regions
Swathi D G¹, Jalaja G²
¹Student, Department of Computer Science and Engineering, BNMIT, Karnataka, India
²Associate Professor, Department of Computer Science and Engineering, BNMIT, Karnataka, India
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - Crowd estimation is the challenging task of
accurately estimating the number of people in a crowded region.
This paper addresses the crowd counting problem from the
perspective of two models, i.e., a body-parts map and a
structural-density map. The two models are created by combining
information about pedestrians, their heads and the context
structure. Deep convolutional neural networks (CNN) and a motion
detection method are used to count the number of people in the
crowded region, based on the pixel movement across video frames.
The CNN technique improves the efficiency of counting people in
videos, and high accuracy is achieved.
Key Words: Crowd Counting, Deep Convolutional Neural
Networks, Motion Detection, Pedestrian Detection,
Crowd Estimation.
1. INTRODUCTION
Crowd estimation is the task of efficiently estimating the number
of pedestrians in a dense region. Crowd counting has
attracted much interest from researchers because of practical
requirements such as managing large numbers of pedestrians
and ensuring public security. Human detection is a basic problem
in video surveillance systems. The world population is estimated
to reach 11.2 billion by 2100, about one and a half times
the population in 2016 (7.4 billion). Because the population is
growing rapidly across the world, crowd analysis
and crowd monitoring have become important fields of
research. By counting manually, a user cannot obtain an
accurate count of the pedestrians present in a densely
crowded area. To overcome this, a system
is developed to provide the crowd count. The crowd count in any
dense scene is estimated from three key factors,
pedestrians, heads and context structure, which are organized
into two scene models. The first model is the body-parts map,
which is obtained by finding the body parts of each individual
person in the dense scene and merging the segmentation masks. The
second model is the structural-density map, which is created
based on the shape of individual persons obtained from the
body-parts map. The results of the two models are then combined
to provide the crowd count of the dense scene. Several
applications of crowd counting are listed
below: -
Safety monitoring: - Video surveillance cameras used in
public places for the safety and security of people
may break down due to limitations in the algorithm
design of the system. In such scenarios, a crowd counting
system can be used for event detection, congestion control
and behavioral analysis.
Intelligence gathering and analysis: - In malls and
airports, counters can be set up according to the number of
people entering or the length of the queue, so that no
human resource is wasted.
Designing a public place: - Crowd counting systems can
be used to design public spaces such as malls, stadiums, rail
tracks, etc.
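As an illustration of how a density map can yield a crowd count, the sketch below follows the common density-map convention, which is an assumption here and not necessarily the exact construction of the structural-density map used in this paper: each annotated head contributes a Gaussian of unit mass, so the sum over the map approximates the number of people. The function names, kernel size and sigma are hypothetical choices made for the example.

```python
import numpy as np


def gaussian_kernel(size: int, sigma: float) -> np.ndarray:
    """Return a size x size Gaussian kernel normalised to sum to 1."""
    ax = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-(ax ** 2) / (2.0 * sigma ** 2))
    kernel = np.outer(g, g)
    return kernel / kernel.sum()


def density_map(shape, heads, size=15, sigma=4.0):
    """Stamp one unit-mass kernel per head position (row, col).

    `size` must be odd. Heads closer than size // 2 pixels to the
    image border lose part of their mass when the padding is cropped.
    """
    h, w = shape
    r = size // 2
    padded = np.zeros((h + 2 * r, w + 2 * r))
    kernel = gaussian_kernel(size, sigma)
    for row, col in heads:
        # In padded coordinates the kernel is centred on (row + r, col + r).
        padded[row:row + size, col:col + size] += kernel
    return padded[r:r + h, r:r + w]


def estimate_count(density: np.ndarray) -> float:
    """The crowd count is the integral (sum) of the density map."""
    return float(density.sum())


# Hypothetical head annotations for a 64 x 96 frame.
example_heads = [(20, 30), (40, 50), (25, 70)]
example_map = density_map((64, 96), example_heads)
print(estimate_count(example_map))
```

Because every kernel sums to one, overlapping heads in a dense region still add up correctly, which is one reason density maps are often preferred over counting separate detections in occluded scenes.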
2. RELATED WORK
Cross-scene crowd estimation is a difficult task in which no
laborious data annotation is required to estimate the people
count of a dense crowd scene. A deep convolutional neural
network (CNN) is pre-trained to provide the crowd
count of a dense scene based on crowd density. A new dataset
including 108 crowd images with 200000 head annotations
was introduced to better evaluate the accuracy of cross-scene
crowd estimation methods. To evaluate the efficiency and
reliability of the method, experiments were conducted on the
existing UCSD, UCF_CC_50 and WorldExpo’10
datasets. The cross-scene system fails to provide an accurate
count of dense crowd scenes [1]. Pedestrian analysis is
challenging due to pose variation, occlusion,
appearance and background clutter. A deep decompositional
network (DDN) classifier was used for parsing crowded
images into different human parts such as face, hair, hands,
legs and body. The DDN jointly
estimates the occluded regions and body parts of a person by
stacking three types of hidden layers: occlusion estimation
layers, completion layers and decomposition layers. Pedestrian
parsing by DDN provides better accuracy than
state-of-the-art methods on crowded images with or without
occlusion. The experiment was conducted on the large
benchmark PPSS dataset to evaluate the efficiency and
reliability of the pedestrian parsing method by DDN. The DDN
system fails to work efficiently in heavily crowded scenes [2].
Global regression methods map low-level
features (texture, edge information and segmentation masks)
of humans to the crowd count of a dense scene. The
system is evaluated on the UCSD dataset. It ignores
the spatial information and body structure information of
pedestrians, and thus fails to provide an accurate crowd count
for crowded scenes [3]. The head is the most visible part in
any crowded scene. Head detection is based on an advanced
method of boosted essential features. To reduce the search
region, a novel point estimator based on gradient adjustment
features to identify region similar to the head region from