International Journal of Innovative Technology and Exploring Engineering (IJITEE)
ISSN: 2278-3075, Volume-9 Issue-3, January 2020
Published By:
Blue Eyes Intelligence Engineering
& Sciences Publication
Retrieval Number: L29241081219/2020©BEIESP
DOI: 10.35940/ijitee.L2924.019320
Abstract: Understanding crime from multiple perspectives by
exploiting crime data helps in the early detection and
prevention of crimes. Governments and policing systems are
aware of the importance of understanding the changing aspects
of crime. As technology advances, there are many ways to
comprehend the trends and patterns of criminal activity. This
paper presents a hybrid model using the AGNES and K-means
clustering algorithms to examine different views and
representations of crime in India, and aims to identify the
crime types that cluster the regions of India. The accuracy of
the model is measured using log loss, and the patterns and
trends in crime are presented.
Keywords: Unsupervised learning, Clustering techniques,
Machine learning, Crime data analytics
I. INTRODUCTION
India is a densely populated developing country. A steady
rise in urbanization, high population and poverty, lack of
jobs, and lack of education for all are factors that lead to
an increase in the crime rate every year. The increase in the
crime rate affects the economic growth and reputation of a
country, and is also a major threat to citizens carrying out
their day-to-day activities.
In this paper, we focus on visualizing crime data from
different perspectives to help policing systems understand
crime patterns effectively and systematically. Many
statistical analysis techniques and machine learning
algorithms are at our disposal. Based on the dataset, the
objectives and the end users of this system, the techniques
used are probability density functions, cumulative
distribution functions, heat maps, word clouds and clustering
algorithms.
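As an illustration of one of the techniques listed above, the following minimal sketch builds an empirical cumulative distribution function over crime counts. The function name `empirical_cdf` and the sample counts are hypothetical and are not taken from the data set used in this paper.

```python
from bisect import bisect_right


def empirical_cdf(values):
    """Return a function F where F(x) is the fraction of values <= x."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")

    def cdf(x):
        # bisect_right counts how many sorted observations are <= x
        return bisect_right(ordered, x) / n

    return cdf


# Hypothetical district-level crime counts (illustrative values only)
district_counts = [12, 40, 7, 95, 40, 23, 61, 18]
F = empirical_cdf(district_counts)

# Fraction of districts with at most 40 recorded crimes
share_at_most_40 = F(40)
print(share_at_most_40)
```

Plotting F against the sorted counts gives the familiar step curve, from which one can read off what share of districts falls below a given crime count.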
This paper answers the following questions:
Is there any change in the crime rate year wise?
Is there any possibility of clustering the locations
based on the number of crimes? If so, does this data
visualisation show any clusters?
What are the other ways of visualising the data?
Revised Manuscript Received on January 5, 2020
* Correspondence Author
J Vimala Devi*, Department of Computer Science and Engineering,
Visveswaraya Technological University, Bangalore, India. Email:
vimalajana@gmail.com
Dr Kavitha K S, Department of Computer Science and Engineering,
Visveswaraya Technological University, Bangalore, India. Email:
drkavitha2015@gmail.com
II. LITERATURE STUDIES
2.1 Crime forecasting: The objective of this paper [1] is to
forecast the yearly crime rate in India using time series
models such as Auto-Regressive Integrated Moving Average
(ARIMA) and Exponential Smoothing. The data source is the
National Crime Records Bureau of India. To build the
predictive model, the data is divided into training and test
sets. The accuracy of the model is examined, and hypothesis
testing shows that the forecast values lie within the 95%
confidence interval of the test data. The paper concludes that
crime forecasting can be done by building a time series model.
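The train/test workflow described above can be sketched with simple exponential smoothing, the most basic member of the exponential smoothing family. This is a minimal illustration and not the model built in [1]: the yearly totals, the smoothing factor `alpha = 0.5` and the helper names are all assumptions made for the example, and ARIMA is not shown.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    level_t = alpha * y_t + (1 - alpha) * level_{t-1}, with level_0 = y_0.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        raise ValueError("series must be non-empty")
    levels = [float(series[0])]
    for y in series[1:]:
        levels.append(alpha * y + (1 - alpha) * levels[-1])
    return levels


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical yearly crime totals (illustrative values only, not NCRB figures)
yearly_totals = [100, 110, 105, 120, 130, 125, 140, 150]

# Chronological split: fit on earlier years, hold out the last two for testing
train, test = yearly_totals[:-2], yearly_totals[-2:]

levels = simple_exponential_smoothing(train, alpha=0.5)

# Simple exponential smoothing yields a flat forecast equal to the last level
forecast = [levels[-1]] * len(test)
mae = mean_absolute_error(test, forecast)
print(forecast, mae)
```

Because the split is chronological, the model never sees the held-out years. The flat forecast also shows why a trending series usually calls for a trend-aware method such as Holt's linear smoothing or ARIMA.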
2.2 Crime analysis using data mining: This paper [2] presents
perspectives on using data mining methods in crime analysis.
The methods discussed support the development and use of
proactive policing strategies for the prevention and
investigation of crimes. The paper lists data mining tools
that improve the efficacy of the data analysis required by
law-enforcement agencies, and outlines the models and
technologies that automate the analytical work of crime
analysts. It also aims to find relationships between the
actors in criminal activity based on their behavioural
profiles, and lists the basic principles required to build a
real-time intelligent system for crime analysis.
2.3 Spatio-temporal crime prediction in rural areas: This
paper [3] proposes a spatio-temporal predictive model that
finds crime-dense regions using the k-means and DBSCAN
clustering algorithms and extracts crime predictors using the
Seasonal Autoregressive Integrated Moving Average (SARIMA)
model. The results and their implications are presented using
appropriate graphs.
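To make the clustering step concrete, here is a minimal sketch of k-means (Lloyd's algorithm) applied to two-dimensional points. It is not the implementation used in [3]: the function `kmeans`, its parameters and the sample coordinates are hypothetical, and DBSCAN and SARIMA are not shown.

```python
import random


def kmeans(points, k, iterations=100, seed=0):
    """Lloyd's algorithm for k-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise with k of the data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: each centroid moves to the mean of its assigned points
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical incident coordinates forming two well-separated groups
points = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 8.0), (8.5, 7.5), (9.0, 8.2)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

k-means needs the number of clusters up front and favours compact, roughly spherical groups, whereas a density-based method such as DBSCAN infers the number of clusters from the data and can mark isolated points as noise.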
2.4 An overview of crime prediction methods: This paper [4]
presents a detailed analysis of the pros and cons of crime
prediction methods. The methods discussed are Support Vector
Machines (SVM), fuzzy methods, Artificial Neural Networks and
multivariate time series algorithms for time-dependent data.
The author also suggests that hybrid methods may work better
for crime prediction than filter methods.
III. DATA UNDERSTANDING AND PREPROCESSING
A. Data set: The data set records the counts of 30 different
types of crime, district-wise, for every state in India. It
spans the years 2001 to 2013. The data set was downloaded from
Kaggle and is already cleaned. Further aggregations are done
based on requirements. The data set contains 9017 rows and 33
columns. The data set contains numerical values that represent
Descriptive Analytics on Crime in India using
Clustering Techniques
J. Vimala Devi, Kavitha K S