International Journal of Innovative Technology and Exploring Engineering (IJITEE)
ISSN: 2278-3075, Volume-9 Issue-3, January 2020, p. 2261
Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Retrieval Number: L29241081219/2020©BEIESP
DOI: 10.35940/ijitee.L2924.019320

Abstract: Understanding crime patterns by analysing crime data supports the early detection and prevention of crime. Governments and policing systems are aware of the importance of understanding the changing aspects of crime. As technology advances, there are many ways to comprehend the trends and patterns of criminal activity. This paper presents a hybrid model using the AGNES and K-means clustering algorithms to examine different views and representations of crime in India, and aims to recognize the crime types that cluster the regions of India. The accuracy of the model is measured using log loss, and the patterns and trends in crime are presented.

Keywords: Unsupervised learning, Clustering techniques, Machine learning, Crime data analytics

I. INTRODUCTION

India is a densely populated, developing country. Steady urbanization, a large population, poverty, a lack of jobs and a lack of education for all are factors that have led to an increase in the crime rate every year. The rising crime rate affects the economic growth and reputation of a country and is a major threat to citizens carrying out their day-to-day activities. In this paper, we focus on visualizing crime data from different perspectives, to help policing systems understand crime patterns effectively and systematically. Many statistical analysis techniques and machine learning algorithms are available. Based on the dataset, the objectives and the end users of this system, the techniques used here are probability density functions, cumulative distribution functions, heat maps, word clouds and clustering algorithms.
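As a minimal sketch of the distribution-based views listed above, assuming crime counts are available as a plain list of integers, the empirical probability and cumulative distributions could be computed as follows. The function names and the toy counts are illustrative assumptions and are not taken from the data set or the code used in this paper.

```python
from bisect import bisect_right
from collections import Counter


def empirical_pmf(values):
    """Return {value: relative frequency} for a list of observations."""
    n = len(values)
    return {v: c / n for v, c in sorted(Counter(values).items())}


def empirical_cdf(values):
    """Return a function F where F(x) is the fraction of observations <= x."""
    ordered = sorted(values)
    n = len(ordered)

    def cdf(x):
        # bisect_right counts how many observations are <= x
        return bisect_right(ordered, x) / n

    return cdf


# Made-up district-level counts for one crime type (illustrative only).
toy_counts = [0, 2, 2, 5, 7, 7, 7, 12, 20, 41]

pmf = empirical_pmf(toy_counts)
cdf = empirical_cdf(toy_counts)
print(pmf[7])  # share of districts reporting exactly 7 cases
print(cdf(7))  # share of districts reporting at most 7 cases
```

Because counts are discrete, the sketch uses relative frequencies rather than a smoothed density; a kernel density estimate would be one alternative when a continuous curve is preferred.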
This paper answers the following questions: Does the crime rate change from year to year? Can locations be clustered based on the number of crimes? If so, does the data visualisation show any clusters? What other ways are there to visualise the data?

Revised Manuscript Received on January 5, 2020
* Correspondence Author
J Vimala Devi*, Department of Computer Science and Engineering, Visveswaraya Technological University, Bangalore, India. Email: vimalajana@gmail.com
Dr Kavitha K S, Department of Computer Science and Engineering, Visveswaraya Technological University, Bangalore, India. Email: drkavitha2015@gmail.com

II. LITERATURE STUDIES

2.1 Crime forecasting: The objective of paper [1] is to forecast the yearly crime rate in India using time series models such as Auto-Regressive Integrated Moving Average (ARIMA) and Exponential Smoothing. The data source is the National Crime Records Bureau of India. While building the predictive model, the data is divided into training and test sets. The accuracy of the model is examined, and hypothesis testing shows that the forecast values are within the 95% confidence interval of the test data. The paper concludes that crime can be forecast by building a time series model.

2.2 Crime analysis using data mining: Paper [2] presents perspectives on using data mining methods in crime analysis. The data mining methods it discusses support the development and use of proactive policing strategies for the prevention and investigation of crimes. The paper lists data mining tools, designed as intelligent tools, that make the data analysis required by law-enforcement agencies more effective. It also outlines the models and technologies that automate the analytical work of crime analysts. Based on the behavioural profiles of participants in criminal activity, the paper seeks to find the relationships between the actors.
The paper also lists the basic principles required to build a real-time intelligent system for crime analysis.

2.3 Spatio-temporal crime prediction in rural areas: Paper [3] proposes a spatio-temporal predictive model that finds crime-dense regions using the k-means and DBSCAN clustering algorithms and extracts crime predictors using a Seasonal Auto-Regressive Integrated Moving Average (SARIMA) model. The results and their implications are presented using appropriate graphs.

2.4 An overview on crime prediction methods: Paper [4] presents a detailed analysis of the pros and cons of crime prediction methods. The methods discussed are Support Vector Machines (SVM), fuzzy methods, artificial neural networks and multivariate time series algorithms for time-dependent data. The author also suggests that hybrid methods may work better for crime prediction than filter methods.

Descriptive Analytics on Crime in India using Clustering Techniques
J. Vimala Devi, Kavitha K S

III. DATA UNDERSTANDING AND PREPROCESSING

A. Data set: The data set records the count of 30 different types of crime, district-wise, for every state in India. It spans the years 2001 to 2013. The data set was downloaded from Kaggle and is already cleaned. Further aggregations are done as required. The dataset contains 9017 rows and 33 columns. The data set contains numerical value that represents