Comparative Study of Machine Learning Models for Crime Analysis

Yashvi Mehta¹, Sakshi Mahadik², Manan Shah³, Jaiwin Shah⁴ and Dr. Sudhir Dhage⁵
¹⁻⁵ Computer Engineering, Sardar Patel Institute of Technology, Mumbai, India
Email: yashvi.mehta@spit.ac.in, sakshi.mahadik@spit.ac.in, manan.shah@spit.ac.in, jaiwin.shah@spit.ac.in, sudhirdhage@spit.ac.in

Abstract—In today’s world, where criminal activity is increasing daily, it is necessary to curb these crimes. Crime analysis is a procedural approach to identifying crimes. Crime can be reduced if criminal hotspots are identified, and pinning these hotspots down by crime type can aid crime detection and analysis. With machine learning techniques applicable to big datasets, crime investigators could use these approaches to narrow down their search and solve cases. We intend to preprocess the data and find these criminal hotspots to improve the search. In this paper, we implement several machine learning algorithms, such as decision tree regression, linear regression, random forest, and others, to analyze crime patterns in the United States. A comparative study is done to see which algorithm yields better results in terms of accuracy.

Index Terms— Crime Analysis, Decision Tree Regression, K-Nearest Neighbors, Linear Regression, Support Vector Machine, Random Forest.

I. INTRODUCTION

Criminal activities are gradually rising around us. To reduce them, high-crime areas need to be determined. The first step is to segregate the different categories of crimes committed in a neighborhood. The main purpose of crime analysis is to reduce these organized criminal activities. Crime prediction can be employed by identifying these hotspots and reporting them to the concerned authority, thereby helping to decrease the crime rate. This can be done with the help of various machine learning models, which use existing crime data to predict the crime type and its occurrence.
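The comparison outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data is synthetic (a single made-up feature with a noisy linear target) rather than the crime dataset, only two of the named model families are shown (linear regression and K-nearest-neighbour regression), and the R² score is assumed as the comparison metric. All function names (`split_rows`, `fit_linear`, `fit_knn`, `r2_score`) are hypothetical helpers written for this example.

```python
import random

def split_rows(rows, test_ratio=0.2, seed=0):
    """Shuffle a copy of the rows and split it into train and test lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]

def fit_linear(train):
    """Ordinary least squares for one feature: y = a * x + b."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return lambda x: a * x + b

def fit_knn(train, k=5):
    """K-nearest-neighbour regression: average the targets of the k closest points."""
    def predict(x):
        nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def r2_score(model, test):
    """Coefficient of determination of the model on held-out rows."""
    mean_y = sum(y for _, y in test) / len(test)
    ss_res = sum((y - model(x)) ** 2 for x, y in test)
    ss_tot = sum((y - mean_y) ** 2 for _, y in test)
    return 1 - ss_res / ss_tot

# Synthetic stand-in data (assumption): one feature, noisy linear target.
data_rng = random.Random(42)
rows = []
for _ in range(200):
    x = data_rng.uniform(0, 10)
    y = 3.0 * x + 5.0 + data_rng.gauss(0, 1.0)
    rows.append((x, y))

# Split once, fit each model on the same train set, score on the same test set.
train, test = split_rows(rows, test_ratio=0.2)
scores = {
    "linear_regression": r2_score(fit_linear(train), test),
    "knn_regression": r2_score(fit_knn(train, k=5), test),
}
best = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
print("best model on this split:", best)
```

Scoring every model on the same held-out split keeps the comparison fair; in practice a library such as scikit-learn would replace the hand-written helpers, and the ranking would depend on the real dataset.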
The major goal of this research is to identify crimes that can be predicted once the required data is filtered out. This filtered data reveals patterns that help identify criminal activities in a state. The dataset consists of a variety of attributes. After the data is gathered, it is cleaned and divided into train and test sets. Supervised techniques such as K-Nearest Neighbors, Linear Regression, Decision Tree, Random Forest, and Support Vector Machine are used for crime prediction. The results of this prediction can be very useful to the police department in investigating cases and reducing the crime rate by taking appropriate measures to secure the people and the region.

Grenze ID: 01.GIJET.9.1.31 © Grenze Scientific Society, 2023 Grenze International Journal of Engineering and Technology, Jan Issue

II. LITERATURE SURVEY

Akash Kumar, Aniket Verma, Gandhali Shinde, Yash Sukhdeve, and Nidhi Lal [1] proposed a crime prediction