International Journal of Engineering Technology and Management Sciences
Website: ijetms.in Issue: 2 Volume No.7 March - April 2023
DOI:10.46647/ijetms.2023.v07i02.010 ISSN: 2581-4621
@2023, IJETMS | Impact Factor Value: 5.672 | Page 75

A Novel Intrusion Detection System Using Multiple Linear Regression

Koushik Paul¹, Sayandeep Paik², Siddhartha Kuri³, Soumyadip Majumder⁴, Dr. Avijit Kumar Chaudhuri⁵
¹ ² ³ ⁴ B. Tech, Department of CSE, TEC Banipur, West Bengal, India
⁵ Assistant Professor, Department of CSE, TEC Banipur, West Bengal, India
Corresponding Author ORCID ID: (0000-0003-4117-7067)¹ (0000-0002-7114-2378)² (0000-0003-3443-9639)³ (0000-0002-7596-6833)⁴ (0000-0002-5310-3180)⁵

Abstract
The internet is undoubtedly among the biggest and most important tools of modern civilisation. Along with its numerous benefits, however, it carries its own set of risks, the most important being breaches of security and privacy. An anomaly-based Intrusion Detection System (IDS) is a type of security system that detects and raises alerts on unusual or abnormal behaviour that may indicate an attack or intrusion. Unlike a signature-based IDS, which relies on known patterns of attack, an anomaly-based IDS is designed to detect previously unseen or unknown attacks by identifying deviations from normal patterns of behaviour. Multiple linear regression is a statistical technique used to analyse the relationship between a dependent variable and multiple independent variables. In this technique, a linear equation is established between the dependent variable and the independent variables, with the aim of predicting the value of the dependent variable for a given set of values of the independent variables.
In this paper, we collected a data set of 125974 entries and 42 attributes from Kaggle, pre-processed the data, and used logistic regression to predict the dependent variable (called xAttack) using 25 independent variables, as we found a high correlation between the aforementioned variables. The results are simulated using 10-fold cross-validation and several train-test splits of the data set: 80-20, 50-50, and 66-34. After testing the data set under these train-test splits, an accuracy of 92.73% was achieved.

Keywords: Intrusion Detection System (IDS), Machine Learning, Multiple Linear Regression, security breach.

1. Introduction
The internet has become an essential tool in modern society. A huge amount of essential and confidential data is present on the internet, and this data may be extremely important for the security of the host. Data on the internet, however, is always at risk of infringement. As a result of the recent COVID-19 pandemic, many employees were encouraged to work from home. This has led to a massive surge in the transmission of sensitive data online, requiring employers to provide a safe working environment. We therefore need a means of security that protects us against possible cyber-attacks. An Intrusion Detection System (IDS) is a security technology designed to detect and prevent unauthorised access or malicious activity on a computer network or system. Its primary purpose is to identify and respond to potential security breaches and attacks, alerting security personnel or automated response systems to take action. In this work, detection is performed using Multiple Linear Regression (MLR). Multiple linear regression is a statistical method used to analyse the relationship between multiple independent variables and a dependent variable. It is a form of linear regression in which the dependent variable is modelled as a linear combination of multiple independent variables.

2. Relevant Literature
Today, security has become a critical concern for individuals, businesses, and governments alike. With the increasing reliance on technology and the internet, the risk of cyber-attacks, data breaches, and other digital threats has also risen.