International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 09 | Sep 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2802

Performance Analysis of Different Machine Learning Techniques for Anomaly-based Intrusion Detection

Ambreen Sabha 1, Lalit Sen Sharma 2

1 M.Tech. Student, Department of Computer Science and IT, University of Jammu, J&K, India
2 Professor, Department of Computer Science and IT, University of Jammu, J&K, India

---------------------------------------------------------------------***----------------------------------------------------------------------

Abstract - An intrusion is an activity that compromises the confidentiality or availability of a resource. An Intrusion Detection System is a device or software application that monitors a network for unauthorized access or policy violations. The objective of this work is to compare the performance of different machine learning techniques on an anomaly-based intrusion dataset. Three supervised machine learning techniques, namely Naïve Bayes, Decision Tree, and Random Forest, are applied to the dataset. The performance of each technique is assessed using four measures: accuracy, recall, precision, and F-score. Experiments are performed on the NSL-KDD dataset using different sets of features, and the detection accuracy and execution time of each algorithm are analyzed. Random Forest obtained the highest accuracy, 97.8%, with an execution time of 0.998 milliseconds, compared with Decision Tree and Naïve Bayes. With Random Forest, the detection accuracy for the four attack classes present in the dataset is DoS 99%, Probe 99%, R2L 98%, and U2R 99%.

Key Words: Decision Tree, IDS, Machine Learning, NSL-KDD, Naïve Bayes, Network Security, Random Forest.

1. INTRODUCTION

An intrusion detection system (IDS) is a device or software application that monitors a network for malicious activity. It works alongside the network to keep it secure and raises an alert when somebody tries to break into the system [6]. Intrusion detection is the problem of identifying unauthorized use and abuse of computer systems by both insiders and external intruders; in practice, it is the process of detecting malicious patterns in large data sets.

Intrusion detection systems are classified into two categories: host-based intrusion detection systems (HIDS) and network-based intrusion detection systems (NIDS). A HIDS runs on an individual host or device. It monitors only the inbound and outbound packets of that host and sends an alert to the administrator when suspicious or harmful activity is identified [6]. A NIDS, in contrast, monitors, captures, and analyzes the data packets in the network traffic.

A typical network-based IDS makes use of signature detection and anomaly detection. A signature-based IDS is designed to detect only known attacks; it uses a database of known attack signatures developed by experts or intrusion analysts. It monitors the packets in the network and compares them with the entries in this database. If there is a match, the IDS generates an alert. An anomaly-based IDS looks for the unknown attacks that a signature-based IDS finds hard to detect. It works on the assumption that attacks differ from “normal” activity and can therefore be detected.

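The contrast between the two detection styles described above can be illustrated with a minimal Python sketch. It is not the method evaluated in this paper: the signature set, the failed-login feature, the sample values, and the three-standard-deviation threshold are all hypothetical choices made purely for illustration.

```python
from statistics import mean, stdev

# Hypothetical signature database: payload fragments of known attacks.
KNOWN_SIGNATURES = {"/etc/passwd", "' OR '1'='1"}


def signature_alert(payload):
    """Signature detection: alert only if a known pattern appears."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


def fit_baseline(normal_values):
    """Anomaly detection, step 1: summarise "normal" activity."""
    return mean(normal_values), stdev(normal_values)


def anomaly_alert(value, baseline, threshold=3.0):
    """Anomaly detection, step 2: alert when a value deviates from the
    baseline mean by more than `threshold` standard deviations."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Failed-login counts per connection seen during normal operation
# (made-up numbers, used only to fit the baseline).
baseline = fit_baseline([0, 1, 0, 2, 1, 0, 1, 1, 0, 2])

print(signature_alert("GET /index.html"))        # no known signature
print(signature_alert("GET /../../etc/passwd"))  # matches a signature
print(anomaly_alert(1, baseline))                # close to the baseline
print(anomaly_alert(25, baseline))               # far from the baseline
```

The signature check can only flag payloads that contain an entry already in the database, whereas the anomaly check flags any sufficiently unusual value, including values produced by attacks that have never been seen before. The same property means an anomaly detector may also raise alerts on rare but legitimate behaviour.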
This paper is organized as follows: Section 2 gives a brief literature review, Section 3 explains the research methodology, Section 4 presents the experimental results on the NSL-KDD dataset, and Section 5 contains the conclusion and scope for future work.

2. LITERATURE REVIEW

A review of machine learning techniques applied to intrusion detection systems over the past few years is presented below.

S. Revathi et al. [10] published a detailed analysis of several intrusion datasets, namely DARPA98, KDD-cup99, and NSL-KDD. They focused on the NSL-KDD dataset, which contains only selected records; these records provide a good basis for analyzing various machine learning techniques for intrusion detection. According to their analysis, NSL-KDD improves the accuracy of the system and reduces the false positive rate compared with DARPA98 and KDD99.

S. Taruna R et al. [8] proposed a new variant of the Naïve Bayes algorithm, Enhanced Naïve Bayes. The results showed that the proposed algorithm detects intrusions more efficiently than a neural network, improving the detection rate and reducing the false positive rate. The experiments were performed on the KDD-cup99 dataset.
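The detection rate and false positive rate mentioned in the studies above are conventionally derived from confusion-matrix counts. The sketch below assumes the standard definitions, detection rate = TP / (TP + FN) and false positive rate = FP / (FP + TN), with binary labels in which 1 marks an attack. The function name and the sample labels are hypothetical and are not taken from the cited papers.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection_rate, false_positive_rate) for binary labels,
    where 1 = attack and 0 = normal traffic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # normal, passed

    # Guard against empty classes to avoid division by zero.
    detection_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return detection_rate, false_positive_rate


# Made-up example: 4 attack records and 6 normal records.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

dr, fpr = detection_metrics(y_true, y_pred)
print(f"detection rate = {dr:.2f}, false positive rate = {fpr:.2f}")
```

In this example three of the four attacks are detected and one of the six normal records is flagged, so a method that "improves the detection rate and reduces the false positive rate" raises the first value while lowering the second.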