International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 07 Issue: 09 | Sep 2020 www.irjet.net p-ISSN: 2395-0072
© 2020, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 2802
Performance Analysis of Different Machine Learning Techniques
for Anomaly-based Intrusion Detection
Ambreen Sabha¹, Lalit Sen Sharma²
¹M.Tech. Student, Department of Computer Science and IT, University of Jammu, J&K, India
²Professor, Department of Computer Science and IT, University of Jammu, J&K, India
---------------------------------------------------------------------***----------------------------------------------------------------------
Abstract - An intrusion is an activity that compromises the
confidentiality or availability of a resource. An Intrusion
Detection System (IDS) is a device or software that monitors
the state of a network for unauthorized access or policy
violations. The objective of this research is to compare the
performance of different machine learning techniques on an
anomaly-based intrusion dataset. Three supervised machine
learning techniques, namely Naïve Bayes, Decision Tree, and
Random Forest, are applied to the dataset. To assess the
performance of each technique, four metrics are evaluated:
accuracy, recall, precision, and F-score. Experimentation is
performed on the NSL-KDD dataset with different sets of
features. The detection accuracy and the execution time of
each machine learning algorithm are analyzed. Random Forest
obtained the highest accuracy, 97.8%, with an execution time
of 0.998 milliseconds, compared with Decision Tree and Naïve
Bayes. With Random Forest, the detection accuracy for the
four attack classes present in the dataset is DoS 99%,
Probe 99%, R2L 98%, and U2R 99%.
Key Words: Decision Tree, IDS, Machine Learning, NSL-KDD,
Naïve Bayes, Network Security, Random Forest.
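
The four evaluation measures named in the abstract (accuracy, recall, precision, and F-score) can be computed from the counts of a binary confusion matrix. The following is a minimal sketch under stated assumptions: the function name `classification_metrics` and the example counts are illustrative only and are not taken from the paper's experiments, and F-score is taken here to mean the balanced F1 score.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp: attacks correctly flagged, fp: normal traffic flagged as attack,
    tn: normal traffic correctly passed, fn: attacks missed.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Made-up counts for illustration; these are NOT results from the paper.
metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For a multi-class problem such as the four attack classes in NSL-KDD, the same counts would be taken per class (one class against the rest) and then averaged; which averaging scheme the authors used is not stated in this section.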
1. INTRODUCTION
An intrusion detection system (IDS) is a device or software
that monitors a network for malicious activity. It works
alongside the network to keep it secure and raises an alert
when somebody tries to break into the system [6]. Intrusion
detection is the problem of identifying unauthorized use and
abuse of computer systems by insiders and external intruders;
it is the process of detecting malicious patterns in large
data sets. Intrusion detection systems are classified into
two categories: host-based intrusion detection systems (HIDS)
and network-based intrusion detection systems (NIDS). A HIDS
runs on an individual host or device. It monitors only the
inbound and outbound packets of that device, and when
suspicious or harmful activity is identified it sends an
alert to the administrator [6]. A NIDS, by contrast,
monitors, captures, and analyzes the data packets in the
network traffic. A typical network-based IDS makes use of
signature detection and anomaly detection.
A signature-based IDS is designed to detect only known
attacks. It uses a database of known attack signatures
developed by experts or intrusion analysts. It monitors the
packets in the network and compares them with the entries in
this database. If there is a match, the IDS generates an
alert message. An anomaly-based IDS looks for the kinds of
unknown attacks that a signature-based IDS finds hard to
detect. It functions on the assumption that attacks are
different from “normal” activity and can therefore be
detected by the system.
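
The contrast between the two detection approaches can be sketched in a few lines. This is an illustrative toy, not the method evaluated in this paper: the signature patterns, the connections-per-second feature, and the 3-standard-deviation threshold are all assumptions chosen for the example.

```python
from statistics import mean, stdev

# Hypothetical signature database: byte patterns of known attacks.
KNOWN_SIGNATURES = {
    "sql_injection": b"' OR 1=1",
    "path_traversal": b"../../",
}


def signature_match(payload: bytes):
    """Return the name of the first known signature found in the payload, else None."""
    for name, pattern in KNOWN_SIGNATURES.items():
        if pattern in payload:
            return name
    return None


def fit_baseline(normal_values):
    """Learn a simple profile of normal activity: mean and sample standard deviation."""
    return mean(normal_values), stdev(normal_values)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value lying more than `threshold` standard deviations from the normal mean."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Assumed "normal" traffic: connections per second observed on a quiet network.
normal_conn_per_sec = [10, 12, 11, 9, 13, 10, 11, 12]
baseline = fit_baseline(normal_conn_per_sec)

print(signature_match(b"GET /index.php?id=' OR 1=1"))  # matches a known pattern
print(signature_match(b"GET /index.html"))             # no known pattern
print(is_anomalous(11, baseline))                      # close to the normal profile
print(is_anomalous(500, baseline))                     # far outside the normal profile
```

The signature check can only recognize patterns already in its database, whereas the anomaly check can flag previously unseen behavior at the cost of depending on how well the "normal" profile was learned.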
The rest of this paper is organized as follows: Section 2
gives a brief Literature Review, Section 3 explains the
Research Methodology, Section 4 presents the Experimental
Results on the NSL-KDD dataset, and Section 5 contains the
Conclusion and Scope for future work.
2. LITERATURE REVIEW
A review of different machine learning techniques applied to
intrusion detection systems over the past few years is
presented below.
S. Revathi et al. [10] published a detailed analysis of
various intrusion datasets, i.e. DARPA98, KDD-cup99, and
NSL-KDD. They focused on the NSL-KDD dataset, which contains
only selected records; those records provide a good basis
for analyzing various machine learning techniques for
intrusion detection. NSL-KDD improves the accuracy of the
system and reduces the false positive rate compared with
DARPA98 and KDD-cup99.
S. Taruna R et al. [8] proposed a new variant of the Naïve
Bayes algorithm, Enhanced Naïve Bayes. The results showed
that the proposed algorithm detects intrusions more
efficiently than the neural network, improves the detection
rate, and reduces the false positive rate. The
experimentation was performed on the KDD-cup99 dataset.