2022 IEEE NIGERCON
978-1-6654-7978-3/22/$31.00 ©2022 IEEE
Combating Network Intrusions using Machine
Learning Techniques with Multilevel Feature
Selection Method
Tosin Comfort OLAYINKA
Department of Computer Science
Wellspring University,
Benin City, Edo State, Nigeria
tcolayinka@gmail.com
Adebayọ Olusọla ADETUNMBI
Department of Computer Science
Federal University of Technology
Akure, Nigeria
aoadetunmbi@futa.edu.ng
Chukwuemeka Christian UGWU
Department of Computer Science
Federal University of Technology
Akure, Nigeria
ugwucc@futa.edu.ng
Olugbemiga Solomon POPOỌLA
Department of Computer Science
Ọsun State College of Education
Ila-Ọrangun, Nigeria
popsol7@yahoo.com
Omoibu Joseph OKHUOYA
ICT/CRPU International Training
Center
University of Benin
Benin City, Nigeria
joseph.okhuoya@uniben.edu.ng
Abstract— The heavy dependency on the internet, as well as
other emerging technologies for access, storage, and sharing of
information, has triggered a proportional increase in
cyberattacks, making the network intrusion detection
system (NIDS) a crucial component of security systems. A NIDS
monitors a network for abnormal activity.
However, low accuracy and high false-positive rates remain
prevalent among NIDSs. To improve the
prediction of network intrusions, this paper
applies four machine learning models in parallel: k-Nearest
Neighbor (k-NN), Naïve Bayes (NB), Logistic Regression (LR),
and Artificial Neural Network (ANN), each with a multilevel feature
selection method, to determine which model has the best
detection capability in terms of accuracy, positive predictive
value (PPV), recall, F1-score, and the Receiver Operating
Characteristic (ROC) curve. The models were validated on the
NSL-KDD intrusion dataset, and the results show that k-NN had the
best performance, with an accuracy of 79.1%, recall of 66.5%,
PPV of 96.7%, and F1-score of 78.1%.
Keywords— Intrusion Detection, Classification, Machine
Learning, Network Traffic, Feature Selection, Anomaly.
I. INTRODUCTION
The prevalent use of computers and networks in recent
years and the development of new technologies such as the
Internet of Things (IoT), cloud computing, and big data, among
others, have presented serious security concerns for both
corporate and social networks [1]. New attack types with
complex attack strategies are emerging on a daily basis, and
their effect on networks poses a challenge to information
technology (IT) security experts equipped with traditional
defense mechanisms.
Nowadays, network security is strengthened using
multiple defensive tools, and in most cases intrusion
detection systems (IDSs) serve as complementary defense
tools. IDSs are either software tools or devices that monitor
activities in a network or on a device for unusual or
malicious occurrences. A network intrusion detection system
(NIDS) collects information from several key nodes in the
computer network, checks for
violations of security policies and signs of an attack in the
network, identifies threats in the network, and generates
alarms, providing real-time protection against internal attacks,
external attacks, and mis-operations [2]. The two main
detection methods for NIDS are the misuse and anomaly methods.
The misuse method is best suited to detecting known attacks
but suffers performance degradation when attacks are
unknown. Anomaly-based methods, on the other hand, are
suitable for detecting unknown attacks but are highly susceptible
to false positives [3], [4]. Several efforts have been made to
improve NIDS performance. These include
hybridization of the signature-based and anomaly-based approaches [5],
the addition of a false-alarm filter to both signature-based and
anomaly-based NIDSs, and, most recently, the application of machine
learning (ML) algorithms.
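The trade-off between the two detection methods can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the detection logic of any particular NIDS: the signature strings, the single "connections per second" feature, the sample values, and the three-standard-deviation threshold are all hypothetical.

```python
from statistics import mean, stdev

# Hypothetical signature database: payload substrings of already-known attacks.
KNOWN_SIGNATURES = ("/etc/passwd", "' OR '1'='1")


def misuse_detect(payload: str) -> bool:
    """Misuse method: flag traffic only if it matches a known signature."""
    return any(sig in payload for sig in KNOWN_SIGNATURES)


def fit_normal_profile(values):
    """Summarise normal behaviour as the mean and standard deviation of one feature."""
    return mean(values), stdev(values)


def anomaly_detect(value, profile, k=3.0):
    """Anomaly method: flag values more than k standard deviations from normal."""
    mu, sigma = profile
    return abs(value - mu) > k * sigma


# Hypothetical connections-per-second readings taken during normal operation.
normal_rates = [10, 12, 11, 9, 13, 10, 11, 12]
profile = fit_normal_profile(normal_rates)

known_attack = "GET /../../etc/passwd"
novel_attack = "GET /new-exploit?x=1"  # no signature exists for this payload

print(misuse_detect(known_attack))   # known attack is caught by its signature
print(misuse_detect(novel_attack))   # unknown attack slips past the signatures
print(anomaly_detect(250, profile))  # a flood deviates strongly from the profile
print(anomaly_detect(18, profile))   # a harmless busy period is flagged as well
```

In this toy setting the signature check misses the payload it has never seen, while the statistical profile catches the flood but also raises an alarm for a moderately busy, legitimate period, which is the false-positive weakness noted above.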
The need for ML arose from several challenges faced by
existing NIDSs, such as processing large volumes of data, low
detection accuracy, and high false-positive rates. ML learns
useful patterns from existing data as a reference for
normal/attack traffic behaviour profiles for subsequent
classification of network traffic [4]. ML methods proposed
for different intrusion detection problems are broadly
classified into supervised and unsupervised detection.
Unsupervised IDSs learn patterns of possible network
intrusions from unlabeled training data [6], while supervised
models detect possible intrusions by training on already
labelled intrusion datasets. Creating supervised training
samples for ML-IDS can be challenging, but the
resulting models are generally accurate and reliable, which makes
this approach popular among intrusion detection experts.
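As a rough illustration of supervised detection, the sketch below trains nothing more than a plain k-NN majority vote on a handful of labelled records and then scores it with accuracy, PPV (precision), recall, and F1. It is not the pipeline evaluated in this paper: the two features, all records, and the choice of k = 3 are hypothetical, and no feature scaling or feature selection is applied.

```python
from collections import Counter
from math import dist

# Hypothetical labelled records: (duration_s, failed_logins) -> class label.
TRAIN = [
    ((0.2, 0), "normal"), ((0.3, 0), "normal"),
    ((0.5, 1), "normal"), ((0.4, 0), "normal"),
    ((5.0, 6), "attack"), ((6.2, 8), "attack"),
    ((4.8, 7), "attack"), ((5.5, 5), "attack"),
]
# Hypothetical held-out records; the last one is a low-profile attack.
TEST = [
    ((0.3, 1), "normal"), ((5.1, 7), "attack"),
    ((0.6, 0), "normal"), ((2.0, 2), "attack"),
]


def knn_predict(x, train, k=3):
    """Label x by majority vote among its k nearest labelled records."""
    nearest = sorted(train, key=lambda rec: dist(x, rec[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def evaluate(test, train, positive="attack"):
    """Return accuracy, PPV, recall and F1 with 'attack' as the positive class."""
    tp = fp = tn = fn = 0
    for x, truth in test:
        pred = knn_predict(x, train)
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * ppv * recall / (ppv + recall) if ppv + recall else 0.0
    return {"accuracy": (tp + tn) / len(test), "ppv": ppv,
            "recall": recall, "f1": f1}


result = evaluate(TEST, TRAIN)
print(result)
```

On this toy data every alarm is a real attack, but the low-profile attack is missed because its nearest labelled neighbours are all normal records, so PPV is high while recall is lower.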
This paper focuses on supervised ML-NIDS: we
analyze k-NN, Naïve Bayes, ANN, and Logistic Regression
on one of the commonly used network intrusion datasets
(NSL-KDD) using standard performance metrics such as
accuracy, recall, and precision, among others. The
remaining sections of this paper are organized as follows. Section
II reviews related work. The system
architecture, methods, and materials are discussed in Section
2022 IEEE Nigeria 4th International Conference on Disruptive Technologies for Sustainable Development (NIGERCON) | 978-1-6654-7978-3/22/$31.00 ©2022 IEEE | DOI: 10.1109/NIGERCON54645.2022.9803098