Intrusion Detection System using SMIFS and Multi-class Multilayer Perceptron
V Maheshwar Reddy, I Ravi Prakash Reddy, K Adi Narayana Reddy

International Journal of Innovative Technology and Exploring Engineering (IJITEE)
ISSN: 2278-3075, Volume-8, Issue-9, July 2019
Published by Blue Eyes Intelligence Engineering & Sciences Publication
Retrieval Number: I8982078919/19©BEIESP
DOI: 10.35940/ijitee.I8982.078919
Abstract— As new technologies emerge, data is generated in larger volumes and higher dimensions. The high dimensionality of data poses a great challenge for classification. The presence of redundant features and noisy data degrades the performance of a model, so it is necessary to extract the relevant features from a given data set. Feature extraction is an important step in many machine learning algorithms, and many researchers have attempted to extract relevant features. Among the different feature extraction methods, mutual information is a widely used feature selection method because of its ability to quantify dependency among features in classification problems. To address this issue, in this paper we propose a simplified mutual-information-based feature selection with less computational overhead. The selected feature subset is evaluated with a multilayer perceptron on the KDD CUP 99 data set with 2-class, 5-class and 4-class classification. The accuracy of these models is almost the same with a smaller number of features.
Keywords — IDS, Perceptron, Mutual Information, Entropy,
Conditional Entropy, Feature Selection.
1. INTRODUCTION
An intrusion detection system (IDS) [3, 19, 20] dynamically monitors the activities that occur in the network and detects malicious activity that violates the security policy and user security. Intrusion detection is categorized into misuse detection and anomaly detection. In misuse detection, the signatures of incoming and outgoing packets are compared against a database of known signatures. Anomaly detection creates a profile of normal behavior, and any activity that deviates from the profile is considered an attack. Attacks on networks have grown continuously over the last three decades and have a significant impact on user security. Such attacks are difficult to handle with traditional techniques, so to handle them automatically a lot of research [1-10] has been carried out on intrusion detection systems using machine learning. Machine learning algorithms require past data to train a model. IDSs using machine learning are built on standard data sets such as KDD CUP 99, NSL-KDD, Kyoto-2006+, ISCX, etc. The KDD CUP 99 data set is the most popular and standard data set used in the literature. The data was collected and distributed by MIT Lincoln Laboratory, sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL). The KDD CUP 98 and KDD CUP 99 data sets are subsets of the DARPA-sponsored project.
project. The [23] KDD CUP 99 data set contains 41 features
Revised Manuscript Received on July 10, 2019.
V Maheshwar Reddy, Assistant Professor ACE Engineering College,
Telangana, India.
I Ravi Prakash Reddy, Professor, G. Narayanamma Institute of
Technology and Science, Telangana, India.
K Adi Narayana Reddy, Professor, ACE Engineering College,
Telangana, India.
and a class label. The class label is multi-class and it has
five classes namely Normal, DOS, Probe, R2L and U2R.
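For illustration, the sketch below (plain Python; the attack names shown are a commonly cited subset of the data set's label values, and the function names are ours, not the paper's) maps raw record labels into the five classes and collapses them for the 2-class task:

```python
# Illustrative grouping of some well-known KDD CUP 99 attack labels into
# the four attack classes; the full data set contains more attack types.
ATTACK_CLASS = {
    "smurf": "DOS", "neptune": "DOS", "back": "DOS", "teardrop": "DOS",
    "satan": "Probe", "ipsweep": "Probe", "portsweep": "Probe", "nmap": "Probe",
    "guess_passwd": "R2L", "warezclient": "R2L", "ftp_write": "R2L",
    "buffer_overflow": "U2R", "rootkit": "U2R", "loadmodule": "U2R",
}

def to_five_class(label):
    """Map a raw record label to Normal/DOS/Probe/R2L/U2R."""
    return "Normal" if label == "normal" else ATTACK_CLASS.get(label, "Unknown")

def to_two_class(label):
    """Collapse to the 2-class task: normal vs. attack."""
    return "Normal" if label == "normal" else "Attack"
```

Note that in the raw data files the label field typically ends with a period (e.g. "normal."), which should be stripped before mapping.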
Feature selection [1, 2, 6, 7, 12, 13] is an important technique for selecting a subset of important features from high-dimensional data. It extracts relevant features and removes redundant ones. Feature selection approaches are categorized into filter-based and wrapper-based techniques. A wrapper-based technique depends on the classification algorithm, whereas a filter-based technique extracts the subset of features independently of the classification algorithm. Most researchers have developed IDS models using machine learning algorithms combined with different feature selection techniques. A ranking methodology and SVM are used in [21] as the feature selection and classification algorithms. Similarly, GA and a decision tree in [22], PCA and SVM in [4], GA and SVM in [2], and rough set theory and SVM with different kernel functions in [14] are used as feature selection and classification algorithms. Feature selection techniques such as correlation-based feature selection, the consistency-based filter and INTERACT are introduced in [3]; naïve Bayes, tree-augmented naïve Bayes and NBTree are trained on the selected subsets of features. The relevant features are selected using the BIRCH hierarchical clustering algorithm in [6], and in [5] bagging with REPTree is trained on these selected features. These feature selection techniques are wrapper based and work only with a specific classification algorithm. We propose a feature selection technique based on mutual information; it is a filter-based technique. In the next section we cover the literature on mutual-information-based filtering techniques.
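Since a filter approach scores each feature independently of any classifier, it can be sketched compactly. The following minimal Python sketch (the function names are illustrative, not taken from the paper) ranks discrete features by their mutual information with the class label, using the identity I(X;Y) = H(X) + H(Y) - H(X,Y):

```python
import numpy as np
from collections import Counter

def entropy(values):
    """Shannon entropy H(X) of a discrete sequence, in bits."""
    counts = np.array(list(Counter(values).values()), dtype=float)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log2(p)))

def mutual_information(x, y):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for discrete sequences."""
    joint = list(zip(x, y))
    return entropy(x) + entropy(y) - entropy(joint)

def filter_select(features, labels, k):
    """Rank columns of `features` by I(feature; label), keep top-k indices."""
    scores = [mutual_information(col, labels) for col in features.T]
    return sorted(np.argsort(scores)[::-1][:k].tolist())
```

For a toy data set in which one feature duplicates the label and another is random noise, the duplicated feature receives the maximal score H(Y) and is selected first; this is the dependency-quantifying behavior that makes mutual information attractive for classification problems.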
The paper is organized as follows: Section 2 deals with the concepts of entropy, joint entropy, conditional entropy and mutual information, along with a literature survey on mutual information. Section 3 covers the proposed simplified mutual-information-based feature selection. Section 4 deals with the experimental setup and results, and the final section concludes the paper.
2. MUTUAL INFORMATION
Mutual information was originally proposed by Claude E. Shannon [15, 16] in 1948 in his paper "A Mathematical Theory of Communication." Entropy and conditional entropy are the building blocks of mutual information. Mutual information measures the dependency between two variables, while entropy measures the uncertainty of a random variable. The entropy of a random