A novel feature selection approach for intrusion detection data classification

Mohammed A. Ambusaidi, Xiangjian He*, Zhiyuan Tan, Priyadarsi Nanda, Liang Fu Lu and Upasana T. Nagar
Center for Innovation in IT Services and Applications (iNEXT),
School of Computing and Communications, Faculty of Engineering and IT,
University of Technology, Sydney, Australia
{Mohammed.A.AmbuSaidi, Upasana.T.Nagar}@student.uts.edu.au, {Xiangjian.He, Zhiyuan.Tan, Priyadarsi.Nanda}@uts.edu.au

Abstract—Intrusion Detection Systems (IDSs) play a significant role in monitoring and analyzing daily activities occurring in computer systems to detect occurrences of security threats. However, the analytical data routinely produced by computer networks are usually very large. This creates a major challenge for IDSs, which need to examine all features in the data to identify intrusive patterns. The objective of this study is to analyze and select the most discriminative input features for building computationally efficient and effective schemes for an IDS. To this end, a hybrid feature selection algorithm that combines filter and wrapper selection processes is designed in this paper. The algorithm involves two main phases. The upper phase conducts a preliminary search for an optimal subset of features, in which the mutual information between the input features and the output class serves as the determinant criterion. The set of features selected in the upper phase is further refined in the lower phase in a wrapper manner, in which the Least Square Support Vector Machine (LS-SVM) is used to guide the selection process and retain an optimized set of features. The efficiency and effectiveness of our approach are demonstrated by building an IDS and conducting a fair comparison with other state-of-the-art detection approaches. The experimental results show that our hybrid model is promising in detection compared to previously reported results.
Keywords—Intrusion detection, Feature selection, Mutual information, Least square support vector machines, Floating search.

I. INTRODUCTION

Intrusion detection is the art of discovering and detecting network traffic patterns that are anomalous with respect to normal network traffic. Today, intrusion detection is considered one of the highest-priority and most challenging tasks for network security administrators. Attackers have developed more sophisticated infiltration techniques to challenge and defeat security tools [1]. Thus, there is a need for an efficient and reliable IDS to safeguard computer networks from known as well as unknown vulnerabilities. The primary purpose of these systems is to detect attacks accurately with minimum false alarms. To fulfill this purpose, however, an IDS should be able to handle huge amounts of network data and be fast enough to make real-time decisions.

In general, an IDS deals with a large volume of data consisting of a variety of traffic patterns. Each pattern in a dataset is characterized by a set of features (or attributes) and represents a point in a multi-dimensional feature space. A pattern might contain irrelevant and redundant features that slow down the training and testing processes, or even degrade classification performance while adding mathematical complexity. In practice, it is therefore worthwhile to keep the number of features as small as possible in order to reduce the computational cost and the complexity of building a classifier. In addition, eliminating unimportant features facilitates data visualization, improves modelling and prediction performance, and speeds up the classification process. Thus, dimensionality reduction techniques, such as feature extraction and feature selection, have been successfully applied in machine learning and data mining to solve this problem.
Feature extraction techniques attempt to transform the input features into a new feature set, while Feature Selection (FS) algorithms search for the most informative features in the original input data [2]. In this paper, we focus on feature selection and propose a scheme that ranks features based on the principle of Mutual Information (MI). The final feature subset is then chosen, in a wrapper manner, from the top of the ranking list by searching for the subset that yields the highest classification accuracy.

The proposed approach combines two main stages: (1) filter feature ranking; and (2) wrapper-based Improved Forward Floating Selection (IFFS) using LS-SVM and classification accuracy. The filter method aims to reduce the computational cost of the wrapper search by eliminating irrelevant and redundant features from the initial feature set. The wrapper-based IFFS method is used to search for a proper subset that improves the classification accuracy. The aim is to achieve both the high accuracy of wrapper approaches and the efficiency of filter approaches. Finally, in order to examine the effectiveness of our proposed feature selection method, the final subset is passed to an LS-SVM classifier to build an IDS. Experimental results for validation are obtained using different subsets of the KDD Cup 99 data, which are commonly used in the literature.

This paper is organized as follows: Section II outlines the work related to this study. Section III describes the concept of mutual information and its estimation. Section IV describes the principle of the improved forward floating selection algorithm. Section V introduces our proposed hybrid feature selection algorithm. Section VI details our detection framework, showing the different detection stages. Section VII presents the experimental details and results. Finally, Section VIII concludes the paper by summarizing the work and outlining future work.
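To make the two-stage idea from this introduction concrete, the following is a minimal Python sketch of a filter-then-wrapper selection loop. It is an illustration under stated assumptions, not the algorithm proposed in this paper: features are assumed to be discrete so that MI can be estimated from simple counts, a nearest-centroid classifier stands in for LS-SVM, and a plain greedy forward search stands in for IFFS. All function names (`mutual_information`, `rank_by_mi`, `centroid_accuracy`, `hybrid_select`) and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum of p(x, y) * log(p(x, y) / (p(x) * p(y))) over observed pairs.
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def rank_by_mi(X, y):
    """Filter stage: feature indices sorted by decreasing MI with the class."""
    scores = [
        mutual_information([row[j] for row in X], y) for j in range(len(X[0]))
    ]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid classifier.

    Assumption: this simple classifier is only a stand-in for LS-SVM.
    """
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, j in enumerate(features):
            acc[k] += row[j]
    centroids = {c: [s / counts[c] for s in v] for c, v in sums.items()}

    def predict(row):
        return min(
            centroids,
            key=lambda c: sum(
                (row[j] - m) ** 2 for j, m in zip(features, centroids[c])
            ),
        )

    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def hybrid_select(X, y, top_k):
    """Wrapper stage: greedy forward search over the top_k MI-ranked features.

    Assumption: a plain forward search, not the floating search of IFFS.
    """
    candidates = rank_by_mi(X, y)[:top_k]
    selected, best = [], 0.0
    for j in candidates:
        score = centroid_accuracy(X, y, selected + [j])
        if score > best:  # keep a feature only if accuracy strictly improves
            selected, best = selected + [j], score
    return selected, best


# Toy data: feature 0 equals the class label, feature 1 is constant,
# feature 2 is independent of the class.
X = [[0, 1, 0], [0, 1, 1], [1, 1, 0], [1, 1, 1]]
y = [0, 0, 1, 1]
print(rank_by_mi(X, y))
print(hybrid_select(X, y, top_k=2))
```

On the toy data, the filter stage ranks feature 0 first because it alone shares information with the class, and the wrapper stage keeps only that feature because adding the constant feature does not improve accuracy. A real evaluation would measure accuracy on held-out data rather than on the training set.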