A novel feature selection approach for intrusion detection data classification

Mohammed A. Ambusaidi, Xiangjian He*, Zhiyuan Tan, Priyadarsi Nanda, Liang Fu Lu and Upasana T. Nagar
Center for Innovation in IT Services and Applications (iNEXT)
School of Computing and Communications, Faculty of Engineering and IT, University of Technology, Sydney, Australia
{Mohammed.A.AmbuSaidi, Upasana.T.Nagar}@student.uts.edu.au, {Xiangjian.He, Zhiyuan.Tan, Priyadarsi.Nanda}@uts.edu.au

Abstract—Intrusion Detection Systems (IDSs) play a significant role in monitoring and analyzing daily activities occurring in computer systems to detect occurrences of security threats. However, the analytical data routinely produced by computer networks are usually very large. This creates a major challenge for IDSs, which need to examine all features in the data to identify intrusive patterns. The objective of this study is to analyze and select the most discriminative input features for building computationally efficient and effective schemes for an IDS. To this end, a hybrid feature selection algorithm that combines filter and wrapper selection processes is designed in this paper. The algorithm involves two main phases. The upper phase conducts a preliminary search for an optimal subset of features, in which the mutual information between the input features and the output class serves as the determining criterion. The set of features selected in the upper phase is further refined in the lower phase in a wrapper manner, in which the Least Square Support Vector Machine (LS-SVM) is used to guide the selection process and retain an optimized set of features. The efficiency and effectiveness of our approach are demonstrated by building an IDS and conducting a fair comparison with other state-of-the-art detection approaches. The experimental results show that our hybrid model achieves promising detection performance compared with previously reported results.

Keywords: Intrusion detection, Feature selection, Mutual information, Least square support vector machines, Floating search.

I. INTRODUCTION

Intrusion detection is the art of discovering and detecting network traffic patterns that deviate from normal network traffic. Today, intrusion detection is considered one of the highest-priority and most challenging tasks for network security administrators. Attackers have developed more sophisticated infiltration techniques to challenge and defeat security tools [1]. Thus, there is a need for an efficient and reliable IDS to safeguard computer networks from known as well as unknown vulnerabilities. The primary purpose of such a system is to detect attacks accurately with minimum false alarms. To fulfill this purpose, however, an IDS should be able to handle huge amounts of network data and be fast enough to make real-time decisions.

In general, an IDS deals with a large volume of data consisting of a variety of traffic patterns. Each pattern in a dataset is characterized by a set of features (or attributes) and represents a point in a multi-dimensional feature space. A pattern might contain irrelevant and redundant features that slow down the training and testing processes, or even degrade classification performance by adding mathematical complexity. In practice, it is therefore worthwhile to keep the number of features as small as possible in order to reduce the computational cost and the complexity of building a classifier. In addition, eliminating unimportant features facilitates data visualization, improves modelling and prediction performance, and speeds up the classification process. Thus, dimensionality reduction techniques, such as feature extraction and feature selection, have been successfully applied in machine learning and data mining to solve this problem.
Feature extraction techniques attempt to transform the input features into a new feature set, while Feature Selection (FS) algorithms search for the most informative features in the original input data [2].

In this paper, we focus on feature selection and propose a scheme that ranks features based on the principle of Mutual Information (MI). The best set of candidate features is chosen, in a wrapper manner, from the top of the ranking list by searching for the subset that produces the highest classification accuracy. The proposed approach combines two main stages: (1) filter feature ranking; and (2) wrapper-based Improved Forward Floating Selection (IFFS) guided by the classification accuracy of an LS-SVM. The filter method aims to reduce the computational cost of the wrapper search by eliminating irrelevant and redundant features from the initial feature set. The wrapper method based on IFFS is used to search for a proper subset that improves the classification accuracy. The aim is to achieve both the high accuracy of wrapper approaches and the efficiency of filter approaches. Finally, in order to examine the effectiveness of our proposed feature selection method, the final subset is passed to an LS-SVM classifier to build an IDS. The experimental results presented for validation were obtained using different subsets of the KDD Cup 99 data, which are commonly used in the literature.

This paper is organized as follows. Section II outlines work related to this study. Section III describes the concept of mutual information and its estimation. Section IV describes the principle of the improved forward floating selection algorithm. Section V introduces our proposed hybrid feature selection algorithm. Section VI details our detection framework, showing the different detection stages. Section VII presents the experimental details and results. Finally, Section VIII concludes this paper by summarizing the work and outlining future work.
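
As an illustration only, the two-stage idea described in this introduction (MI-based filter ranking followed by a wrapper search over the top-ranked features) might be sketched as follows. The sketch rests on simplifying assumptions that are not part of the proposed method: features are assumed to be already discrete, a nearest-centroid classifier stands in for the LS-SVM, plain greedy forward selection replaces IFFS, and accuracy is measured on the training data. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def rank_by_mi(X, y):
    """Filter stage: feature indices sorted by decreasing MI with the class."""
    n_feat = len(X[0])
    scores = [mutual_information([row[j] for row in X], y)
              for j in range(n_feat)]
    return sorted(range(n_feat), key=lambda j: -scores[j])


def centroid_accuracy(X, y, feats):
    """Stand-in for the LS-SVM (assumption): training accuracy of a
    nearest-centroid classifier restricted to the features in `feats`."""
    classes = sorted(set(y))
    centroids = {}
    for c in classes:
        rows = [[row[j] for j in feats]
                for row, label in zip(X, y) if label == c]
        centroids[c] = [sum(col) / len(rows) for col in zip(*rows)]
    correct = 0
    for row, label in zip(X, y):
        point = [row[j] for j in feats]
        pred = min(classes, key=lambda c: sum(
            (p - m) ** 2 for p, m in zip(point, centroids[c])))
        correct += pred == label
    return correct / len(y)


def forward_wrapper(X, y, ranked, top_k):
    """Wrapper stage (simplified): greedy forward selection over the top_k
    ranked features; stops when no candidate improves the accuracy."""
    candidates = list(ranked[:top_k])
    selected, best = [], 0.0
    improved = True
    while improved and candidates:
        improved = False
        scored = [(centroid_accuracy(X, y, selected + [f]), f)
                  for f in candidates]
        acc, f = max(scored, key=lambda t: t[0])
        if acc > best:
            best, improved = acc, True
            selected.append(f)
            candidates.remove(f)
    return selected, best


# Toy data (hypothetical): feature 0 equals the class label,
# feature 1 is independent of it, feature 2 is constant.
y = [0, 0, 0, 0, 1, 1, 1, 1]
X = [[0, 1, 5], [0, 0, 5], [0, 1, 5], [0, 0, 5],
     [1, 1, 5], [1, 0, 5], [1, 1, 5], [1, 0, 5]]

ranking = rank_by_mi(X, y)
subset, accuracy = forward_wrapper(X, y, ranking, top_k=2)
print(ranking, subset, accuracy)
```

The filter stage evaluates each feature once, so restricting the wrapper to the `top_k` ranked candidates keeps the number of classifier evaluations small, which is the efficiency argument made above. A floating search such as IFFS would additionally reconsider features that were already selected, which this greedy sketch does not attempt.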