Classifying Network Attack Types with Machine Learning Approach

Naruemon Wattanapongsakorn 1, Phurivit Sangkatsanee 1, Sanan Srakaew 1, Chalermpol Charnsripinyo 2
1 Department of Computer Engineering, King Mongkut’s University of Technology Thonburi, Bangkok, Thailand
2 National Electronics and Computer Technology Center, Pratumthani, Thailand

Abstract- Network attacks by hackers, crackers, and criminal enterprises are increasing, and they affect the availability, confidentiality, and integrity of critical information. In this paper, we propose a network-based Intrusion Detection and Classification System (IDCS) that uses well-known machine learning techniques to classify online network data preprocessed to have only 12 features. The number of features affects detection speed and resource consumption. Unlike other intrusion detection approaches, which classify only a few attack types, our IDCS can classify normal network activities and identify 17 different attack types. Hence, our detection and classification approach can greatly reduce the time needed to diagnose and prevent network attacks.

I. INTRODUCTION

Traditional intrusion prevention approaches such as firewalls or encryption are not sufficient to protect a network from all attack types. A network intrusion detection system is additionally used as a protective tool that inspects traffic for hostile activity and alerts the computer user or network administrator to hazardous network activity in an open session.

Earlier research papers have proposed intrusion detection systems based on different techniques and various classification algorithms, such as Adaptive Resonance Theory (ART), Self-Organizing Map (SOM), Back-Propagation (Back-Prop) Neural Network, statistical probability distribution, BLINd classification, and Bayesian classification [1-8]. Most of them used the KDD99 dataset to evaluate their IDS performance.
Note that KDD99 is an offline dataset, about 10 years old, consisting of 41 features. In [6], K. Labib and R. Vemuri used a Self-Organizing Map to distinguish normal data from DoS attacks, using 10 features computed over every 50 packets; they evaluated the results by visualizing the differing characteristics of normal and DoS traffic. In [7], Ricardo S. Puttini, et al., used a Bayesian classification model to distinguish normal from attack traffic, with a three-month training dataset and a one-month testing dataset, evaluated by detection penalty. In [8], M. Amini, et al., used Adaptive Resonance Theory (ART) and Self-Organizing Map (SOM), with about 5000 packets for training and 3000 packets for testing. The sampled data were obtained during a 4-day experiment, in which 27 features were determined by frequency of occurrence in a specified interval. The detection rates (covering both attack and normal data) of the ART and SOM methods were approximately 97% and 95%, respectively.

In this paper, we propose a network-based intrusion detection and classification system (IDCS) that uses a machine learning approach to classify online (real-time) network data. We consider only 12 features of network traffic data, which are effective for detecting and classifying 17 types of Probing and Denial of Service attacks, as well as normal network activity. Various well-known machine learning techniques can be used in our detection and classification approach. The IDCS can greatly reduce the time network administrators and users need to analyze network data and protect the network from illicit attacks.

The rest of this paper is organized as follows. Section II presents our research methodology, including the machine learning techniques and our proposed intrusion detection and classification model. Section III presents experimental results and analysis. Lastly, Section IV concludes this research work.

II. RESEARCH METHODOLOGY

A. Machine Learning Techniques

Many machine learning techniques are available for data classification.
In this research, we consider several well-known techniques: Decision Tree [10, 13], Ripple Rule, Back-Propagation neural network, and Bayesian network. Each of them is a supervised learning technique, so it has to be trained with a known input dataset (the training dataset) before it can classify or detect new or unknown data. Nevertheless, once trained, these machine learning techniques can give high detection accuracy for our IDCS approach, as will be shown in the experimental section.

B. Intrusion Detection and Classification Model

Our IDCS model, shown in Figure 1, consists mainly of a preprocessing part and a classifying part. First, the IDCS receives online network data packets, which enter the preprocessing part, where the packet header and other detailed data are examined. The detailed packet data are then converted into numerical features, each being a frequency of occurrence in a specified time interval, which is 2 seconds in our experiment. The essential features that represent the network activity are extracted from this data. The preprocessed data, with its key signature extracted, is then ready to enter the classifying part, where the IDCS classifies the data as normal network activity or as one of the attack types.
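
The preprocessing step described above, which turns packets into frequency-of-occurrence counts per time interval, can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the `Packet` record, the four feature names, and the sample packets are hypothetical, and the paper's actual 12 features are not listed in this section. Only the 2-second interval comes from the text.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, NamedTuple

WINDOW_SECONDS = 2.0  # interval length stated in the text


class Packet(NamedTuple):
    """Hypothetical, simplified packet-header record (an assumption)."""
    timestamp: float  # seconds since the start of capture
    protocol: str     # e.g. "tcp", "udp", "icmp"
    syn: bool         # True if the TCP SYN flag is set


# Illustrative feature names only; they stand in for the paper's 12 features.
FEATURES = ("tcp_count", "udp_count", "icmp_count", "syn_count")


def window_features(packets: Iterable[Packet]) -> Dict[int, List[int]]:
    """Map each 2-second window index to a vector of occurrence counts."""
    counts: Dict[int, Counter] = defaultdict(Counter)
    for p in packets:
        window = int(p.timestamp // WINDOW_SECONDS)
        counts[window][f"{p.protocol}_count"] += 1
        if p.syn:
            counts[window]["syn_count"] += 1
    # A Counter returns 0 for keys it has not seen, so every vector is complete.
    return {w: [c[name] for name in FEATURES] for w, c in sorted(counts.items())}


# Made-up sample traffic spanning two windows.
packets = [
    Packet(0.1, "tcp", True),
    Packet(0.5, "tcp", True),
    Packet(1.9, "udp", False),
    Packet(2.3, "icmp", False),
    Packet(3.7, "tcp", False),
]
features = window_features(packets)
```

Each resulting vector is one record for the classifying part. A supervised classifier such as a decision tree would be trained on labeled vectors of this form before being applied to new windows.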