Selection of Candidate Support Vectors in incremental SVM for network intrusion detection*
Roshan Chitrakar*, Chuanhe Huang
School of Computer, Wuhan University, Wuhan, Hubei, China
article info
Article history:
Received 24 September 2013
Received in revised form 25 April 2014
Accepted 10 June 2014
Available online 19 June 2014
Keywords:
Incremental support vector machine
Karush–Kuhn–Tucker condition
Candidate Support Vector
Half-partition strategy
Network intrusion detection
abstract
In Incremental Support Vector Machine (ISVM) classification, the data objects labelled as
non-support vectors by the previous classification are re-used as training data in the next
classification, along with new data samples verified by the Karush–Kuhn–Tucker (KKT)
condition. This paper proposes a Half-partition strategy for selecting and retaining those
non-support vectors of the current increment of classification, named Candidate Support
Vectors (CSV), which are likely to become support vectors in the next increment of
classification. It also designs the Candidate Support Vector based Incremental SVM
(CSV-ISVM) algorithm, which implements the proposed strategy and realizes the whole process
of incremental SVM classification, and it suggests modifications to the previously proposed
concentric-ring method and reserved set strategy. The performance of the proposed method is
evaluated experimentally and by comparison with other ISVM techniques. Experimental results
and performance analyses show that the proposed CSV-ISVM algorithm performs better than
general ISVM classification for real-time network intrusion detection.
© 2014 Elsevier Ltd. All rights reserved.
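As an illustration of the KKT-based verification of new samples mentioned in the abstract, the following is a minimal sketch, not the CSV-ISVM algorithm itself. It assumes a linear decision function f(x) = w·x + b from the previous increment and uses the standard rule that a new sample (with an implicit multiplier of zero) violates the KKT condition when y·f(x) < 1. The function names and the toy data are hypothetical.

```python
# Minimal sketch (assumption: linear SVM with weights w and bias b taken
# from the previous increment; names and data are illustrative only).

def decision_value(w, b, x):
    """Linear decision function f(x) = w . x + b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def violates_kkt(w, b, x, y):
    """A new sample has an implicit multiplier alpha = 0, so the KKT
    condition requires y * f(x) >= 1; anything below 1 is a violation."""
    return y * decision_value(w, b, x) < 1.0

def select_violators(w, b, samples):
    """Keep only the new (x, y) pairs that violate the KKT condition."""
    return [(x, y) for x, y in samples if violates_kkt(w, b, x, y)]

# Toy previous classifier and a batch of new labelled samples.
w, b = [1.0, 0.0], 0.0
new_samples = [
    ([2.0, 0.0], 1),    # y*f(x) = 2.0, satisfies KKT
    ([0.5, 1.0], 1),    # y*f(x) = 0.5, violates KKT
    ([-3.0, 2.0], -1),  # y*f(x) = 3.0, satisfies KKT
    ([-0.2, 0.0], -1),  # y*f(x) = 0.2, violates KKT
]
selected = select_violators(w, b, new_samples)
```

Under these assumptions, only the samples that fall inside the margin or on the wrong side of the hyperplane would be carried into the next training round; samples that already satisfy the condition carry no new information for the current classifier.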
1. Introduction
Network intrusion detection can be treated as a pattern
recognition problem: classifying network traffic patterns
into two classes, normal and abnormal, according to the
similarity between them. In the field of intrusion detection,
the Support Vector Machine (SVM) is becoming a popular
classification tool based on statistical machine learning
(Mohammad et al., 2011). There are two issues in machine
learning: training on large-scale data sets and the
availability of a complete data set (Le and Nguyen, 2011;
Du et al., 2009a,b). First, if the training data set is very
large, the computer's memory will not be sufficient and the
training time will be too long. Second, when data packets are
captured from a network stream, the complete network
information cannot be obtained at the outset, and hence
continuous online learning is required to maintain high
learning precision as the number of samples increases. The
challenge of incremental learning is to decide what and how
much information from the previous learning should be
selected for training in the
* This work is supported by the National Science Foundation of China (No. 61373040, No. 61173137), the Ph.D. Programs
Foundation of Ministry of Education of China (20120141110073), and the Key Project of Natural Science Foundation of Hubei
Province (No. 2010CDA004).
* Corresponding author.
E-mail addresses: roshanchi@gmail.com, roshanchi@whu.edu.cn (R. Chitrakar), huangch@whu.edu.cn (C. Huang).
Available online at www.sciencedirect.com
ScienceDirect
journal homepage: www.elsevier.com/locate/cose
computers & security 45 (2014) 231–241
http://dx.doi.org/10.1016/j.cose.2014.06.006
0167-4048/© 2014 Elsevier Ltd. All rights reserved.