A Fast Two-Stage Classification Method of
Support Vector Machines
Jin Chen, Cheng Wang, Member, IEEE, and Runsheng Wang
ATR Laboratory, School of Electronic Science and Engineering
National University of Defense Technology
47 Yanwachi, Changsha 410073, China
chenjin_wonder@hotmail.com
Abstract-Classification of high-dimensional data generally
requires enormous processing time. In this paper, we present a
fast two-stage method of support vector machines, which includes
a feature reduction algorithm and a fast multiclass method. First,
principal component analysis is applied to the data for feature
reduction and decorrelation, and then a feature selection method
is used to further reduce feature dimensionality. The criterion
based on Bhattacharyya distance is revised to eliminate the
influence of binary problems with very large distances. Moreover, a simple
method is proposed to reduce the processing time of multiclass
problems, where one binary SVM with the fewest support vectors
(SVs) will be selected iteratively to exclude the less similar class
until the final result is obtained. Experiments on the
hyperspectral data set 92AV3C demonstrate that the proposed
method achieves much faster classification while preserving
the high classification accuracy of SVMs.
I. INTRODUCTION
Pattern classification is important due to emerging
applications such as hyperspectral classification, protein
classification, speech recognition, and so on. Compared to
traditional classification approaches, support vector machines
(SVMs) have been found to be particularly promising because
of their lower sensitivity to the curse of dimensionality [1]. The
high generalization ability of SVMs is ensured by special
properties of the optimal hyperplane that maximizes the
distance to training examples in a high dimensional feature
space [2]. Another important property is their good
generalization capability supported by their sparse
representation of the decision function.
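The sparse decision function mentioned above also explains why test time matters: evaluating it costs one kernel computation per support vector. A minimal sketch (toy data and all names are illustrative, not from the paper) of an RBF-kernel decision function:

```python
import math

def rbf_kernel(u, v, gamma=0.5):
    """RBF kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    d2 = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * d2)

def svm_decision(x, support_vectors, coeffs, bias):
    """Evaluate f(x) = sum_i alpha_i * y_i * K(s_i, x) + b.
    Test-time cost grows linearly with the number of support vectors."""
    return sum(c * rbf_kernel(s, x)
               for s, c in zip(support_vectors, coeffs)) + bias

# toy model: two support vectors with signed coefficients alpha_i * y_i
svs = [(0.0, 0.0), (2.0, 2.0)]
coeffs = [1.0, -1.0]
label = 1 if svm_decision((0.2, 0.1), svs, coeffs, bias=0.0) >= 0 else -1
```

Because every test sample pays this per-SV cost, both high feature dimensionality (longer kernel evaluations) and many binary SVMs (more decision functions) slow down the test phase.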
However, in many applications, data are represented by high
dimensional feature vectors and a large number of classes.
Both situations increase the computational complexity of test
phase of SVMs. As a result, for such classification problems,
SVMs may not be comparable to traditional classifiers, such as
the maximum likelihood classification (MLC) method, in terms
of test time. In the literature,
dimensionality reduction is motivated mainly by the
consideration of classification speed [3].
Dimensionality reduction mainly consists of feature selection
and feature extraction approaches. Feature selection methods
can be further classified into two categories: filter and wrapper
methods [4]. The filter method employs intrinsic properties of
data such as Mahalanobis class separability measure as the
criterion, while the wrapper method evaluates feature subsets
based on the performance of the classifier such as classification
error rate. Feature extraction methods mainly include
principal component analysis (PCA), independent component
analysis (ICA), and kernel principal component analysis
(KPCA); a comparison of these methods for dimensionality
reduction in SVMs can be found in [5].
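A filter method of the kind described above scores each feature by a separability criterion and keeps the top-ranked ones. Since the paper's criterion builds on Bhattacharyya distance, here is a minimal per-feature sketch for a two-class problem, assuming Gaussian class statistics (function and variable names are illustrative, not the paper's):

```python
import math
from statistics import mean, pvariance

def bhattacharyya_1d(a, b):
    """Bhattacharyya distance between two 1-D Gaussians fit to samples a, b:
    B = (m1-m2)^2 / (4*(v1+v2)) + 0.5*ln((v1+v2) / (2*sqrt(v1*v2)))."""
    m1, m2 = mean(a), mean(b)
    v1, v2 = pvariance(a), pvariance(b)
    return (0.25 * (m1 - m2) ** 2 / (v1 + v2)
            + 0.5 * math.log((v1 + v2) / (2 * math.sqrt(v1 * v2))))

def rank_features(class_a, class_b, k):
    """Score each feature independently and keep the k most separable ones."""
    scores = []
    for j in range(len(class_a[0])):
        col_a = [row[j] for row in class_a]
        col_b = [row[j] for row in class_b]
        scores.append((bhattacharyya_1d(col_a, col_b), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]

# feature 0 separates the two classes; feature 1 is pure noise
A = [(0.0, 5.0), (0.2, 4.8), (-0.1, 5.1)]
B = [(3.0, 5.0), (3.1, 4.9), (2.9, 5.2)]
selected = rank_features(A, B, 1)  # feature 0 ranks first
```

This is the filter style: the score uses only class statistics, never the classifier itself, which is what makes it cheap compared with wrapper methods.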
SVMs were originally designed for binary classification.
One-against-all (OAA) [6] and one-against-one (OAO) [7] [8]
are the two most common methods to address the multiclass
classification problem. The discrimination of OAA between an
information class and all others often leads to the estimation of
complex discriminant functions [9]. OAO needs C(C-1)/2
binary SVMs for one classification, which may result in slow
classification.
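The OAO decision rule just described can be sketched as majority voting over all C(C-1)/2 pairwise classifiers (the toy threshold classifiers below are illustrative stand-ins for trained binary SVMs):

```python
from itertools import combinations

def oao_predict(x, classifiers, n_classes):
    """One-against-one: evaluate all C(C-1)/2 binary classifiers and
    return the class with the most votes. classifiers[(i, j)](x)
    returns either i or j."""
    votes = [0] * n_classes
    for i, j in combinations(range(n_classes), 2):
        votes[classifiers[(i, j)](x)] += 1
    return max(range(n_classes), key=lambda c: votes[c])

# toy 1-D threshold classifiers for a three-class problem
clfs = {
    (0, 1): lambda x: 0 if x < 1.5 else 1,
    (0, 2): lambda x: 0 if x < 2.5 else 2,
    (1, 2): lambda x: 1 if x < 2.5 else 2,
}
predicted = oao_predict(2.0, clfs, 3)  # class 1 collects the most votes
```

For C = 16 land-cover classes, this rule evaluates 120 binary SVMs per pixel, which is the cost the faster schemes below try to avoid.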
To obtain a faster classification, direct acyclic graph SVM
(DAGSVM) [10] and binary tree of SVM (BTS) [11] were
proposed to reduce the number of binary SVMs of OAO.
DAGSVM only needs C-1 binary SVMs, and BTS needs
log_{4/3}((C+3)/4) binary SVMs on average for one
classification. There are also other multiclass SVM methods,
which try to achieve higher classification accuracy, such as
pairwise decision tree of SVM (PDTSVM) [12] and error
correcting output codes (ECOC) methods [13]-[15]. PDTSVM
selects binary SVMs with larger geometric margin and reduces
the layers to decrease the accumulated errors, while ECOC
methods use the error correcting coding theory to improve the
decision accuracy.
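The DAGSVM evaluation path mentioned above can be sketched as successive elimination: each binary test removes one class from a candidate list, so only C-1 tests are needed (toy classifiers below are illustrative, not a full DAGSVM implementation):

```python
def dag_predict(x, classifiers, n_classes):
    """DAGSVM-style inference: keep a list of candidate classes and, at each
    step, test the first candidate against the last and drop the loser.
    Only C-1 of the C(C-1)/2 trained binary SVMs are evaluated per sample."""
    candidates = list(range(n_classes))
    evaluations = 0
    while len(candidates) > 1:
        i, j = candidates[0], candidates[-1]
        if classifiers[(i, j)](x) == i:
            candidates.pop()       # class j loses and is removed
        else:
            candidates.pop(0)      # class i loses and is removed
        evaluations += 1
    return candidates[0], evaluations

# toy 1-D threshold classifiers for three classes (keys use i < j)
clfs = {
    (0, 1): lambda x: 0 if x < 1.5 else 1,
    (0, 2): lambda x: 0 if x < 2.5 else 2,
    (1, 2): lambda x: 1 if x < 2.5 else 2,
}
result = dag_predict(2.0, clfs, 3)  # class 1, after only 2 evaluations
```

All C(C-1)/2 SVMs must still be trained and stored; the saving is purely at test time, which is the setting this paper targets.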
Besides, reduced set methods [16], which try to approximate
the original solution by a much smaller number of newly
constructed support vectors (SVs), were also proposed to
obtain a fast classification of SVMs.
In this paper, we propose a fast two-stage method for
classification with SVMs, depicted in Fig. 1. First, the data
are decorrelated with principal component analysis (PCA) and
then reduced by a feature selection algorithm. For the feature
selection, we revise the criterion based on Bhattacharyya
distance to eliminate the influence of binary problems with
very large distances.
To further reduce the computational complexity, a simple
method called fast OAO (FOAO) is proposed to combine C-1
binary SVMs with the fewest support vectors. Experiments on
an Airborne Visible/Infrared Imaging Spectrometer (AVIRIS)
data set demonstrate that the proposed method can be much
faster than other multiclass SVM methods while preserving
the high classification accuracy of SVMs.
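As we read it, the FOAO idea combines cheap binary SVMs with successive class elimination. The following is only a minimal sketch of that idea, not the authors' exact algorithm: the SV counts, classifier dictionary, and all names are hypothetical stand-ins.

```python
def foao_predict(x, classifiers, sv_counts, n_classes):
    """Sketch of the fast-OAO idea: among the classes still in play,
    evaluate the binary SVM with the fewest support vectors (the
    cheapest one) and eliminate its losing class, repeating until a
    single class remains. This uses C-1 binary SVMs per sample."""
    candidates = set(range(n_classes))
    while len(candidates) > 1:
        pairs = [(a, b) for a in candidates for b in candidates if a < b]
        i, j = min(pairs, key=lambda p: sv_counts[p])  # cheapest binary SVM
        winner = classifiers[(i, j)](x)
        candidates.discard(j if winner == i else i)    # drop the loser
    return candidates.pop()

# toy 1-D threshold classifiers and hypothetical SV counts per class pair
clfs = {
    (0, 1): lambda x: 0 if x < 1.5 else 1,
    (0, 2): lambda x: 0 if x < 2.5 else 2,
    (1, 2): lambda x: 1 if x < 2.5 else 2,
}
sv_counts = {(0, 1): 10, (0, 2): 4, (1, 2): 7}
predicted = foao_predict(2.0, clfs, sv_counts, 3)
```

Like DAGSVM, this evaluates only C-1 binary SVMs, but it greedily prefers the classifiers with the fewest SVs, so each of those evaluations is also individually cheaper.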
978-1-4244-2184-8/08/$25.00 © 2008 IEEE.
Proceedings of the 2008 IEEE
International Conference on Information and Automation
June 20 -23, 2008, Zhangjiajie, China