Analytica Chimica Acta 664 (2010) 27–33
Multi-class classification with probabilistic discriminant partial least squares
(p-DPLS)
Néstor F. Pérez, Joan Ferré∗, Ricard Boqué
Department of Analytical Chemistry and Organic Chemistry, Rovira i Virgili University, C/ Marcel·lí Domingo s/n, 43007 Tarragona, Spain
Article info
Article history:
Received 19 October 2009
Received in revised form 22 January 2010
Accepted 29 January 2010
Available online 6 February 2010
Keywords:
Reliability
Multi-class classification
Discriminant partial least squares
Probabilistic DPLS
Abstract
This work describes multi-class classification based on binary probabilistic discriminant partial least squares (p-DPLS) models, developed with the one-against-one strategy and the winner-takes-all principle. The multi-class problem is split into binary classification problems solved with p-DPLS models, and the results of these models are combined to obtain the final classification. The classification criterion uses the specific characteristics of an object (its position in the multivariate space and its prediction uncertainty) to estimate the reliability of the classification, so that the object is assigned to the class with the highest reliability. The new methodology is tested on the well-known Iris data set and on a data set of Italian olive oils. Compared with CART and SIMCA, the proposed method shows better average classification performance and, in addition, provides a statistic that evaluates the reliability of each classification. For the olive oil set, the average percentage of correct classification on the training set was close to 84% with p-DPLS, against 75% with CART and 100% with SIMCA, while on the test set the average was close to 94% with p-DPLS, against 50% with CART and 62% with SIMCA.
© 2010 Elsevier B.V. All rights reserved.
1. Introduction
In multi-class classification problems we have an I × J matrix X of J observed variables measured on I training objects, a vector y that codifies the class c (c = 1, ..., C; with C > 2) of each object, and a vector x of variables measured on the unknown object, which must be assigned to one (or none) of the C possible classes. Examples of multi-class classification problems are the assignment of food commodities to one of several possible origins [1,2] and the identification of different tumour types from microarray gene expression data [3,4].
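As a minimal sketch of this setup (synthetic numbers chosen for illustration; the sizes I, J and C below are assumptions, not data from this work):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions, not taken from the paper):
I, J, C = 30, 5, 3               # training objects, variables, classes (C > 2)

X = rng.normal(size=(I, J))      # I x J matrix of observed variables
y = rng.integers(1, C + 1, I)    # class label c in {1, ..., C} for each object
x = rng.normal(size=J)           # variables measured for the unknown object

print(X.shape, x.shape, int(y.min()), int(y.max()))
```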
A multi-class classification problem is solved by using an adequate classifier decision function that maps x onto a class label [5]. One approach is to use a single classification function, as in k-Nearest Neighbours (k-NN) [6] or Artificial Neural Networks (ANN) [7]. In these cases, the classification of an object into one of the C classes is done in one step. Another approach is to divide the multi-class problem into K smaller classification problems, each with its own decision rule, and then combine the outputs of the K individual classifications to obtain the final result. This can be done not only with single-class models, as in the Soft Independent Modelling of Class Analogy (SIMCA) method [1], but also with binary classification methods that decide between two classes or super-classes. The latter approach is known
as dichotomization or binarization [1] and has the advantage that a wide range of binary classification methods, such as Support Vector Machines [5] and discriminant partial least squares (DPLS) [8], can be used.

∗ Corresponding author. Tel.: +34 977 55 9564; fax: +34 977 55 8446. E-mail address: joan.ferre@urv.cat (J. Ferré).
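The one-against-one splitting with winner-takes-all voting that this work builds on can be sketched as follows. This is an illustrative Python sketch only: an ordinary least-squares fit on a 0/1-coded response stands in for a binary DPLS model, and the function names and the 0.5 decision threshold are assumptions of the sketch, not the paper's method.

```python
import numpy as np
from itertools import combinations

def fit_binary(X, y01):
    # Ordinary least squares on a 0/1-coded response: a crude stand-in
    # for a binary DPLS model (an assumption of this sketch).
    Xa = np.hstack([np.ones((X.shape[0], 1)), X])
    b, *_ = np.linalg.lstsq(Xa, y01, rcond=None)
    return b

def predict_binary(b, x):
    # Predicted 0/1 response for one object x (intercept prepended).
    return float(np.concatenate(([1.0], x)) @ b)

def one_against_one(X, y, x):
    # One binary model per pair of classes; each model casts a vote and
    # the object is assigned by winner-takes-all.
    classes = np.unique(y)
    votes = {c: 0 for c in classes}
    for ca, cb in combinations(classes, 2):
        mask = (y == ca) | (y == cb)
        b = fit_binary(X[mask], (y[mask] == cb).astype(float))
        votes[cb if predict_binary(b, x) > 0.5 else ca] += 1
    return max(votes, key=votes.get)

# Demo: three tight clusters on the diagonal; the unknown lies near class 3.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.1, size=(10, 2)) for m in (0.0, 1.0, 2.0)])
y = np.repeat([1, 2, 3], 10)
print(one_against_one(X, y, np.array([1.95, 2.05])))  # prints 3
```

For C classes this trains C(C − 1)/2 pairwise models (three in the demo); the votes are then combined into a single assignment, which is the combination step described above.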
There are three possible ways of splitting the classes for binary classifiers: one-against-all (where "all" means "the rest"), one-against-one and P-against-Q [9]. In all cases the original vector y is replaced by another one that codifies with a "1" the objects that belong to the class or classes of interest, and with a "0" the objects that do not. The P-against-Q (PAQ) strategy first splits the data into two groups, one with P classes and one with the remaining Q classes. At the next level, the classes in P are again split into two groups, and a binary model that discriminates between them is calculated. The same division is applied to the classes in Q. The splitting continues at successive levels until the models discriminate between only two classes. This procedure can solve a classification problem of C classes using C − 1 binary models [6] (a hierarchy of binary splits whose leaves are the C classes has exactly C − 1 internal nodes, one binary model per node). The drawback of hierarchical PAQ is that an allocation error at one node results in the object being misclassified. The one-against-all (OAA) strategy is a particular case of PAQ in which P contains only one class and Q contains the remaining C − 1 classes. In this case, the problem is solved either hierarchically (which involves C − 1 models) or by simultaneous combination of C binary models [5]. A weakness of OAA is that the number of objects of the class of interest can be imbalanced with respect to the other super-class, which contains the rest of the objects. Moreover, incompatible classes (which could be correctly discriminated if they were modelled one against the others) are grouped together, thus forcing the model to treat opposite classes as a single super-class. Finally, the strategy one-
doi:10.1016/j.aca.2010.01.059