Software Description
A MATLAB toolbox for class modeling using one-class partial least
squares (OCPLS) classifiers
Lu Xu a,⁎, Mohammad Goodarzi b, Wei Shi a, Chen-Bo Cai c,⁎, Jian-Hui Jiang d,⁎
a College of Material and Chemical Engineering, Tongren University, Tongren, 554300, Guizhou, PR China
b Department of Biosystems, Faculty of Bioscience Engineering, Katholieke Universiteit Leuven, Kasteelpark Arenberg 30, B-3001, Leuven, Belgium
c College of Chemistry and Life Science, Chuxiong Normal University, Chuxiong 675000, PR China
d State Key Laboratory of Chemo/BioSensing and Chemometrics, College of Chemistry and Chemical Engineering, Hunan University, Changsha 410082, PR China
Article info
Article history:
Received 25 June 2014
Received in revised form 12 September 2014
Accepted 16 September 2014
Available online 23 September 2014
Keywords:
MATLAB toolbox
Class modeling
One-class partial least squares (OCPLS) classifiers
Nonlinear and robust algorithms
Fault diagnosis
Abstract
One-class classifiers are widely used to solve classification problems in which control or class modeling of a
target class is necessary, e.g., untargeted analysis of food adulteration and fraud, tracing the origin of a food
with Protected Denomination of Origin, and fault diagnosis. Recently, one-class partial least squares (OCPLS)
has been developed and demonstrated to be a useful technique for class modeling. For the analysis of nonlinear
and outlier-contaminated data, nonlinear and robust OCPLS algorithms are required.
This paper describes a free MATLAB toolbox for class modeling using OCPLS classifiers. The toolbox includes
ordinary, nonlinear and robust OCPLS methods. The nonlinear algorithm is based on the Gaussian radial basis
function (GRBF), and the robust algorithm is based on the partial robust M-regression (PRM). The usage of the
toolbox is demonstrated by analysis of a real data set.
© 2014 Elsevier B.V. All rights reserved.
1. Introduction
Kowalski, one of the founders of chemometrics, and Bender [1] recognized
that “whenever something must be learned from objects (elements,
compounds, and mixtures) and a chemical/physical theory has not been
sufficiently developed, pattern recognition may provide a solution.”
This viewpoint has been borne out by various applications of pattern
recognition techniques to the understanding of complex objects in
chemistry [2–4]. Besides the commonly used multi-class classification or
discriminant analysis (DA) techniques, so-called one-class classifiers
[5–7] or class modeling techniques (CMTs) [8–16] have recently attracted
much attention, and the difference between DA and CMTs has been
discussed by several authors [7,13–15].
While DA aims at discriminating between two or more predefined classes
[17], CMTs are especially useful when it is necessary to define or model
the range of a target class. Typical problems that require CMTs include
untargeted detection of food adulteration or fraud, tracing the
geographical origins of protected denomination of origin (PDO) foods
[18,19], and fault diagnosis.
Commonly used CMTs include the following: (1) soft independent modeling
of class analogy (SIMCA) [8], which uses principal component analysis
(PCA); (2) unequal dispersed classes (UNEQ) [9,10], based on the
hypothesis of a multivariate normal distribution and Hotelling's T² test;
(3) potential function methods [20], which estimate the multivariate
probability distribution; and (4) methods based on artificial neural
networks (ANNs) and support vector machines (SVMs) [5,21,22]. The most
popular CMTs are SIMCA and PCA-related techniques [23–25], which are
especially useful in chemometrics because they extract a few primary and
informative components or latent variables (LVs).
Partial least squares (PLS), one of the cornerstones of chemometrics,
has been widely used to solve both regression and classification
problems. The rationale of PLS-DA has been demonstrated by the
relationship among PLS, canonical correlation analysis (CCA) and linear
discriminant analysis (LDA) [26]. Recently, one-class partial least
squares (OCPLS), or the PLS class model (PLSCM) [27], has been proposed
and demonstrated to be an effective tool for class modeling. Unlike
SIMCA, whose components explain most of the data variance, OCPLS
components simultaneously consider the explained variance and the
compactness of the target class. Moreover, OCPLS can be performed as a
special PLS regression and works in the framework of multivariate
calibration.
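To illustrate how a class model can be cast as a PLS regression, the following is a minimal sketch in plain Python. It is not the toolbox's MATLAB code. It assumes, as one common reading of the OCPLS formulation, that a constant response vector of ones is regressed on the uncentered data of the target class, so that samples of the class should give a predicted response near 1. The function names `fit_ocpls` and `predict_response` are hypothetical, and the decision rules that a full class model needs (for example, thresholds on score distances and residuals) are omitted.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def fit_ocpls(X, n_components):
    """NIPALS PLS1 regression of a vector of ones on uncentered X.

    Assumption for illustration: the response is 1 for every training
    sample and X is not mean-centered. Returns one (weights, loadings,
    regression coefficient) tuple per extracted component.
    """
    X = [list(row) for row in X]      # working copy, deflated each round
    y = [1.0] * len(X)                # constant response (assumed)
    n_vars = len(X[0])
    model = []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance with y.
        w = [dot([row[j] for row in X], y) for j in range(n_vars)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:              # nothing left to explain
            break
        w = [wj / norm for wj in w]
        t = [dot(row, w) for row in X]            # scores
        tt = dot(t, t)
        p = [dot([row[j] for row in X], t) / tt for j in range(n_vars)]
        q = dot(y, t) / tt
        # Deflate X and y by the extracted component.
        X = [[xij - ti * pj for xij, pj in zip(row, p)]
             for row, ti in zip(X, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        model.append((w, p, q))
    return model


def predict_response(model, x):
    """Predicted response for one new sample, using sequential deflation."""
    x = list(x)
    y_hat = 0.0
    for w, p, q in model:
        t = dot(x, w)
        y_hat += q * t
        x = [xj - t * pj for xj, pj in zip(x, p)]
    return y_hat


# Toy target class: every training sample satisfies x1 + x2 = 2.
train = [[0.5, 1.5], [1.0, 1.0], [1.5, 0.5], [0.8, 1.2]]
model = fit_ocpls(train, n_components=2)

in_class = predict_response(model, [1.2, 0.8])    # obeys x1 + x2 = 2
out_class = predict_response(model, [3.0, 3.0])   # violates it

print(abs(1.0 - in_class), abs(1.0 - out_class))
```

In this toy case the deviation of the predicted response from 1 is close to zero for the sample that follows the class pattern and large for the one that does not. With real data, fewer latent variables than measured variables would normally be used, and the number of components would be chosen by validation.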
Practical data analysis sometimes encounters nonlinear and outlier-
contaminated data sets, which can cause bias or even breakdown in the
estimation of OCPLS parameters. Therefore, it is necessary to develop
nonlinear and robust algorithms for OCPLS. This paper describes a free
MATLAB toolbox for OCPLS, including the ordinary linear, nonlinear
Gaussian radial basis function (RBF) OCPLS (GRBF-OCPLS) and robust
Chemometrics and Intelligent Laboratory Systems 139 (2014) 58–63
⁎ Corresponding authors. Tel.: +86 856 5222556; fax: +86 856 5230977.
E-mail addresses: lxchemo@163.com (L. Xu), ccp66516@163.com (C.-B. Cai),
jianhuijiang@hnu.edu.cn (J.-H. Jiang).
http://dx.doi.org/10.1016/j.chemolab.2014.09.005
0169-7439/© 2014 Elsevier B.V. All rights reserved.