A chemometrics toolbox based on projections and latent variables
Lennart Eriksson^a, Johan Trygg^b* and Svante Wold^b
A personal view is given of the gradual development of projection methods (also called bilinear, latent variable,
and more) and their use in chemometrics. We start with principal components analysis (PCA), the basis for
more elaborate methods for more complex problems, such as soft independent modeling of class analogy (SIMCA), partial
least squares (PLS), hierarchical PCA and PLS, PLS-discriminant analysis, orthogonal projection to latent structures
(OPLS), OPLS-discriminant analysis, and more.
From its start around 1970, this development was strongly influenced by Bruce Kowalski and his group in Seattle,
and by his realization that the multidimensional data profiles emerging from spectrometers, chromatographs, and
other electronic instruments contained interesting information that was not recognized by the then-current
one-variable-at-a-time approaches to chemical data analysis.
This led to the adoption of what in statistics is called the data analytical approach, also often called the data-driven
approach, soft modeling, and more. This approach, combined with PCA and later PLS, turned out to work very well in
the analysis of chemical data. This is because of the close correspondence between, on the one hand, the matrix
decomposition at the heart of PCA and PLS and, on the other hand, the analogy concept on which so much of chemical
theory and experimentation is based. This extends to the numerical and conceptual stability and good approximation
properties of these models.
The development is informally summarized and illustrated with a few examples and anecdotes.
Copyright © 2014 John Wiley & Sons, Ltd.
Keywords: chemometrics; latent variables; projection methods; PLS; OPLS
MY (SVANTE WOLD) MEMORIES OF BRUCE
In 1972–1973, I read and reread and reread the two articles by
Kowalski and Bender in the Journal of the American Chemical
Society on “pattern recognition in chemical data” [1,2]. They
were a revelation. Suddenly, I was not alone in my feelings about
the state of chemistry and the immense unexplored possibilities
provided by multidimensional chemical instrumentation,
statistics/applied math, and computers. And there was someone out
in the west of the USA who expressed this in a much clearer
and more convincing way than I would ever manage. I was thrilled. But
how could I get in contact with this prophet in Seattle? He must
be an old wise professor with a long white beard who would not
even look at a letter from a young postdoc at an unimportant
university in northern Sweden.
By a remarkable coincidence, I had been invited to spend the
academic year 1973–1974 with George Box and Bill Hunter at the
Statistics Department of the University of Wisconsin in Madison,
Wisconsin. Maybe from there, I could approach the great
Kowalski and, perhaps if my fortune was good, even meet him
at a conference somewhere in the USA. Maybe.
And luck was with me. In August 1973, I arrived in Madison
with my young family plus a Swedish nanny and was helped to
rent a nice house for a year. However, I was increasingly nervous
about my inability to think of a good way to approach Kowalski.
And then George Box asked me in mid-October if I could possibly
go to Tucson, Arizona, for an Office of Naval Research
(ONR) symposium on chemistry and computers. ONR had
collected the 20 or so chemistry professors who had applied to
ONR for a grant to buy a good computer for their lab. And
ONR needed a statistical expert to evaluate the professors’ projects
with respect to feasibility and novelty. George Box did not
have time to go and proposed that ONR ask a young chemist
from Sweden who knew computers and who happened to be
visiting the Statistics Department that year. And ONR said “fine.”
Next Monday morning, I entered a meeting room at the University
of Arizona in Tucson, unfortunately dressed in jeans and a
dirty black shirt because my luggage had been lost on the way.
Rudy Marcus from ONR came up to me with a slightly worried
look on his face, which disappeared when I apologized on behalf
of the airline and introduced myself as the man from Wisconsin.
He brought me to a line of 20 (or so) gentlemen in dark suits and
discreet ties, and turning to them, he said “here is our statistical
expert from Wisconsin,” and the whole line bowed. I then was
* Correspondence to: Johan Trygg, Computational Life Science Cluster, Department
of Chemistry, Umeå University, SE-901 87 Umeå, Sweden.
E-mail: johan.trygg@chem.umu.se
In memory of Bruce Kowalski, a pioneer and friend.
a L. Eriksson
MKS Umetrics AB, Umeå, Sweden
b J. Trygg, S. Wold
Computational Life Science Cluster, Department of Chemistry, Umeå University,
SE-901 87, Umeå, Sweden
Special Issue Article
Received: 2 September 2013, Revised: 14 November 2013, Accepted: 27 November 2013, Published online in Wiley Online Library
(wileyonlinelibrary.com) DOI: 10.1002/cem.2581
J. Chemometrics (2014) Copyright © 2014 John Wiley & Sons, Ltd.