Received: 19 April 2008; Revised: 12 June 2008; Accepted: 13 June 2008; Published online in Wiley InterScience: 18 July 2008
(www.interscience.wiley.com) DOI: 10.1002/cem.1177

Short Communication

PLS works

R. Bro (a)* and L. Eldén (b)

In a recent paper, claims were made that most current implementations of PLS provide wrong and misleading residuals [1]. In this paper the relation between PLS and Lanczos bidiagonalization is described, and it is shown that there is a good rationale behind current implementations of PLS. Most importantly, the residuals determined in current implementations of PLS are independent of the scores used for predicting the dependent variable(s). In the newly suggested approach, by contrast, the residuals are correlated with the scores and may therefore be large due to variation that is actually used for prediction. It is concluded that the current practice of calculating residuals should be maintained. Copyright © 2008 John Wiley & Sons, Ltd.

Keywords: PLS; NIPALS; Lanczos bidiagonalization

1. THEORY

The problem in PLS is to determine an approximate solution to the regression model

    \min_b \| y - Xb \|                                              (1)

Here y is the vector holding the dependent variable, X is the matrix holding the independent variables, and b is the vector of sought regression coefficients. The traditional PLS algorithms are based on using loading weights, W, as well as loadings, P. The components are calculated in a sequential manner. Given the preprocessed data, X_0, the first score vector is determined as

    t_1 = \frac{X_0 w_1}{\| X_0 w_1 \|}                              (2)

Normalization of the scores is used here for convenience. Subsequently, the X_0 data are deflated using

    X_1 = X_0 - t_1 p_1^T                                            (3)

The next component is determined from X_1, etc. The loading vector p_1 is determined in a least squares sense in terms of approximating the data X_0 given t_1:

    p_1 = X_0^T t_1                                                  (4)

As can be understood, a several-component PLS model does not work on the overall data but on deflated versions, which makes it somewhat awkward to specify the overall loss function that PLS optimizes [2]. The crucial part of a PLS model is the prediction of the dependent variable, but the model also includes an approximation of X which is useful for diagnostic and exploratory purposes as well as for outlier detection. This model is given as

    \hat{X} = T_k P_k^T                                              (5)

Residuals can hence be found as

    E = X_k = X - T_k P_k^T                                          (6)

where k is the number of components used in the model. The complete algorithm [3–5] can be more formally described as

PLS algorithm
1. X_0 = X
2. For i = 1, 2, ..., k
   (a) w_i = X_{i-1}^T y / \| X_{i-1}^T y \|
   (b) t_i = X_{i-1} w_i / \| X_{i-1} w_i \|
   (c) p_i = X_{i-1}^T t_i
   (d) X_i = X_{i-1} - t_i p_i^T

Introduce the notation

    K_i(A, g) = span{ g, Ag, ..., A^{i-1} g }                        (7)

for the linear subspace spanned by the vectors g, Ag, ..., A^{i-1} g.

Proposition 1.1. The vectors w_1, w_2, ..., w_i are orthonormal and span the Krylov subspace K_i(X^T X, X^T y). Similarly, the vectors t_1, t_2, ..., t_i are orthonormal and span the Krylov subspace K_i(X X^T, X X^T y).

Define

    W_k = [ w_1  w_2  ...  w_k ],   T_k = [ t_1  t_2  ...  t_k ],   P_k = [ p_1  p_2  ...  p_k ]

* Correspondence to: R. Bro, Department of Food Science, University of Copenhagen, Denmark. E-mail: rb@life.ku.dk
a Department of Food Science, University of Copenhagen, Denmark
b Department of Mathematics, Linköping University, Sweden

J. Chemometrics 2009; 23: 69–71
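
To make the algorithm in steps 1 and 2(a)-(d) concrete, the following is a minimal sketch in Python/NumPy. It follows the algorithm exactly as written above, with y used undeflated, and returns W_k, T_k, P_k together with the residual matrix E of Equation (6). The function name pls_nipals and all variable names are illustrative only and are not taken from the paper; X is assumed to be the already preprocessed X_0.

```python
import numpy as np


def pls_nipals(X, y, k):
    """Sketch of the PLS algorithm above: steps 2(a)-(d) for k components.

    Returns the loading weights W_k, normalized scores T_k, loadings P_k
    and the X-residuals E = X - T_k P_k^T of Equation (6).
    """
    Xi = X.copy()                  # step 1: X_0 = X (assumed preprocessed)
    W, T, P = [], [], []
    for _ in range(k):
        w = Xi.T @ y               # (a) w_i = X_{i-1}^T y / ||X_{i-1}^T y||
        w = w / np.linalg.norm(w)
        t = Xi @ w                 # (b) t_i = X_{i-1} w_i / ||X_{i-1} w_i||
        t = t / np.linalg.norm(t)
        p = Xi.T @ t               # (c) p_i = X_{i-1}^T t_i
        Xi = Xi - np.outer(t, p)   # (d) X_i = X_{i-1} - t_i p_i^T
        W.append(w)
        T.append(t)
        P.append(p)
    W = np.column_stack(W)
    T = np.column_stack(T)
    P = np.column_stack(P)
    E = X - T @ P.T                # Equation (6): E = X_k = X - T_k P_k^T
    return W, T, P, E
```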
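
Continuing from the sketch above, the claims of Proposition 1.1 and the statement in the abstract that the residuals are independent of the scores can be verified numerically. The synthetic data, the choice of mean-centring as preprocessing, and the use of three components are illustrative assumptions only.

```python
import numpy as np

# Synthetic data; pls_nipals is the sketch defined above
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 6))
X = X - X.mean(axis=0)              # mean-centred, i.e. the "preprocessed" X_0
y = rng.standard_normal(20)
W, T, P, E = pls_nipals(X, y, k=3)

# Proposition 1.1: the weight and score vectors are orthonormal
assert np.allclose(W.T @ W, np.eye(3))
assert np.allclose(T.T @ T, np.eye(3))

# Proposition 1.1: the w_i span K_3(X^T X, X^T y); since W has orthonormal
# columns, projecting the Krylov basis onto range(W) must leave it unchanged
g = X.T @ y
A = X.T @ X
K = np.column_stack([g, A @ g, A @ (A @ g)])
assert np.allclose(W @ (W.T @ K), K)

# The residuals of Equation (6) are orthogonal to the scores, i.e. they do
# not contain the variation used for predicting the dependent variable
assert np.allclose(T.T @ E, 0)
```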