TECHNOMETRICS, VOL. 21, NO. 4, NOVEMBER 1979

Lower Rank Approximation of Matrices by Least Squares With Any Choice of Weights

K. Ruben Gabriel
Department of Statistics
University of Rochester
Rochester, NY 14627

S. Zamir
Department of Statistics
Hebrew University
Jerusalem, Israel

Reduced rank approximation of matrices has hitherto been possible only by unweighted least squares. This paper presents iterative techniques for obtaining such approximations when weights are introduced. The techniques involve criss-cross regressions with careful initialization. Possible applications of the approximation are in modelling, biplotting, contingency table analysis, fitting of missing values, checking outliers, etc.

KEY WORDS

Reduced rank approximation
Least squares
Criss-cross regression
Householder-Young theorem
Biplot
Contingency table
Outliers

1. INTRODUCTION

Approximation of matrices by other matrices of lower rank plays a useful role in fitting models to data (Mandel [15], [16]; Bradu and Gabriel [1]), in graphical representation of data by means of biplots (Gabriel [4], [5]), in principal component analysis (Whittle [24]), and in other multivariate techniques. (In fact, the underlying approach of S. N. Roy and his students [19], [20] has been that of studying the rank one approximation of the data matrix.) The method of approximation used in all these applications is least squares, with the solution due to Householder and Young [13] (and earlier to Fisher and Mackenzie [3]), for which a variety of special computational routines are available (Golub and Reinsch [11]). An alternative method of approximation is an iterative procedure in which row and column weights are inversely dependent on row and column sums of squared residuals, and weighted least squares are used in each iterative step (McNeil and Tukey [17]). This is presumably more resistant to outliers. The need for approximation by weighted least squares also arises frequently.
For example, a table of means based on samples of widely varying sizes should be fitted with weights proportional to the sample sizes. In the extreme case of zero-size samples, an "entry" should play no role in the fitting. This would also take care of missing values, by assigning them zero weights.

This paper considers iterative methods of fitting lower rank least squares approximations for a general choice of weights. For an (n x m) matrix Y of elements y_{ij}, it considers least squares fitting subject to weights w_{ij}. Fitting by a matrix of rank p or less is equivalent to fitting by a matrix product AB', where A and B are n x p and m x p, respectively (Gabriel [6]). The minimizing criterion can therefore be written as

D(A, B) = \sum_{i=1}^{n} \sum_{j=1}^{m} w_{ij} (y_{ij} - a_i' b_j)^2,    (1.1)

where a_i and b_j denote rows of A and B, respectively.

2. THE CASE OF EQUAL WEIGHTS

Householder and Young [13] dealt with equal weights, w_{ij} = 1, and minimized D = \| Y - AB' \|^2, the squared Euclidean norm of the matrix of residuals. A convenient method of solution (see, e.g., Good [9], page 827) deals with the columns of A, and the corresponding columns of B, one at a time. The solutions for a_r and b_r (the r-th columns of A and B, respectively) are obtained after solutions for a_1, ..., a_{r-1} and b_1, ..., b_{r-1} are available and have been subtracted out of Y to give the residuals

Y^{(r-1)} = Y - \sum_{s=1}^{r-1} a_s b_s'.

The equations determining a_r and b_r are

( \sum_{i} a_{ir}^2 ) b_{jr} = \sum_{i} a_{ir} y_{ij}^{(r-1)},    (2.1)

( \sum_{j} b_{jr}^2 ) a_{ir} = \sum_{j} b_{jr} y_{ij}^{(r-1)}.    (2.2)

Received April 1977; revised October 1978
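The one-column-at-a-time scheme for the equal-weights case can be sketched in a few lines of NumPy (this is an illustrative sketch, not code from the paper; the matrix Y, the function names, and the iteration counts are all assumptions). Alternating the two normal equations, (2.1) for b_r given a_r and (2.2) for a_r given b_r, on the current residual matrix amounts to a power iteration, so each rank-one fit converges to the dominant singular pair, and the successive fits reproduce the Householder-Young (truncated SVD) solution:

```python
import numpy as np

def rank_one_fit(R, n_iter=200):
    """Alternate the normal equations on a residual matrix R:
    (2.1)  b_j = sum_i a_i r_ij / sum_i a_i^2   (regress columns of R on a),
    (2.2)  a_i = sum_j b_j r_ij / sum_j b_j^2   (regress rows of R on b).
    The composition is a power iteration on R R', so for a generic start
    it converges to the dominant singular pair of R."""
    a = R[:, np.argmax((R ** 2).sum(axis=0))].copy()  # start from the largest column
    for _ in range(n_iter):
        b = R.T @ a / (a @ a)   # (2.1)
        a = R @ b / (b @ b)     # (2.2)
    return a, b

def low_rank_fit(Y, p):
    """Fit rank-one terms one at a time, each time subtracting the fit
    to form the residuals Y^{(r-1)} = Y - sum_{s<r} a_s b_s'."""
    R = Y.astype(float).copy()
    A = np.zeros((Y.shape[0], p))
    B = np.zeros((Y.shape[1], p))
    for r in range(p):
        a, b = rank_one_fit(R)
        A[:, r], B[:, r] = a, b
        R -= np.outer(a, b)
    return A, B

# Illustrative 3 x 4 matrix with well-separated singular values.
Y = np.array([[5.0, 1.0, 0.0, 0.0],
              [1.0, 3.0, 0.5, 0.0],
              [0.0, 0.5, 1.0, 0.2]])
A, B = low_rank_fit(Y, 2)

# Householder-Young: AB' should equal the rank-2 truncated SVD of Y.
U, s, Vt = np.linalg.svd(Y)
Y2 = (U[:, :2] * s[:2]) @ Vt[:2]
assert np.allclose(A @ B.T, Y2, atol=1e-8)
```

Each alternating update solves its one-sided least squares problem exactly, so the criterion \| Y^{(r-1)} - a_r b_r' \|^2 never increases from step to step; it is this monotonicity that the weighted criss-cross iterations of the following sections carry over to an arbitrary choice of weights w_{ij}.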