Identification of significant factors by an extension of ANOVA–PCA based on
multi-block analysis
D. Jouan-Rimbaud Bouveresse a,b,⁎, R. Climaco Pinto b,c, L.M. Schmidtke d, N. Locquet a,b, D.N. Rutledge a,b
a INRA, UMR 1145 Ingénierie Procédés Aliments, F-75005, Paris, France
b AgroParisTech, UMR 1145 Ingénierie Procédés Aliments, F-75005 Paris, France
c Computational Life Science Cluster (CliC), KBC, Umeå University, S-90187, Umeå, Sweden
d National Wine and Grape Industry Center, School of Agriculture and Wine Sciences, Charles Sturt University, Wagga Wagga, NSW 2650, Australia
Article info
Article history:
Received 14 December 2009
Received in revised form 10 May 2010
Accepted 13 May 2010
Available online 25 May 2010
Keywords:
Multi-block analysis
Common Component and Specific Weights Analysis
ComDim
ANOVA–PCA
F-test
Abstract
A modification of the ANOVA–PCA method, proposed by Harrington et al. to identify significant factors and
interactions in an experimental design, is presented in this article. The modified method uses the idea of
multiple table analysis, and looks for the common dimensions underlying the different data tables, or data
blocks, generated by the “ANOVA-step” of the ANOVA–PCA method, in order to identify the significant
factors. In this paper, the “Common Component and Specific Weights Analysis” method is used to analyse
the calculated multi-block data set. This new method, called AComDim, was compared to the standard
ANOVA–PCA method, by analysing four real data sets. Parameters computed during the AComDim procedure
enable the computation of F-values to check whether the variability of each original data block is
significantly greater than that of the noise.
© 2010 Elsevier B.V. All rights reserved.
1. Introduction
Several multi-block analysis procedures exist for the simultaneous
study of multiple sets of matrices with different variables describing
the same samples (for example, see [1–4]). These methods may be
useful in chemometrics to combine information about the same set of
samples contained in signals acquired using different techniques (IR
spectroscopy; Raman spectroscopy; physico–chemical analyses; etc.).
One such multi-block technique is “Common Component and Specific
Weights Analysis” (CCSWA) [5].
The objective of multi-block analysis methods is to describe p data
blocks observed for the same n samples, i.e. a set of p data matrices
X_i (i = 1 to p), each with n rows but not necessarily the same number
of variables. The method consists of determining a common space for
all p data blocks, with each matrix having a specific contribution
(“salience”) to the definition of each dimension of this common space.
This is done by finding the directions describing common distributions
of the samples in the spaces defined by the different data blocks
(hence the name Common Component, abbreviated CC, or Common
Dimension, abbreviated CD). Salience indicates the importance of each
block in the construction of the common dimension, and a “percentage
of variability extracted” by each dimension can be computed. The
particular implementation of CCSWA used in this work, “ComDim”, was
developed and coded in Matlab [6] by D. Bertrand [7].
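The search for common dimensions and saliences described above can be sketched in a few lines of Python with NumPy. This is only a minimal illustration, not the Matlab implementation cited in the text. The function name comdim, the scaling of each centred block to unit Frobenius norm, the fixed iteration limit and the deflation of the blocks after each dimension are assumptions made for the sketch. Each block X_i is represented by its sample cross-product matrix W_i = X_i X_i^T, a common direction q is taken as the dominant eigenvector of the salience-weighted sum of the W_i, and the saliences are then updated as q^T W_i q.

```python
import numpy as np

def comdim(blocks, n_components=2, n_iter=100, tol=1e-10):
    """Illustrative CCSWA/ComDim sketch (hypothetical helper, not the cited code).

    blocks: list of p arrays, each with the same number n of rows.
    Returns Q (n x k common-dimension scores) and saliences (p x k).
    """
    # Assumption: centre each block column-wise and scale it to unit Frobenius norm
    X = []
    for B in blocks:
        B = np.asarray(B, dtype=float)
        B = B - B.mean(axis=0)
        X.append(B / np.linalg.norm(B))
    n = X[0].shape[0]
    Q = np.zeros((n, n_components))
    saliences = np.zeros((len(X), n_components))
    for k in range(n_components):
        # Sample cross-product matrix of each (deflated) block
        W = [B @ B.T for B in X]
        lam = np.ones(len(X))  # start with equal saliences
        for _ in range(n_iter):
            # Salience-weighted sum of the block cross-products
            WG = sum(l * Wi for l, Wi in zip(lam, W))
            # eigh returns eigenvalues in ascending order: take the last vector
            q = np.linalg.eigh(WG)[1][:, -1]
            new_lam = np.array([q @ Wi @ q for Wi in W])
            converged = np.linalg.norm(new_lam - lam) < tol
            lam = new_lam
            if converged:
                break
        Q[:, k] = q
        saliences[:, k] = lam
        # Deflate every block so the next dimension is orthogonal to q
        P = np.eye(n) - np.outer(q, q)
        X = [P @ B for B in X]
    return Q, saliences
```

With this scaling, each salience lies between 0 and 1, and a block that contributes little to a given common dimension receives a salience close to 0 for that dimension.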
The work presented in this article shows that ComDim can usefully be
extended to the analysis of sets of blocks calculated from a single
initial data matrix. AComDim, presented here, is one such application:
the many separate PCAs performed in the ANOVA–PCA method [8], also
abbreviated APCA, are replaced by a single analysis using ComDim.
In this case, the various “Factor matrices” and
“Interaction matrices” calculated from the initial data matrix are all
analysed simultaneously, resulting in a series of “Common Components”
along which the samples are distributed, each associated with a vector of
“saliences” reflecting the importance of the contribution of each data
block to the corresponding “Common Component”.
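As an illustration of how “Factor matrices” and “Interaction matrices” may be calculated from a single data matrix, the following Python sketch decomposes a matrix from a two-factor design into matrices of level means. It is a simplified example under stated assumptions: a balanced design with two factors, effects removed sequentially (grand mean, then factor A, then factor B, then the A×B interaction), and the hypothetical helper names level_means and anova_blocks. It is not taken from the implementation used in this work.

```python
import numpy as np

def level_means(X, labels):
    """Replace each row of X by the mean of all rows sharing its label."""
    labels = np.asarray(labels)
    M = np.zeros_like(X, dtype=float)
    for lev in np.unique(labels):
        mask = labels == lev
        M[mask] = X[mask].mean(axis=0)
    return M

def anova_blocks(X, factor_a, factor_b):
    """Split X into factor, interaction and residual matrices (illustrative)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)             # remove the grand mean
    XA = level_means(Xc, factor_a)      # main effect of factor A
    R = Xc - XA
    XB = level_means(R, factor_b)       # main effect of factor B
    R = R - XB
    # One label per combination of levels, used for the interaction means
    cells = [f"{a}|{b}" for a, b in zip(factor_a, factor_b)]
    XAB = level_means(R, cells)         # A x B interaction
    E = R - XAB                         # residuals
    return {"A": XA, "B": XB, "AB": XAB, "residuals": E}
```

By construction, the grand mean plus the four returned matrices reproduces the original data matrix. In ANOVA–PCA, the residual matrix is added back to each effect matrix before the corresponding PCA, so that the variability of each effect can be compared with that of the noise.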
After a brief presentation of both the ComDim and the APCA
methods, this article presents several real case studies that
illustrate the value of the new method, particularly in comparison
with the standard APCA method.
2. Theory
2.1. Notation
Matrices will be denoted by bold uppercase letters (e.g., X),
column vectors will be denoted by bold lowercase letters (e.g., u), and
row vectors by bold lowercase letters followed by the uppercase
Chemometrics and Intelligent Laboratory Systems 106 (2011) 173–182
⁎ Corresponding author. INRA, UMR 1145 Ingénierie Procédés Aliments, F-75005,
Paris, France. Tel.: +33 1 44 08 16 39.
E-mail address: delphine.bouveresse@agroparistech.fr
(D. Jouan-Rimbaud Bouveresse).
0169-7439/$ – see front matter © 2010 Elsevier B.V. All rights reserved.
doi:10.1016/j.chemolab.2010.05.005