JOURNAL OF CHEMOMETRICS
J. Chemometrics 2005; 19: 23–31
Published online in Wiley InterScience (www.interscience.wiley.com). DOI: 10.1002/cem.903

Multivariate statistical process control using mixture modelling

U. Thissen 1, H. Swierenga 2, A. P. de Weijer 2, R. Wehrens 1, W. J. Melssen 1 and L. M. C. Buydens 1*

1 Analytical Chemistry, Institute for Molecules and Materials (IMM), Radboud University Nijmegen, Toernooiveld 1, NL-6525 ED Nijmegen, The Netherlands
2 Teijin Twaron Research Institute, Postbus 9600, NL-6800 TC Arnhem, The Netherlands

Received 17 December 2004; Revised 26 January 2005; Accepted 28 February 2005

When performing process monitoring, the classical approach of multivariate statistical process control (MSPC) explicitly assumes the normal operating conditions (NOC) to be distributed normally. If this assumption is not met, severe out-of-control situations are usually missed, or in-control situations can falsely be seen as out-of-control. Combining mixture modelling with MSPC (MM-MSPC) leads to an approach in which non-normally distributed NOC regions can be described accurately. Using the expectation maximization (EM) algorithm, a mixture of Gaussian functions can be defined that, together, describe the data well. Using the Bayesian information criterion (BIC), the optimal set of Gaussians and their specific parametrization can be determined easily. Artificial and industrial data sets have been used to test the performance of the combined MM-MSPC approach. These applications show that MM-MSPC is very promising: (1) it gives a better description of the process data than standard MSPC and (2) the clusters found can be used for a more detailed process analysis and interpretation. Copyright © 2005 John Wiley & Sons, Ltd.

KEYWORDS: model-based clustering; process monitoring; kernel density estimation (KDE)

1. INTRODUCTION

Both univariate statistical process control (SPC) and multivariate SPC (MSPC) charts have been used successfully in many chemical applications [1–5]. These charts are well-known tools for monitoring process behaviour, e.g. the univariate Shewhart chart or the multivariate principal component analysis (PCA)-based combination of Hotelling's T2 and the Q-plot. The aim of these plots is to detect whether a process deviates significantly from a predefined level, the so-called normal operating conditions (NOC). For this purpose, parametric hypothesis tests are used that explicitly assume a normal distribution.

However, if (M)SPC is used in cases where this assumption does not hold, the process quality is evaluated inaccurately. This leads to situations where significant deviations from the NOC can be missed, while proper process behaviour can be seen as out-of-control. A non-normally distributed NOC can occur, for instance, for non-linear, dynamic or multistate processes. For these cases, (non-linear) methods other than PCA might be used for monitoring. Alternatively, a separate MSPC procedure could be set up for each working point.

In the past, non-parametric methods have been proposed for describing the NOC more accurately in cases where the normality assumption is not valid. Several of these approaches have been based on (the principles of) the Parzen window method [6,7], which is a kernel density estimation (KDE) method. With KDE, each training point is designated as a unit centre, and an identical basis function (the kernel, usually a Gaussian function) is constructed at each centre. The density at any data point is then given by the summed contributions of all kernels, and a certain limit density describes the total shape of the data. For example, Doymaz et al. [8] use the KDE approach for defining the NOC, while Martin and Morris [9] modified this approach in order to select a non-parametric kernel based on bootstrapping.
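The Parzen-window idea described above can be sketched as follows. This is a minimal illustration, not the implementation of References [6–9]: the kernel width h, the sample data and the 5th-percentile density limit are arbitrary choices made for this sketch.

```python
import numpy as np

def parzen_density(x, train, h):
    """Gaussian-kernel density estimate at point x.

    Each training point carries an identical spherical Gaussian
    kernel of width h (the Parzen-window approach); the estimate
    is the mean of all kernel contributions at x.
    """
    d = train.shape[1]
    sq_dist = np.sum((train - x) ** 2, axis=1)
    norm = (2 * np.pi * h ** 2) ** (-d / 2)
    return np.mean(norm * np.exp(-sq_dist / (2 * h ** 2)))

rng = np.random.default_rng(0)
noc = rng.normal(0.0, 1.0, size=(500, 2))   # synthetic NOC training data

# A limit density (here: the 5th percentile of the training densities)
# describes the shape of the data; lower-density points are out-of-control.
dens = np.array([parzen_density(p, noc, h=0.5) for p in noc])
limit = np.percentile(dens, 5)

faulty = np.array([6.0, 6.0])               # far outside the NOC cloud
print(parzen_density(faulty, noc, h=0.5) < limit)  # True: flagged
```

Note that the single width h applies everywhere, which is precisely the limitation discussed next: local density differences cannot be captured by identical spherical kernels.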
A possible drawback of the KDE approach is that differences in local densities cannot be modelled well, owing to the identical spherical kernels used. Furthermore, the width of the kernel has to be determined using a method such as cross-validation, which can be computationally intensive. Other studies, however, describe approaches that use basis functions with different covariance matrices to describe the local character of the data. For instance, Johnston and Kramer [10] use variable elliptical basis functions for process monitoring (fault detection) and fault identification. Their method requires two parameters to be optimized (e.g. with cross-validation): the number of basis functions and the overlap parameter. In Reference [11] also, fault

*Correspondence to: L. M. C. Buydens, Analytical Chemistry, Institute for Molecules and Materials (IMM), Radboud University Nijmegen, Toernooiveld 1, NL-6525 ED Nijmegen, The Netherlands. E-mail: l.buydens@science.ru.nl
Contract/grant sponsor: Dutch Technology Foundation STW; contract/grant number: NCH4501.
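The mixture-modelling idea behind MM-MSPC, a set of Gaussians with individual covariance matrices fitted by EM, with the number of components chosen by the BIC, can be sketched with scikit-learn's GaussianMixture. This is an illustrative sketch on synthetic two-regime data, not the authors' implementation or data.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Synthetic NOC data from two operating regimes -> clearly non-normal overall
X = np.vstack([rng.normal(-4.0, 0.5, size=(200, 2)),
               rng.normal(4.0, 0.5, size=(200, 2))])

# Fit EM mixtures with 1..4 full-covariance Gaussian components
models = [GaussianMixture(k, covariance_type="full", random_state=0).fit(X)
          for k in range(1, 5)]

# The BIC balances fit against model complexity; lower is better
bics = [m.bic(X) for m in models]
best = models[int(np.argmin(bics))]
print(best.n_components)  # 2: the BIC recovers the two regimes
```

Each fitted component then describes one local region of the NOC with its own mean and covariance, which is what allows the clusters to be inspected individually for process analysis and interpretation.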