Pergamon Computers chem. Engng, Vol. 21, Suppl., pp. S1167-S1172, 1997. © 1997 Elsevier Science Ltd. All rights reserved. Printed in Great Britain. PII: S0098-1354(97)00207-X. 0098-1354/97 $17.00+0.00

Multiscale Rectification of Random Errors without Fundamental Process Models

Bhavik R. Bakshi, Prakhar Bansal and Mohamed N. Nounou
Department of Chemical Engineering, The Ohio State University, Columbus, OH 43210, USA

Abstract - Data rectification is the task of removing errors from measured process data, and is of paramount importance for the efficient execution of other process operation tasks. Existing methods for rectification represent the measured variables at a single scale in the time or frequency domain. This representation is inefficient for rectifying data containing multiscale features, such as contributions from events of different duration in time and frequency, and non-white stochastic errors. In this paper, a new class of methods is developed for the rectification of random errors, based on representing the measured variables at multiple scales by decomposition on time-frequency localized basis functions derived from orthonormal wavelets. A new technique is developed for the on-line rectification of stationary random errors in the absence of fundamental or empirical process models. This rectification method eliminates basis function coefficients smaller than a threshold, and provides better rectification than the widely used method of exponential smoothing. The threshold for rectification is derived from a multiscale model of the errors, which may be estimated from the multiscale decomposition of the measured data. If multiple redundant measured variables are available, then the data may be rectified by extracting an empirical model relating the variables, by methods such as principal component analysis.
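The coefficient-thresholding idea described above can be sketched as follows. This is not the authors' implementation: it uses a plain Haar wavelet transform and the well-known Donoho-Johnstone universal threshold as a stand-in for the paper's multiscale error model; all function names are ours.

```python
import numpy as np

def haar_decompose(x):
    """One level of the orthonormal Haar transform: approximation and detail coefficients."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return a, d

def haar_reconstruct(a, d):
    """Inverse of haar_decompose."""
    x = np.empty(2 * len(a))
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def rectify(x, levels=3):
    """Multiscale rectification sketch: decompose the signal, zero detail
    coefficients below a threshold, and reconstruct. Assumes len(x) is a
    multiple of 2**levels."""
    approx = np.asarray(x, dtype=float)
    details = []
    for _ in range(levels):
        approx, d = haar_decompose(approx)
        details.append(d)
    # Universal threshold; noise level estimated from the finest-scale details
    # (a standard substitute for the paper's multiscale error model).
    sigma = np.median(np.abs(details[0])) / 0.6745
    t = sigma * np.sqrt(2.0 * np.log(len(x)))
    details = [np.where(np.abs(d) > t, d, 0.0) for d in details]
    for d in reversed(details):
        approx = haar_reconstruct(approx, d)
    return approx
```

Because the Haar basis functions are localized in both time and scale, large coefficients capture genuine signal features while small coefficients at every scale are dominated by noise, which is why thresholding removes errors with less distortion than a single-scale filter.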
A new multiscale PCA method is developed that provides better rectification than PCA, by simultaneously extracting the relationship among the variables and among the measurements. The performance of multiscale univariate filtering and multiscale PCA is illustrated by several examples, and areas for future research are identified.

INTRODUCTION

Data rectification is the task of removing errors from measured data, and is essential for the proper execution of other process operation tasks such as control, monitoring and fault diagnosis. The errors contained in measured data belong to two categories: random or Gaussian errors, and non-random or gross errors, both of which need to be removed by rectification. Data rectification is inherently an ill-posed problem, since given just the measured data, it is impossible to rectify them without some knowledge or assumptions about the nature of the errors or the variables. Depending on the type of additional information used, data rectification methods may be classified into the following major categories.

• Rectification based on fundamental process models attempts to remove errors by exploiting redundancy in the measured variables and constraining the variables to satisfy a fundamental process model. This approach has received much attention using steady-state and dynamic process models, as reviewed by Kramer and Mah (1994). The quality of rectification depends on the accuracy of the available process models.

• Rectification based on empirical process models is used when accurate process models are not available, but the measured variables are redundant. The data are rectified based on an empirical process model derived from the measured data. Empirical modeling techniques for data rectification extract a model between the measured variables and their rectified states by techniques such as linear and nonlinear principal component analysis (Kramer, 1992), and recurrent neural networks (Karjala and Himmelblau, 1994).
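The empirical-model approach can be illustrated with ordinary (single-scale) PCA rectification, which projects the measurements onto the leading principal components and back, discarding variation orthogonal to the empirical model. This is a minimal sketch of the baseline method, not the multiscale PCA developed in the paper; the function name is ours.

```python
import numpy as np

def pca_rectify(X, n_components):
    """Rectify an (n_samples, n_variables) data matrix by retaining only the
    leading principal components of the mean-centered data."""
    mu = X.mean(axis=0)
    Xc = X - mu
    # SVD of the centered data; rows of Vt are the principal directions.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T          # loading matrix (n_variables, n_components)
    # Project onto the model subspace and back, then restore the mean.
    return Xc @ P @ P.T + mu
```

The paper's multiscale PCA would instead apply such a projection to the wavelet coefficients of the variables at each scale, thereby exploiting redundancy both among variables and among measurements at different scales.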
• Rectification based on a model of the errors is used when it is not possible to derive empirical process models, due to a lack of adequate data or of redundancy between the measured variables. Measured data may then be rectified by univariate filtering based on assumptions or knowledge about the nature of the errors. These rectification methods are among the simplest and most widely used techniques in the chemical process industry, and include exponential smoothing and median filtering (Tham and Parr, 1994).

Current rectification methods in each category represent the measured variables at a single scale, either in the time or in the frequency domain. This single-scale representation forces the rectification methods to trade off the quantity of errors removed against the accuracy of the features retained in the rectified signal. Consequently, as more errors are removed from the measured data, the distortion of the features retained in the rectified signal increases. This distortion is larger for variables containing contributions from events occurring at different locations and/or durations in time and frequency. Furthermore, single-scale rectification methods are best suited for the removal of scale-invariant errors such as white or uncorrelated stochastic processes. Unfortunately, errors in process data are often autocorrelated or nonstationary, making single-scale methods unsatisfactory. This disadvantage of single-scale methods may be overcome by developing a time-series model for whitening the errors (Kao et al., 1991; Karjala and Himmelblau, 1996), but this increases the complexity of the rectification method.
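For concreteness, the exponential smoothing filter used as the comparison baseline in this paper can be sketched as the standard first-order recursion (a textbook formulation, with our own function name):

```python
import numpy as np

def exp_smooth(x, alpha=0.2):
    """Classical exponential smoothing: y[k] = alpha*x[k] + (1 - alpha)*y[k-1].

    A single-scale, low-pass univariate filter: smaller alpha removes more
    noise but also distorts (lags) genuine features more, illustrating the
    trade-off discussed above."""
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    y[0] = x[0]
    for k in range(1, len(y)):
        y[k] = alpha * x[k] + (1.0 - alpha) * y[k - 1]
    return y
```

Because the filter acts at a single, fixed time scale set by alpha, it cannot simultaneously preserve both sharp and slowly varying features, which is precisely the limitation the multiscale methods of this paper address.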