Polynomial Extension of Linear Subspace Algorithms for Stochastic Identification

Corrado Di Loreto, Alfredo Germani, Costanzo Manes

Abstract: Among the algorithms for identifying linear models from input/output data, N4SID (Numerical Subspace State Space System IDentification) plays an important role due to its simplicity and effectiveness. N4SID is known to give good results for system identification in a Gaussian setting. This paper presents a technique that improves the performance of N4SID when the data set is non-Gaussian. The approach followed here is in the framework of polynomial estimation theory, developed in recent years, which is a simple and effective tool for processing non-Gaussian data.

Index Terms: Subspace Identification, Polynomial Filtering, non-Gaussian noise.

I. INTRODUCTION

The problem of model identification from input-output data is of central importance in control systems applications. In the literature two main approaches are followed. One uses the input-output structure for dynamical modeling of the data, and originates a set of techniques known as prediction-error methods [1, 2]. The other, more recent, uses state-space models to describe the input-output data, and generates what are called Subspace Identification Algorithms [3, 4, 5]. It is well known that all these methods are very effective when the data are generated by linear and Gaussian models.

This paper proposes an extension of the N4SID algorithm [4] that improves its performance when the statistics of the data are far from Gaussian. It is known that the first step of the identification procedure in the N4SID algorithm is equivalent to a bank of Kalman filters operating over the so-called "past output Hankel matrix", without any knowledge of the system matrices. In a second step the system matrices and the covariances of the state and input noises are computed.
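To make the notion of an output Hankel matrix concrete, the following is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the construction used in [4]: the function name `block_hankel`, its parameters, and the choice of an equal number of block rows for the "past" and "future" matrices are all hypothetical.

```python
import numpy as np


def block_hankel(y, num_block_rows, num_cols, start=0):
    """Stack output samples into a block Hankel matrix (illustrative helper).

    y has shape (N, p): N samples of a p-dimensional output; a 1-D array
    is treated as a scalar output. Block row i, column j holds
    y[start + i + j], so the result has shape (num_block_rows * p, num_cols).
    """
    y = np.asarray(y, dtype=float)
    if y.ndim == 1:
        y = y[:, None]
    n_samples, _ = y.shape
    # The largest index used is start + num_block_rows + num_cols - 2.
    if start + num_block_rows + num_cols - 1 > n_samples:
        raise ValueError("not enough samples for the requested Hankel matrix")
    rows = [y[start + i : start + i + num_cols].T for i in range(num_block_rows)]
    return np.vstack(rows)


# Toy scalar output sequence y(0), ..., y(9); the values are placeholders.
y = np.arange(10.0)

# Assumed split: the first 3 block rows are the "past", the next 3 the "future".
past = block_hankel(y, num_block_rows=3, num_cols=5, start=0)
future = block_hankel(y, num_block_rows=3, num_cols=5, start=3)
```

Each column of `past` collects a window of consecutive outputs, and the matching column of `future` collects the window that immediately follows it. This pairing of past and future windows is the kind of data arrangement on which subspace methods operate.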
Of course, it is well known that the Kalman filter provides the optimal state estimate only when the output process is strictly Gaussian; otherwise the Kalman filter provides the best estimate among all the affine functions of the data. This fact suggests that a significant improvement of the identification procedure can be obtained by using more accurate filters that operate in a non-Gaussian framework, such as polynomial filters. In [6] it is shown that the polynomial estimate of a chosen degree of the state of a linear system is obtained as the linear estimate of the state of a suitably extended system, constructed by exploiting Kronecker powers of the state and of the output processes up to the chosen degree. As the order of the polynomial increases, the covariance of the estimation error decreases. The identification approach proposed here consists in applying the known subspace identification algorithms to the polynomial extended system.

The paper is organized as follows. Section II illustrates the construction of a polynomial generation model of a measured non-Gaussian sequence. Section III describes the polynomial extension of the N4SID algorithm, and Section IV discusses the choice of the order of the generation model. Section V reports some simulation results, and conclusions follow. An Appendix reports some properties of the Kronecker algebra used throughout the paper to develop the polynomial generation model.

This work is supported by MIUR (Italian Ministry for Education and Research), project 2003090328-003, and by CNR (Italian National Research Council). A. Germani and C. Manes are with the Dipartimento di Ingegneria Elettrica, Università degli Studi dell'Aquila, Poggio di Roio, 67040 L'Aquila, Italy, germani@ing.univaq.it, manes@ing.univaq.it. C. Di Loreto is with Telespazio S.p.A., Centro Spaziale "Piero Fanti" del Fucino, 67051 Avezzano, L'Aquila, Italy. E-mail: corrado diloreto@telespazio.it.

43rd IEEE Conference on Decision and Control, December 14-17, 2004, Atlantis, Paradise Island, Bahamas. 0-7803-8682-5/04/$20.00 ©2004 IEEE. WeB11.5 2213

II. NONGAUSSIAN DATA GENERATION MODEL

A set of measurements $y(k) \in \mathbb{R}^p$, $k = 0, \ldots, N$, of a stationary process is available, where $y(k)$ is the noisy measurement of a signal $s(k)$, i.e.

$$y(k) = s(k) + g(k), \tag{1}$$

where $g(k)$ is a white sequence with unknown statistics. Assume that the signal admits an unknown linear time-invariant (LTI) generation model, driven by a white sequence of unknown statistics, so that the process $y(k)$ is seen as the output of a stochastic system of the type

$$\begin{aligned} x(k+1) &= A x(k) + f(k), \\ y(k) &= C x(k) + g(k). \end{aligned} \tag{2}$$

Here $x(k) \in \mathbb{R}^n$, with unknown $n$, is the system state and $C x(k)$ is the signal $s(k)$. The system matrices $A$ and $C$ are unknown, and the noise sequences $f(k)$ and $g(k)$ are assumed white with unknown stationary distributions. It is assumed that $A$ is stable and $x(k)$ is stationary. The N4SID algorithm applied to the sequence $y(k)$ provides an estimate of the generation model (2), i.e., an estimate