328 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 40, NO. 2, MARCH 1994

Maximum Likelihood and Lower Bounds in System Identification with Non-Gaussian Inputs

Ofir Shalvi and Ehud Weinstein, Fellow, IEEE

Abstract: We consider the problem of estimating the parameters of an unknown discrete linear system driven by a sequence of independent identically distributed (i.i.d.) random variables whose probability density function (pdf) may be non-Gaussian. We assume a general system structure that may contain causal and noncausal poles and zeros. The parameters characterizing the input pdf may also be unknown. We derive an asymptotic expression for the Cramer-Rao lower bound, and show that it is the highest (worst) in the Gaussian case, indicating that the estimation accuracy can only be improved when the input pdf is non-Gaussian. It is further shown that the asymptotic error variance in estimating the system parameters is unaffected by lack of knowledge of the pdf parameters, and vice versa. Computationally efficient gradient-based algorithms for finding the maximum likelihood estimate of the unknown system and pdf parameters, which incorporate backward filtering for the identification of noncausal parameters, are presented. The dual problem of blind deconvolution/equalization is considered, and asymptotically attainable lower bounds on the equalization performances are derived. These bounds imply that it is preferable to work with compact equalizer structures characterized by a small number of parameters, as the attainable performance depends only on the total number of equalizer parameters.

Index Terms: Nonminimum-phase ARMA models, non-Gaussianity, maximum likelihood estimation, Cramer-Rao lower bound, system identification, blind deconvolution/equalization.

Manuscript received October 9, 1991; revised April 2, 1993. This work was supported by the Wolfson Research Awards administrated by the Israel Academy of Science and Humanities at Tel-Aviv University and by the Charles Clore Fellowship administrated by the Clore Foundation.
O. Shalvi is with the Department of Electrical Engineering-Systems, Faculty of Engineering, Tel-Aviv University, Ramat-Aviv, Tel-Aviv 69978, Israel.
E. Weinstein is with the Department of Electrical Engineering-Systems, Faculty of Engineering, Tel-Aviv University, Ramat-Aviv, Tel-Aviv 69978, Israel. He is also with the Woods Hole Oceanographic Institution, Woods Hole, MA 02543 USA.
IEEE Log Number 9400230.

I. INTRODUCTION

The problem of identifying the parameters of a discrete linear shift-invariant system and recovering its input by observation of its output is of considerable interest in time series and spectral analysis, filtering and prediction, communications, control, speech and image processing and compression, econometrics, and more. Most of the approaches are based on the assumption that the linear system is causal and minimum phase, i.e., that all its poles and zeros are inside the unit circle of the z plane, and that its input is a sequence of independent identically distributed (i.i.d.) random variables having a Gaussian probability distribution (e.g., see [2], [3], [6], [7], [13], [20], [22], [41]). However, in many problems of interest, there is no physical justification for these assumptions.

Noncausal poles and zeros appear in situations where going “backward in time” is equivalent to going “forward in time,” e.g., when the “time axis” is a location in an image [33], [38]. Noncausal poles may also approximate a rising impulse response of a system, and this is why in deconvolution and equalization problems the inverse of the system is closely approximated by a non-minimum phase filter (see, e.g., [5], [11], [18], [19], [ W ).
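The minimum-phase assumption mentioned above is closely tied to modeling based on second-order statistics. As a minimal illustrative sketch, not taken from the paper, the following NumPy snippet compares two hypothetical first-order FIR systems: one with its zero inside the unit circle and one with the zero reflected outside. The coefficient value 0.5 and all variable and function names are assumptions chosen for illustration. For a unit-variance white input, both systems produce the same output autocorrelation and the same magnitude response, so a correlation-based method cannot tell them apart.

```python
import numpy as np


def output_autocorrelation(h):
    """Output autocorrelation of FIR filter h driven by unit-variance white noise.

    For real h of length N this returns r[k] = sum_n h[n] * h[n + k]
    for lags k = -(N - 1), ..., (N - 1).
    """
    h = np.asarray(h, dtype=float)
    return np.correlate(h, h, mode="full")


a = 0.5  # hypothetical zero location, chosen only for illustration

# H(z) = h[0] + h[1] z^{-1}; its zero in z is the root of h[0] z + h[1].
h_min = np.array([1.0, -a])  # zero at z = a   (inside the unit circle)
h_max = np.array([-a, 1.0])  # zero at z = 1/a (outside the unit circle)

zero_min = np.roots(h_min)
zero_max = np.roots(h_max)

# Second-order statistics of the two outputs.
r_min = output_autocorrelation(h_min)
r_max = output_autocorrelation(h_max)

# Frequency responses on a 64-point grid.
H_min = np.fft.fft(h_min, 64)
H_max = np.fft.fft(h_max, 64)
same_magnitude = np.allclose(np.abs(H_min), np.abs(H_max))
same_response = np.allclose(H_min, H_max)

print("zeros:", zero_min, zero_max)
print("autocorrelations equal:", np.allclose(r_min, r_max))
print("magnitudes equal:", same_magnitude, "| full responses equal:", same_response)
```

The two filters differ only in phase, which is exactly the part of the transfer function that second-order statistics do not reveal.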
If the input to the system is assumed to be Gaussian, its output is also a Gaussian process, completely characterized by its second-order statistics, i.e., correlation or power spectrum. Consequently, it provides information only on the magnitude of the system’s transfer function. In order to identify the phase of the system, one must relax the Gaussian assumption. A maximum likelihood (ML) approach and the Cramer-Rao lower bound (CRLB) for the case where the system’s input is non-Gaussian are developed in [2], [22], and the special case of non-Gaussian all-pole (AR) models is developed in [23] and analyzed in [31], [32]. Of particular importance is the observation in [31] that the CRLB is the highest (worst) in the Gaussian case. It implies that the asymptotic variance of the ML parameter estimates can only be reduced with non-Gaussian probability distributions. However, these results and analyses are restricted to stable causal minimum phase systems and therefore are limited in applicability, particularly to blind deconvolution problems involving phase identification.

There are a variety of methods for system identification and deconvolution based on higher order statistics, which inherently assume that the observed process is non-Gaussian (see, e.g., [14], [19], [21], [26], [35]-[37]). These methods are capable of identifying both the magnitude and the phase of the system’s transfer function, but they have no claim to optimality in the sense of minimizing the error variance of the system parameter estimates, or minimizing the mean square restoration error. An exception is the methods in [4], which attempt to achieve asymptotically efficient deconvolution.

In this paper we consider the general case in which the linear system may be noncausal and nonminimum phase, and its input may be non-Gaussian. In Section II we present the model. In Section III we derive an asymptotic expression for the log-likelihood function and the log-likelihood gradient.
In Section IV we derive an asymptotic expression for the CRLB. In Section V we present gradient-based algorithms for ML

0018-9448/94$04.00 © 1994 IEEE