Latent Autoregressive Gaussian Processes Models for Robust System Identification

César Lincoln C. Mattos ∗, Andreas Damianou ∗∗, Guilherme A. Barreto ∗, Neil D. Lawrence ∗∗

∗ Federal University of Ceará, Dept. of Teleinformatics Engineering, Center of Technology, Campus of Pici, Fortaleza, Ceará, Brazil (e-mail: cesarlincoln@terra.com.br; gbarreto@ufc.br).
∗∗ Dept. of Computer Science & SITraN, The University of Sheffield, Sheffield, UK (e-mail: andreas.damianou@sheffield.ac.uk; N.Lawrence@dcs.sheffield.ac.uk)

Abstract: We introduce GP-RLARX, a novel Gaussian Process (GP) model for robust system identification. Our approach draws inspiration from nonlinear autoregressive modeling with exogenous inputs (NARX) and it encapsulates a novel and powerful structure referred to as latent autoregression. This structure accounts for the feedback of uncertain values during training and provides a natural framework for free simulation prediction. By using a Student-t likelihood, GP-RLARX can be used in scenarios where the estimation data contain non-Gaussian noise in the form of outliers. Further, a variational approximation scheme is developed to jointly optimize all the hyperparameters of the model from available estimation data. We perform experiments with five widely used artificial benchmarking datasets with different levels of outlier contamination and compare GP-RLARX with the standard GP-NARX model and its robust variant, GP-tVB. GP-RLARX is found to outperform the competing models by a relatively wide margin, indicating that our latent autoregressive structure is more suitable for robust system identification.

Keywords: Modelling and system identification, dynamic modelling, Gaussian process, outliers, autoregressive models.

1. INTRODUCTION

System identification is classically defined as the task of creating mathematical models of dynamical systems based on their inputs and observed outputs (Ljung, 1998).
This general definition can be further complicated if we consider the analysis of nonlinear systems and noisy data, possibly containing outliers. In this paper we are interested in the latter problem, which is very often encountered in practice. In order to account for the uncertainty in the noisy data and in the dynamics learned by the model, we follow a Bayesian approach to system identification (Peterka, 1981). In this context, Gaussian Process (GP) models provide a principled, practical, probabilistic approach to learning in kernel machines (Rasmussen and Williams, 2006) and are the main subject of our work.

Since the early research on modeling dynamics with GPs, e.g. by Murray-Smith et al. (1999) and Solak et al. (2003), several contributions to GP-based system identification have been published, such as autoregressive models (Kocijan et al., 2005), non-stationary systems (Rottmann and Burgard, 2010), local modeling (Ažman and Kocijan, 2011) and state space models (Frigola et al., 2014).

Most work on GP-based system identification has been limited to the case of Gaussian noise, which implies a Gaussian likelihood. However, when one expects to have non-Gaussian observations in the form of outliers, such as impulsive noise, the estimation of the model's hyperparameters can be severely compromised. Furthermore, because of the nonparametric nature of the GP model, the estimation data are carried over to the prediction phase, i.e. the estimation samples containing outliers and the mis-estimated hyperparameters will be used during the prediction stage, which can degrade the model's capability to generalize to unseen test data.

In (Mattos et al., 2015) we reviewed some recent work on GP regression in the presence of outliers. Such models replace the Gaussian likelihood by heavy-tailed distributions, such as Student-t and Laplace.
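To make the effect of a heavy-tailed likelihood concrete, the following minimal sketch compares the negative log-density that a Gaussian and a Student-t noise model assign to a residual. It is only an illustration and not the models evaluated in the paper: the function names, the unit noise scale and the choice of ν = 4 degrees of freedom are assumptions made here.

```python
import math


def gaussian_nll(residual, sigma=1.0):
    """Negative log-density of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * math.log(2.0 * math.pi * sigma**2) + residual**2 / (2.0 * sigma**2)


def student_t_nll(residual, sigma=1.0, nu=4.0):
    """Negative log-density of a zero-mean Student-t with scale sigma and nu degrees of freedom.

    nu = 4 is an arbitrary illustrative choice, not a value taken from the paper.
    """
    log_norm = (math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0)
                - 0.5 * math.log(nu * math.pi * sigma**2))
    return -log_norm + 0.5 * (nu + 1.0) * math.log1p(residual**2 / (nu * sigma**2))


if __name__ == "__main__":
    # Small residuals are penalized similarly; large ones (outliers) are not.
    for r in (0.0, 1.0, 3.0, 10.0):
        print(f"residual={r:5.1f}  gaussian={gaussian_nll(r):8.3f}  "
              f"student_t={student_t_nll(r):8.3f}")
```

The Gaussian penalty grows quadratically with the residual, whereas the Student-t penalty grows only logarithmically, so a single impulsive sample contributes far less to the objective used for hyperparameter estimation.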
While inference in GP models with a Gaussian likelihood is tractable, inference with non-Gaussian likelihoods is not, and requires approximation methods such as variational Bayes (VB) (Jordan et al., 1999) and expectation propagation (EP) (Minka, 2001). We then evaluated two robust models in the task of robust system identification: a GP model with Student-t likelihood and variational inference (GP-tVB) and a GP model with Laplace likelihood and EP inference (GP-LEP). The experimental results indicated that although the robust models, especially GP-tVB, performed better than the standard GP, they were still sensitive to the outliers in some scenarios.

Preprint, 11th IFAC Symposium on Dynamics and Control of Process Systems, including Biosystems. June 6-8, 2016. NTNU, Trondheim, Norway. Copyright © 2016 IFAC

As in (Mattos et al., 2015), here we are interested in nonlinear autoregressive models with exogenous inputs (NARX) and in performance evaluation by free simulation on test data. However, the autoregressive structure and the