SUPPORT VECTOR REGRESSION FOR BLACK-BOX SYSTEM IDENTIFICATION

Arthur Gretton, Arnaud Doucet, Ralf Herbrich, Peter J. W. Rayner and Bernhard Schölkopf

Signal Processing Group, University of Cambridge
Department of Engineering, Trumpington Street
CB2 1PZ, Cambridge, UK
{alg30,pjwr}@eng.cam.ac.uk

Department of Electrical and Electronic Engineering
The University of Melbourne
Victoria 3010, Australia
a.doucet@ee.mu.oz.au

Microsoft Research, Cambridge
St. George House, 1 Guildhall Street
Cambridge, CB2 3NH, UK
{rherb,bsc}@microsoft.com

ABSTRACT

In this paper, we demonstrate the use of support vector regression (SVR) techniques for black-box system identification. These methods derive from statistical learning theory, and are of great theoretical and practical interest. We briefly describe the theory underpinning SVR, and compare support vector methods with other approaches using radial basis networks. Finally, we apply SVR to modelling the behaviour of a hydraulic robot arm, and show that SVR improves on previously published results.

1. INTRODUCTION

System identification of nonlinear black-box models is a crucial but complex problem. There have been numerous recent papers in the area based on neural networks, wavelet networks, hinging hyperplanes, etc. Roughly speaking, one selects a set of regressors/basis functions, and tries to determine the number of basis functions/regressors and their parameters according to a given statistical criterion. Many methods are based on a penalised maximum likelihood criterion. Performing model selection and estimation is usually a difficult task, however, as it involves solving complex integration and/or optimisation problems. Gradient methods are often used, but are only guaranteed to converge toward local optima. Recently, in a Bayesian framework, Markov chain Monte Carlo algorithms have also been developed. These methods are computationally intensive, however.
We propose here an alternative approach based on support vector machines. These comprise a set of powerful tools to perform classification and regression [8], and have recently become very popular in the machine learning community. This approach, motivated by Statistical Learning Theory [10], is systematic and principled. One can list its main advantages:

- There are very few free parameters to adjust.
- Estimating the unknown parameters only involves optimisation of a convex cost function. This can be achieved using standard quadratic programming algorithms; it is fast and there are no local minima.
- The model constructed depends explicitly on the most "informative" data (the support vectors).
- It is possible to obtain theoretical bounds on the generalisation error and the sparseness of the solution (see [8]). These bounds are independent of the distribution generating the training and test data.

To the best of our knowledge, support vector regression (SVR) has never been used in the context of system identification, although it has been used in estimating time series by Müller et al. [4], and by Mattera and Haykin [3]. This work differs from these previous studies in that it investigates the ν-SVR method [5], which does not require us to specify an a priori level of accuracy. We demonstrate the application of this algorithm to modelling a standard data set, and show that it is possible to obtain results that improve on current state-of-the-art methods [6], [7], with very little tuning.

2. BLACK-BOX SYSTEM IDENTIFICATION

The problem of nonlinear black-box system identification consists of conducting non-parametric regression, as described in Sjöberg et al. [6], [7], among others. This means that random variables (x, y), which take values in R^d x R, are generated according to a distribution P(x, y), and we are required to estimate the regression function of y on x,

f(x) = E[y | x].

We call x the regressor, and y the output. We further define the training inputs X = (x_1, ..., x_l) and outputs Y = (y_1, ..., y_l). We want to estimate f from the training sample
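As a concrete illustration of this setup (not the authors' code, and not the paper's hydraulic robot arm data), the sketch below builds NARX-style regressors from simulated input/output data, where each x_t collects past outputs and inputs, and fits a ν-SVR model with scikit-learn's NuSVR. The toy system, lag orders na and nb, and hyperparameters (nu, C, gamma) are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.svm import NuSVR

rng = np.random.default_rng(0)

# Toy nonlinear dynamical system (an illustrative stand-in for the
# paper's hydraulic robot arm data set, which is not reproduced here).
N = 500
u = rng.uniform(-1.0, 1.0, N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = (0.5 * y[t - 1] - 0.2 * y[t - 2]
            + np.tanh(u[t - 1]) + 0.05 * rng.standard_normal())

# NARX-style regressor: predict the current output from na past
# outputs and nb past inputs.
na, nb = 2, 2
p = max(na, nb)
X = np.array([np.concatenate([y[t - na:t][::-1], u[t - nb:t][::-1]])
              for t in range(p, N)])
Y = y[p:]

# nu-SVR: nu in (0, 1] upper-bounds the fraction of margin errors and
# lower-bounds the fraction of support vectors; the tube width epsilon
# is adapted during training, so no a priori accuracy level is needed.
model = NuSVR(nu=0.5, C=10.0, kernel="rbf", gamma=0.5)
n_train = 400
model.fit(X[:n_train], Y[:n_train])
pred = model.predict(X[n_train:])
rmse = float(np.sqrt(np.mean((pred - Y[n_train:]) ** 2)))
print(f"test RMSE: {rmse:.3f}, support vectors: {len(model.support_)}")
```

Note that the fitted model depends only on the support vectors, so the solution is typically sparse in the training data, in line with the advantages listed above.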