IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 40, NO. 2, APRIL 2010

A Division Algebraic Framework for Multidimensional Support Vector Regression

Alistair Shilton, Member, IEEE, Daniel T. H. Lai, Member, IEEE, and Marimuthu Palaniswami, Senior Member, IEEE

Abstract—In this paper, division algebras are proposed as an elegant basis upon which to extend support vector regression (SVR) to multidimensional targets. Using this framework, a multitarget SVR called ǫH-SVR is proposed based on an ǫ-insensitive loss function that is independent of the coordinate system or basis used. This is developed to dual form in a manner analogous to the standard ǫ-SVR. The ǫH-SVR is compared and contrasted with the least-squares SVR (LS-SVR), the Clifford SVR (C-SVR), and the multidimensional SVR (M-SVR). Three practical applications are considered, namely: 1) approximation of a complex-valued function; 2) chaotic time-series prediction in 3-D; and 3) communication channel equalization. Results show that the ǫH-SVR performs significantly better than the C-SVR, the LS-SVR, and the M-SVR in terms of mean-squared error, outlier sensitivity, and support vector sparsity.

Index Terms—Clifford algebra, complex numbers, division algebra, multidimensional regression, multiple input–multiple output (MIMO), quaternions, support vector regression (SVR).

I. INTRODUCTION

SUPPORT vector regressors (SVRs) [1] are a generalization of Vapnik's support vector machine (SVM) method for binary classification [2] to the estimation of real-valued functions based on the principle of structural risk minimization [1]. Model nonlinearity is achieved by implicitly mapping the input data space to feature space using the well-known kernel trick. In this feature space, a linear function is constructed by minimizing a regularized risk function, wherein the model's sensitivity to noise is controlled by the regularization term [3].
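The regularized risk minimization described above can be sketched numerically. The snippet below is a minimal illustration, not the paper's formulation: the function names are mine, and the risk is written in the kernel-expansion form f(x_j) = Σ_i α_i K(x_j, x_i) + b with Vapnik's ǫ-insensitive loss.

```python
import numpy as np

def eps_insensitive_loss(y_true, y_pred, eps=0.1):
    """Vapnik's eps-insensitive loss: zero inside the eps-tube,
    linear in the residual magnitude outside it."""
    return np.maximum(np.abs(y_true - y_pred) - eps, 0.0)

def regularized_risk(alpha, b, K, y, eps=0.1, C=1.0):
    """Regularized risk of a kernel model f(x_j) = sum_i alpha_i K[j, i] + b:
    a smoothness (regularization) term plus C times the summed
    eps-insensitive training errors."""
    pred = K @ alpha + b
    return 0.5 * alpha @ K @ alpha + C * np.sum(eps_insensitive_loss(y, pred, eps))
```

The regularization term 0.5 αᵀKα controls the model's sensitivity to noise, while C trades it off against the training errors.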
Manuscript received October 3, 2008; revised February 19, 2009 and May 29, 2009. First published September 4, 2009; current version published March 17, 2010. This work was supported in part by the Department of Education, Science and Training (DEST) International Science Linkages (ISL) and in part by the Australian Research Council. This paper was recommended by Associate Editor Z. R. Yang.

A. Shilton and M. Palaniswami are with the Department of Electrical and Electronic Engineering, The University of Melbourne, Melbourne, Vic. 3010, Australia (e-mail: apsh@ee.unimelb.edu.au; swami@ee.unimelb.edu.au).

D. T. H. Lai is with the Centre for Ageing, Rehabilitation, Exercise and Sport, Victoria University, Melbourne, Vic. 8001, Australia (e-mail: daniel.lai@vu.edu.au).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TSMCB.2009.2028314

The standard SVR formulation has recently been extended to the estimation of functions with multidimensional outputs [4]–[7] (multiregression [8], [9]), thereby increasing its range of applicability. Multidimensional function estimation is important in areas such as telecommunications [4], biotechnology [10], pervasive computing [11], meteorology [12], and evolutionary computation [9], where the modeled systems have multiple output types or streams. These coupled outputs could be expressed in terms of several independent real numbers at the expense of output coupling. The alternative is to represent the targets using complex numbers (e.g., communications channels), quaternions (e.g., position in 3-D space) [13]–[15], or some similarly holistic object. This is hypothesized to be a better approach than the former for the following reasons: 1) In the former, each SVR is specifically constructed for an individual output, resulting in a combined risk function dependent on the choice of coordinate system.
This means that the choice of component axes of the algebraic system (which may, for example, correspond to the x-, y-, and z-axes in a geometrical problem) influences the estimation of errors and hence affects the final trained machine. 2) Multiple single SVRs do not take advantage of the rich structure of other algebras, disregarding potential couplings between the system outputs. 3) The training of multiple individual regressors may incur a significant computational cost for larger problems.

The design of multidimensional SVRs based on an algebraic framework better suited to the representation of multidimensional targets has previously been attempted. Bayro-Corrochano et al. used Clifford algebra to extend the SVR to Clifford algebraic targets with their Clifford SVR (C-SVR) [5], [16]. Clifford (or geometric) algebra allows geometric objects, such as vectors, cubes, etc., to be directly represented and manipulated in an algebraic framework. Unfortunately, it is difficult to construct a dual form based on an axis- (coordinate-) independent loss function within this framework. In another approach, Pérez-Cruz et al.'s SVM multiregression (M-SVR) [4], [17] used a vectorial representation and addressed the axis dependence by deploying a quadratic loss function with Vapnik's ǫ-insensitive zone. Unfortunately, this approach does not result in a well-defined dual, leading to algorithmic implementation issues, and, moreover, is more sensitive to outliers than the C-SVR. Weston et al. [7] sought to generalize the SVM to multidimensional inputs and outputs by introducing kernel density estimation, which models targets as set members and uses arbitrary loss functions for error penalization. The limitation of this generalization is an increased computational complexity: one is forced to solve three nontrivial problems for every machine.
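The coordinate dependence noted in point 1) can be made concrete: penalizing each output component with its own ǫ-tube yields a loss whose value changes when the target axes are rotated, whereas an ǫ-tube applied to the Euclidean modulus of the error (a coordinate-free quantity) does not. The following is a minimal illustration with hypothetical helper names, not the paper's loss functions:

```python
import numpy as np

def componentwise_loss(e, eps=0.1):
    """Per-axis eps-tubes: the value depends on the chosen coordinate frame."""
    return np.sum(np.maximum(np.abs(e) - eps, 0.0))

def norm_loss(e, eps=0.1):
    """One eps-tube on the Euclidean modulus: invariant under frame rotations."""
    return max(np.linalg.norm(e) - eps, 0.0)

e = np.array([0.3, 0.0, 0.0])                 # error vector along the x-axis
theta = np.pi / 4                             # rotate the frame by 45 degrees
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
e_rot = R @ e                                 # same error, different axes
```

Here `norm_loss(e)` equals `norm_loss(e_rot)`, but `componentwise_loss` penalizes the rotated error more heavily, so the trained machine would differ with the axis choice.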
Another approach to multitarget regression, inspired not by the SVR but rather by neural networks, is the extreme learning machine (ELM) [18], [19], which is a single-layer feedforward neural network with random input weights and biases. By selecting the output weights to minimize the least-square training
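The ELM recipe just described — fix random hidden weights and biases, then solve for the output weights by least squares — can be sketched as follows. This is a minimal illustration of the idea, with function names of my choosing, not the reference implementation of [18], [19]:

```python
import numpy as np

def elm_train(X, Y, n_hidden=50, seed=0):
    """Train an ELM: hidden weights and biases are drawn at random and
    stay fixed; only the linear output weights are fit, by least squares."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random biases
    H = np.tanh(X @ W + b)                           # hidden-layer activations
    beta = np.linalg.pinv(H) @ Y                     # least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Evaluate the trained network on new inputs."""
    return np.tanh(X @ W + b) @ beta
```

Because only `beta` is optimized, training reduces to a single pseudoinverse, which is what makes the ELM fast relative to iteratively trained networks.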