IEEE TRANSACTIONS ON SYSTEMS, MAN, AND CYBERNETICS—PART B: CYBERNETICS, VOL. 40, NO. 2, APRIL 2010 517
A Division Algebraic Framework for
Multidimensional Support Vector Regression
Alistair Shilton, Member, IEEE, Daniel T. H. Lai, Member, IEEE, and Marimuthu Palaniswami, Senior Member, IEEE
Abstract—In this paper, division algebras are proposed as an elegant basis upon which to extend support vector regression (SVR) to multidimensional targets. Using this framework, a multitarget SVR called the ǫX-SVR is proposed based on an ǫ-insensitive loss function that is independent of the coordinate system or basis used. This is developed to dual form in a manner analogous to the standard ǫ-SVR. The ǫH-SVR is compared and contrasted with the least-squares SVR (LS-SVR), the Clifford SVR (C-SVR), and the multidimensional SVR (M-SVR). Three practical applications are considered, namely: 1) approximation of a complex-valued function; 2) chaotic time-series prediction in 3-D; and 3) communication channel equalization. Results show that the ǫH-SVR performs significantly better than the C-SVR, the LS-SVR, and the M-SVR in terms of mean-squared error, outlier sensitivity, and support vector sparsity.
Index Terms—Clifford algebra, complex numbers, division
algebra, multidimensional regression, multiple input–multiple
output (MIMO), quaternions, support vector regression (SVR).
I. INTRODUCTION

Support vector regressors (SVRs) [1] are a generalization of Vapnik's support vector machine (SVM) method
for binary classification [2] to the estimation of real-valued
functions based on the principle of structural risk minimization
[1]. Model nonlinearity is achieved by implicitly mapping the
input data space to feature space using the well-known kernel
trick. In this feature space, a linear function is constructed by
minimizing a regularized risk function, wherein the model’s
sensitivity to noise is controlled by the regularization term [3].
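To make the quantities in this paragraph concrete, the following is a minimal numpy sketch (not from the paper; the helper names `eps_insensitive` and `regularized_risk` are ours) of Vapnik's ǫ-insensitive loss and the regularized risk functional that the standard ǫ-SVR minimizes for a linear model:

```python
import numpy as np

def eps_insensitive(y_true, y_pred, eps=0.1):
    """Vapnik's eps-insensitive loss: errors inside the eps-tube cost nothing."""
    return np.maximum(np.abs(y_true - y_pred) - eps, 0.0)

def regularized_risk(w, b, X, y, C=1.0, eps=0.1):
    """Primal SVR objective: flatness (regularization) term plus C-weighted empirical loss."""
    slack = eps_insensitive(y, X @ w + b, eps)
    return 0.5 * np.dot(w, w) + C * np.sum(slack)

# A prediction 0.05 away from its target lies inside an eps = 0.1 tube,
# so it incurs zero loss; a 0.3 error is penalized only for the 0.2 excess.
print(eps_insensitive(np.array([1.0]), np.array([0.95])))  # [0.]
print(eps_insensitive(np.array([1.0]), np.array([0.70])))  # [0.2]
```

The regularization term 0.5·w·w controls the trade-off, via C, between model flatness and training error, which is how sensitivity to noise is tuned.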
Manuscript received October 3, 2008; revised February 19, 2009 and May 29, 2009. First published September 4, 2009; current version published March 17, 2010. This work was supported in part by the Department of Education, Science and Training (DEST) International Science Linkages (ISL) program and in part by the Australian Research Council. This paper was recommended by Associate Editor Z. R. Yang. A. Shilton and M. Palaniswami are with the Department of Electrical and Electronic Engineering, The University of Melbourne, Melbourne, Vic. 3010, Australia (e-mail: apsh@ee.unimelb.edu.au; swami@ee.unimelb.edu.au). D. T. H. Lai is with the Centre for Ageing, Rehabilitation, Exercise and Sport, Victoria University, Melbourne, Vic. 8001, Australia (e-mail: daniel.lai@vu.edu.au). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TSMCB.2009.2028314

The standard SVR formulation has recently been extended to the estimation of functions with multidimensional outputs [4]–[7] (multiregression [8], [9]), thereby increasing its range of applicability. Multidimensional function estimation is important in areas such as telecommunications [4], biotechnology [10], pervasive computing [11], meteorology [12], and evolutionary computation [9], where the modeled systems have multiple output types or streams. These coupled outputs could be expressed in terms of several independent real numbers at the expense of output coupling. The alternative is to represent the targets using complex numbers (e.g., communications channels), quaternions (e.g., position in 3-D space) [13]–[15], or some similarly holistic object. This is hypothesized to be a better approach than the former for the following reasons:
1) In the former, each SVR is specifically constructed for
an individual output, resulting in a combined risk func-
tion dependent on the choice of coordinate system. This
means that the choice of component axis of the algebraic
system (which may, for example, correspond to the x-,
y-, and z-axes in a geometrical problem) influences the
estimation of errors and hence affects the final trained
machine.
2) Multiple single SVRs do not take advantage of the rich
structure of other algebras, disregarding potential system
output couplings.
3) The training of multiple individual regressors may incur
a significant computational cost for larger problems.
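The coordinate dependence in reason 1 can be demonstrated numerically. The sketch below (our own illustration, not the paper's formulation; `componentwise_loss` and `normwise_loss` are hypothetical names, and the norm-based loss is only in the spirit of a coordinate-free loss) shows that summing per-axis ǫ-insensitive losses changes when the coordinate frame is rotated, whereas a loss applied to the norm of the error vector does not:

```python
import numpy as np

def componentwise_loss(err, eps=0.1):
    """Sum of per-axis eps-insensitive losses, as arises from independent per-output SVRs."""
    return np.sum(np.maximum(np.abs(err) - eps, 0.0))

def normwise_loss(err, eps=0.1):
    """eps-insensitive loss applied to the Euclidean norm of the whole error vector."""
    return max(np.linalg.norm(err) - eps, 0.0)

err = np.array([0.3, 0.0])            # a 2-D regression error
theta = np.pi / 4                     # rotate the coordinate frame by 45 degrees
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

print(componentwise_loss(err), componentwise_loss(R @ err))  # values differ
print(normwise_loss(err), normwise_loss(R @ err))            # values identical
```

The same physical error is thus penalized differently by independent per-axis SVRs depending on how the x- and y-axes are chosen, while the norm-based penalty is rotation invariant.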
The design of multidimensional SVRs based on an alge-
braic framework better suited to the representation of mul-
tidimensional targets has previously been attempted. Bayro-
Corrochano et al. used Clifford algebra to extend the SVR to
Clifford algebraic targets with their Clifford SVR (C-SVR) [5],
[16]. Clifford (or geometric) algebra allows geometric objects, such as vectors, cubes, etc., to be represented and manipulated directly in an algebraic framework. Unfortunately, it is
difficult to construct a dual form based on an axis (coordinate)
independent loss function within this framework. In another
approach, Pérez-Cruz et al.’s SVM multiregression (M-SVR)
[4], [17] used a vectorial representation and addressed the
axis dependence by deploying a quadratic loss function with
Vapnik’s ǫ-insensitive zone. Unfortunately, this approach does
not result in a well-defined dual, leading to algorithmic imple-
mentation issues, and, moreover, is more sensitive to outliers
than the C-SVR. Weston et al. [7] sought to generalize the SVM to multidimensional inputs and outputs by introducing kernel dependency estimation, which models targets as set members and uses arbitrary loss functions for error penalization. The limitation of this generalization is increased computational complexity, as one is forced to solve three nontrivial problems for every machine.
Another approach to multitarget regression inspired not by
the SVR but rather by neural networks is the extreme learning
machine (ELM) [18], [19], which is a single-layer feedforward
neural network with random input weights and biases. By se-
lecting the output weights to minimize the least-square training
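The ELM mechanism just described (random, fixed input weights and biases; output weights chosen by least squares) can be sketched as follows. This is a minimal numpy illustration under our own assumptions (toy data, tanh activations, 50 hidden units), not the implementation of [18], [19]:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: learn y = sin(x) from 200 one-dimensional samples.
X = rng.uniform(-np.pi, np.pi, size=(200, 1))
y = np.sin(X[:, 0])

# ELM hidden layer: input weights and biases are drawn at random and never trained.
n_hidden = 50
W_in = rng.normal(size=(1, n_hidden))
b_in = rng.normal(size=n_hidden)
H = np.tanh(X @ W_in + b_in)          # random hidden-layer feature matrix

# Output weights are the only trained parameters: a linear least-squares solve.
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

y_hat = H @ beta
print(np.mean((y - y_hat) ** 2))      # small training MSE
```

Because only the output layer is trained, and in closed form, training reduces to a single least-squares problem rather than an iterative optimization.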
1083-4419/$26.00 © 2009 IEEE