Feature Subset Selection on Multivariate Time Series with
Extremely Large Spatial Features ∗
Hyunjin Yoon and Cyrus Shahabi
Department of Computer Science
University of Southern California
Los Angeles, CA 90089-0781
{hjy,cshahabi}@usc.edu
Abstract
Spatio-temporal data collected in many applications, such as
fMRI data in medical applications, can be represented as a
Multivariate Time Series (MTS) matrix with m rows (capturing
the spatial features) and n columns (capturing the temporal
observations). Any data mining task on MTS datasets, such as
clustering or classification, is usually hindered by the large
size (i.e., dimensions) of these MTS items. To reduce the
dimensions without losing the useful discriminative features
of the dataset, domain experts usually prefer feature selection
techniques, since these maintain the relation of the selected
subset of features to the originally acquired features. In this
paper, we propose a new feature selection technique for MTS
datasets in which the number of spatial features (i.e., number
of rows) is much larger than the number of temporal
observations (i.e., number of columns), or m ≫ n. Our approach
is based on Principal Component Analysis, Recursive Feature
Elimination and Support Vector Machines. Our empirical results
on real-world datasets show that our technique significantly
outperforms the closest competitor technique.
1 Introduction
Feature subset selection (FSS) is a pre-processing tech-
nique to identify a subset of original input features (or vari-
ables) from a given dataset by removing irrelevant and/or
redundant ones [1]. Feature extraction (FE), in contrast,
derives new features by mapping the original input features,
linearly or non-linearly, into more effective ones. Both FSS
and FE aim to provide fewer and better features, reducing the
computational cost and improving the generalization
performance of subsequent predictors (e.g., classifiers). In
contrast to FE, which still requires all the original input
features to be measured and stored for the mapping, FSS is more
cost-effective: once the relevant features have been
identified, only those need to be acquired, and all the other
features can be discarded for good [2].
∗ This research has been funded in part by NSF grants EEC-9529152
(IMSC ERC), IIS-0238560 (PECASE) and IIS-0307908, and unrestricted
cash gifts from Microsoft and Google. Any opinions, findings, and
conclusions or recommendations expressed in this material are those
of the author(s) and do not necessarily reflect the views of the
National Science Foundation.
A multivariate time series (MTS) is a series of observations
$x_i(t)$ $(i = 1, \dots, m;\ t = 1, \dots, n)$ made sequentially
through time, where i indexes the variables measured at each
time point t. A natural representation of a single MTS is
therefore an m×n matrix, and a set of such data matrices with a
fixed m but a variable n is the type of dataset in which we are
interested. Note that MTS nicely represents spatio-temporal
data, since the observed variables (the m rows of the matrix)
are in general acquired from sensors spread over a particular
region and their values (the n columns of the matrix) are
measured over a definite period of time.
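As a minimal sketch of this representation, each MTS item can be held as a list of m rows with n values per row, with m fixed across the dataset and n varying from item to item. The helper `make_mts`, the sizes, and the random values below are hypothetical and serve only to illustrate the data layout.

```python
import random

def make_mts(m, n, seed):
    """Return a hypothetical m x n MTS: row i holds one variable over n time points."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]

m = 4                  # number of variables (rows), fixed for the whole dataset
lengths = [10, 7, 12]  # number of time points n (columns), may differ per item
dataset = [make_mts(m, n, seed) for seed, n in enumerate(lengths)]

for item in dataset:
    # x_i(t) for i = 1..m and t = 1..n is stored at item[i - 1][t - 1]
    print(len(item), "rows x", len(item[0]), "columns")
```

Because n varies, the items cannot simply be stacked into one rectangular array; only the row dimension m is shared by all of them.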
MTS data is in general extremely high dimensional. For
example, in the EEG dataset [4], where 39 electrodes measured
brain signals at a 256 Hz sampling rate during a 5-second
imaginary task, each MTS is a 39 × 1280 matrix, equivalent to
a 49,920-dimensional vector. In this paper, we propose a
feature subset selection method for MTS datasets to reduce
their dimensions. In addition to the aforementioned advantage
of FSS over FE, selecting relevant original variables allows
insightful interpretation and easier verification in the
context of the original application domain. For example, in the
EEG data, the selected original features (i.e., electrodes or
channels) can be exploited to localize the neural correlates,
which are not known in such detail in the neuroscience
literature [4].
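The dimension count in the EEG example, and the way row selection preserves channel identity, can be checked with a short sketch. The zero-filled placeholder matrix and the `selected` channel indices are assumptions made for illustration; they are neither the data of [4] nor the output of any selection method.

```python
n_channels = 39    # electrodes (rows, m)
sampling_hz = 256  # samples per second
duration_s = 5     # length of one recording in seconds

n_samples = sampling_hz * duration_s                  # time points (columns, n)
mts = [[0.0] * n_samples for _ in range(n_channels)]  # placeholder, not real EEG
flat = [value for row in mts for value in row]        # vectorized form of the item

print(len(mts), len(mts[0]), len(flat))  # 39 1280 49920

# Feature subset selection keeps whole rows, so every retained row is still
# one of the original channels. The indices below are hypothetical.
selected = [2, 17, 30]
reduced = [mts[i] for i in selected]
print(len(reduced), len(reduced[0]))     # 3 1280
```

Each row of `reduced` still corresponds to a physical electrode, which is what allows the selected features to be interpreted in the original domain.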
Recursive feature elimination (RFE) with embedded support
vector machine (SVM) classifiers (SVM-RFE) [2] has become a
popular feature subset selection technique for many datasets,
including MTS [4][2][6][11]. SVM-RFE starts
Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06)
0-7695-2702-7/06 $20.00 © 2006