Feature Subset Selection on Multivariate Time Series with Extremely Large Spatial Features

Hyunjin Yoon and Cyrus Shahabi
Department of Computer Science
University of Southern California
Los Angeles, CA 90089-0781
{hjy,cshahabi}@usc.edu

* This research has been funded in part by NSF grants EEC-9529152 (IMSC ERC), IIS-0238560 (PECASE) and IIS-0307908, and unrestricted cash gifts from Microsoft and Google. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Abstract

Spatio-temporal data collected in many applications, such as fMRI data in medical applications, can be represented as a Multivariate Time Series (MTS) matrix with m rows (capturing the spatial features) and n columns (capturing the temporal observations). Any data mining task such as clustering or classification on MTS datasets is usually hindered by the large size (i.e., dimensions) of these MTS items. In order to reduce the dimensions without losing the useful discriminative features of the dataset, feature selection techniques are usually preferred by domain experts, since the relation of the selected subset of features to the originally acquired features is maintained. In this paper, we propose a new feature selection technique for MTS datasets whose spatial features (i.e., number of rows) are much larger in number than their temporal observations (i.e., number of columns), or m ≫ n. Our approach is based on Principal Component Analysis, Recursive Feature Elimination and Support Vector Machines. Our empirical results on real-world datasets show that our technique significantly outperforms the closest competitor technique.

1 Introduction

Feature subset selection (FSS) is a pre-processing technique that identifies a subset of the original input features (or variables) of a given dataset by removing irrelevant and/or redundant ones [1]. Feature extraction (FE), in contrast, derives new features by linearly or non-linearly mapping the original input features into more effective ones. Both FSS and FE aim at providing fewer, better features in order to reduce the computational cost and improve the generalization performance of subsequent predictors (e.g., classifiers). In contrast to FE, which still requires all the original input features to be measured and stored for the mapping, FSS is more cost-effective in that only the selected features need to be acquired after the identification, discarding all the other features for good [2].

A multivariate time series (MTS) is a series of observations, x_i(t) [i = 1, ..., m; t = 1, ..., n], made sequentially through time, where i indexes the variables measured at each time point t. A natural representation of a single MTS is therefore an m × n matrix, and a set of such data matrices with a fixed m but a variable n is the type of dataset in which we are interested. Note that an MTS nicely represents spatio-temporal data, since the observed variables (the m rows of the matrix) are in general acquired from sensors spread over a particular region and their values (the n columns of the matrix) are measured over a definite period of time.

An MTS is in general extremely high-dimensional data. For example, in the EEG dataset [4], where 39 electrodes measured brain signals at a 256 Hz sampling rate during a 5-second imaginary task, each MTS becomes a matrix of 39 × 1280 dimensions, equivalent to a 49,920-dimensional vector.

In this paper, we propose a feature subset selection method for MTS datasets to reduce their dimensions. In addition to the aforementioned advantage of FSS over FE, selecting relevant original variables helps to make insightful interpretation and easier verification in the context of the original application domain.
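The m × n representation and the EEG dimensionality quoted above can be checked with a minimal sketch. The helper names (`mts_shape`, `make_mts`, `vectorize`) are hypothetical and introduced only for illustration, the matrix is filled with placeholder zeros rather than real signals, and the row-by-row flattening order is an assumption, since the text does not say how the matrix is vectorized.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def mts_shape(n_variables: int, sampling_rate_hz: int, duration_s: int) -> Tuple[int, int]:
    """Return (m, n): m variables (rows) by n time points (columns)."""
    return n_variables, sampling_rate_hz * duration_s


def make_mts(m: int, n: int) -> Matrix:
    """Build a placeholder m x n MTS; entry [i][t] stands in for x_i(t)."""
    return [[0.0 for _ in range(n)] for _ in range(m)]


def vectorize(mts: Matrix) -> List[float]:
    """Concatenate the rows into one m*n-dimensional vector (assumed order)."""
    return [value for row in mts for value in row]


# EEG example from the text: 39 electrodes, 256 Hz, 5-second task.
m, n = mts_shape(n_variables=39, sampling_rate_hz=256, duration_s=5)
eeg = make_mts(m, n)
flat = vectorize(eeg)
print(m, n, len(flat))  # 39 1280 49920
```

Removing one electrode in this layout drops an entire row of 1280 values, which is why selecting original variables shrinks the vector in blocks of n rather than one entry at a time.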
In the EEG data, for instance, the selected original features (i.e., electrodes or channels) can be exploited to localize the neural correlates, which are not known in such detail in the neuroscience literature [4].

Sixth IEEE International Conference on Data Mining - Workshops (ICDMW'06) 0-7695-2702-7/06 $20.00 © 2006

Recursive feature elimination (RFE) embedding support vector machine (SVM) classifiers (SVM-RFE) [2] has become a popular feature subset selection technique for many datasets including MTS [4][2][6][11]. SVM-RFE starts