Canonical Correlation Approach to Common Spatial Patterns

Eunho Noh¹ and Virginia R. de Sa²

Abstract: Common spatial patterns (CSPs) are a way of spatially filtering EEG signals to increase the discriminability of the filtered variance/power between the two classes. The proposed canonical correlation approach to CSP (CCACSP) utilizes temporal information in the time series, in addition to exploiting the covariance structure of the different classes, to find filters which maximize the bandpower difference between the classes. We show with simulated data that the unsupervised canonical correlation analysis (CCA) algorithm is better able to extract the original class-discriminative sources than the CSP algorithm in the presence of large amounts of additive Gaussian noise (while the CSP algorithm is better at very low noise levels) and that our CCACSP algorithm is a hybrid, yielding good performance at all noise levels. Finally, experiments on data from the BCI competitions confirm the effectiveness of the CCACSP algorithm and a merged CSP/CCACSP algorithm (mCCACSP).

I. INTRODUCTION

Brain-computer interfaces (BCIs) are devices that allow interaction between humans and computers using the brain signals of the user. One commonly used method to extract meaningful information in EEG (electroencephalography)-based BCI is to detect event-related desynchronization resulting from motor imagery (MI) of different limbs of the body. One of the most commonly used and effective spatial filtering methods for feature extraction in MI tasks is common spatial patterns (CSPs) [2]. CSP gives filters which maximize the variance/power for one class while minimizing it for the other, which increases the discriminability of the two classes when using bandpower features for classification. However, CSP does not take into account the dependence between time samples. CSP is also sensitive to noise and artifacts, which are common in EEG signals [10].

Many variants of CSP have been developed to improve its performance. Various regularization methods for CSP (RCSP) have been proposed, as reviewed in [9]. Common spatio-spectral patterns (CSSP) [7] uses temporal structure information to improve CSP. Spectrally weighted common spatial patterns (Spec-CSP) [13] learns the spectral weights as well as the spatial weights in an iterative way. Invariant CSP (iCSP) [3] minimizes variations in the EEG signal caused by various artifacts using a pre-calculated covariance matrix characterizing these modulations. Stationary CSP (sCSP) [11] regularizes the CSP filters into stationary subspaces. Local temporal common spatial patterns (LTCSP) [14], [15] uses temporally local variances to compute the spatial filters.

¹ Eunho Noh is with the Department of Electrical & Computer Engineering, UCSD, La Jolla, CA eunoh@ucsd.edu
² Virginia R. de Sa is with Faculty of Cognitive Science, UCSD, La Jolla, CA desa@ucsd.edu

In this paper, we propose a method called canonical correlation approach to common spatial patterns (CCACSP), which incorporates the temporal structure of the data to extract discriminative and uncorrelated sources. The method was applied to a simulated dataset as well as two BCI competition datasets to compare our approach to the standard CSP algorithm.

II. METHODS

A. CSP algorithm

CSP finds spatial filters that maximize the variance/power of spatially filtered signals under one condition while minimizing it for the other condition. Let a column vector $x_t \in \mathbb{R}^{C}$ be the bandpassed EEG signal at time $t$, where $C$ is the number of EEG channels on the scalp, and let $X = (x_1, \ldots, x_L) \in \mathbb{R}^{C \times L}$ be a length-$L$ sequence of these EEG signals. The estimate of the normalized covariance matrix $\Sigma_y \in \mathbb{R}^{C \times C}$ for class $y$ can be calculated as follows:

$$\Sigma_y = \frac{1}{|\Upsilon_y|} \sum_{i \in \Upsilon_y} \frac{X_i X_i^\top}{\operatorname{tr}(X_i X_i^\top)} \tag{1}$$

where $y \in \{1, 2\}$ and $\Upsilon_y$ denotes the set of trial indices belonging to class $y$. The optimal set of CSP filters can be found by optimizing the following Rayleigh quotient:

$$R(w) = \frac{w^\top \Sigma_1 w}{w^\top (\Sigma_1 + \Sigma_2) w}. \tag{2}$$

The solution can be found by solving the generalized eigenvalue problem $\Sigma_1 w = \lambda (\Sigma_1 + \Sigma_2) w$ [2]. The generalized eigenvector $w_1^{*}$ corresponding to the largest eigenvalue maximizes the variance for class 1 while minimizing it for class 2. The generalized eigenvector $w_C^{*}$ corresponding to the smallest eigenvalue maximizes the variance for class 2 while minimizing it for class 1.

B. CCA algorithm for separating uncorrelated sources

The goal of the CCA algorithm is to find a pair of projection vectors $v, w \in \mathbb{R}^{C}$ that maximize the correlation between two signal spaces (in general, the two signal spaces may have different dimensionality) [1]. Let $\Sigma_{12}$ and $\Sigma_{21}$ be the cross-covariance matrices of data matrices $X_1$ and $X_2$, with $\Sigma_1$ and $\Sigma_2$ here denoting the covariance matrices of $X_1$ and $X_2$, respectively. The pair of optimal vectors $v$ and $w$ can then be found by solving the following eigenvalue problems:

$$\Sigma_1^{-1} \Sigma_{12} \Sigma_2^{-1} \Sigma_{21} v = \lambda^2 v \tag{3}$$

$$\Sigma_2^{-1} \Sigma_{21} \Sigma_1^{-1} \Sigma_{12} w = \rho^2 w. \tag{4}$$
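
As an illustration only, and not the authors' implementation, the following NumPy sketch works through Eqs. (1)-(4): it forms the trace-normalized class covariances, solves the CSP generalized eigenvalue problem, and computes the leading CCA pair. All function names (`normalized_covariance`, `generalized_eigh`, `csp_filters`, `cca_first_pair`) are hypothetical, trials are assumed to be arrays of shape (C, L), and the covariance matrices are assumed to be positive definite.

```python
import numpy as np


def normalized_covariance(trials):
    """Eq. (1): mean of trace-normalized covariances over the trials of one class.

    trials: iterable of arrays of shape (C, L), one array per trial (assumed layout).
    """
    covs = []
    for X in trials:
        S = X @ X.T
        covs.append(S / np.trace(S))
    return np.mean(covs, axis=0)


def generalized_eigh(A, B):
    """Solve A w = lam * B w for symmetric A and positive-definite B."""
    L = np.linalg.cholesky(B)          # B = L L^T
    L_inv = np.linalg.inv(L)
    M = L_inv @ A @ L_inv.T            # equivalent symmetric standard problem
    lam, U = np.linalg.eigh(M)         # eigenvalues in ascending order
    W = L_inv.T @ U                    # map back: w = L^{-T} u
    order = np.argsort(lam)[::-1]      # largest eigenvalue first
    return lam[order], W[:, order]


def csp_filters(trials_1, trials_2):
    """Eq. (2): filters from Sigma_1 w = lam (Sigma_1 + Sigma_2) w.

    Column 0 favors class-1 variance, the last column favors class-2 variance.
    """
    S1 = normalized_covariance(trials_1)
    S2 = normalized_covariance(trials_2)
    return generalized_eigh(S1, S1 + S2)


def cca_first_pair(X1, X2):
    """Eqs. (3)-(4): leading canonical pair for data matrices (channels x samples)."""
    X1 = X1 - X1.mean(axis=1, keepdims=True)   # center each channel
    X2 = X2 - X2.mean(axis=1, keepdims=True)
    n = X1.shape[1]
    S1, S2 = X1 @ X1.T / n, X2 @ X2.T / n
    S12 = X1 @ X2.T / n
    S21 = S12.T
    A = np.linalg.solve(S1, S12)               # Sigma_1^{-1} Sigma_12
    B = np.linalg.solve(S2, S21)               # Sigma_2^{-1} Sigma_21
    ev, V = np.linalg.eig(A @ B)               # Eq. (3)
    ew, W = np.linalg.eig(B @ A)               # Eq. (4)
    v = V[:, np.argmax(ev.real)].real
    w = W[:, np.argmax(ew.real)].real
    return v, w, float(ev.real.max()), float(ew.real.max())
```

Calling `csp_filters(trials_1, trials_2)` returns the generalized eigenvalues in descending order together with the matching filters as columns, so the first and last columns play the roles of $w_1^{*}$ and $w_C^{*}$. The Cholesky-based reduction is one convenient way to keep the problem symmetric; a dedicated generalized eigensolver would serve equally well.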