HOW FAST IS FASTICA?

Vicente Zarzoso, Pierre Comon
Laboratoire I3S, CNRS/UNSA
Les Algorithmes – Euclide-B, BP 121
06903 Sophia Antipolis Cedex, France
{zarzoso, comon}@i3s.unice.fr

Mariem Kallel
Département TIC, Laboratoire U2S, ENIT
Campus Universitaire “Le Belvédère”
1002 Tunis, Tunisia

ABSTRACT

The present contribution deals with the statistical tool of Independent Component Analysis (ICA). The focus is on the deflation approach, whereby the independent components are extracted one after another. The kurtosis-based FastICA is arguably one of the most widespread methods of this kind. However, its features, particularly its speed, have not been thoroughly evaluated or compared, so that its popularity seems somewhat unfounded. To substantiate this claim, a simple, quite natural modification is put forward and assessed in this paper. It merely consists of performing exact line search optimization of the contrast function. Speed is objectively measured in terms of the computational complexity required to reach a given source extraction performance. Illustrative numerical results demonstrate the faster convergence and higher robustness to initialization of the proposed approach, which is thus referred to as RobustICA.

1. INTRODUCTION

Independent Component Analysis (ICA) transforms an observed random vector into mutually statistically independent components [1]. Its numerous applications have spurred an increasing research interest in this technique; for instance, ICA is the basic statistical tool to perform Blind Source Separation (BSS) [1, 2, 3]. In its original definition (see [1, 4], among other early works), ICA extracts all the sources jointly or simultaneously; this is the so-called “symmetric” approach. ICA can also be performed by estimating the sources sequentially, or one by one. This alternative procedure, referred to as deflation, was originally proposed in [5] and used successfully in the separation of convolutive mixtures [6].
Deflation was later widely promoted in the machine learning community [3]. Joint algorithms are usually thought to outperform deflationary algorithms because of the errors accumulated in the successive subtractions (regressions) of the estimated source contributions from the observation. This shortcoming is generally claimed to be compensated by a significant gain in computations, although this claim still requires closer examination.

FastICA [7, 8], originally put forward in deflation mode, features among the most popular ICA algorithms. Although it appeared when many other ICA methods had already been proposed, the deflationary FastICA has never been compared by the authors of [3] with earlier joint algorithms such as COM2 [1], JADE [4] or COM1 [9], or with the deflation methods of Tugnait [6] and Delfosse–Loubaton [5]. In fact, to our knowledge, FastICA (in both its deflation and symmetric implementations) has only been compared with neural-based adaptive algorithms and with principal component analysis (PCA), which most ICA algorithms are known to outperform. Its popularity has been justified on the grounds of the satisfactory performance offered by the method in several applications, as well as its simplicity. However, these features, and in particular its speed, have never been substantiated by a thorough comparison with other techniques. A first serious attempt was made in [10], where FastICA is found to fail for weak or highly spatially correlated sources. In spite of its comprehensiveness, the comparative analysis of [10] is perhaps unfortunate in contrasting the deflationary FastICA with joint methods such as COM2, JADE and COM1. On the other hand, recent studies have brought to light some deficiencies of FastICA, such as the detrimental effect of saddle points on its performance [11].

Given the assiduous attention the method has received over the last decade, these gaps are somewhat surprising.
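To make the deflationary approach concrete, the following is a minimal Python/NumPy sketch: sources are extracted one by one from whitened observations using a kurtosis-based fixed-point update in the style of FastICA, and deflation is enforced by orthogonalizing each new extraction vector against the previous ones (a standard alternative to explicitly regressing each estimated contribution out of the data). All function names and implementation details are illustrative, not taken from the paper.

```python
import numpy as np

def extract_one(Z, W_prev, n_iter=200):
    """Extract one component from whitened data Z (channels x samples)
    with a kurtosis-based fixed-point update (FastICA-style sketch)."""
    rng = np.random.RandomState(len(W_prev))
    w = rng.randn(Z.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        y = w @ Z
        w_new = (Z * y ** 3).mean(axis=1) - 3.0 * w
        for v in W_prev:               # deflation: stay orthogonal to
            w_new -= (w_new @ v) * v   # previously extracted directions
        w_new /= np.linalg.norm(w_new)
        if abs(w_new @ w) > 1.0 - 1e-10:   # converged (up to a sign)
            return w_new
        w = w_new
    return w

def deflation_ica(X, n_sources):
    """Deflationary ICA sketch: whiten, then extract sources one by one."""
    X = X - X.mean(axis=1, keepdims=True)
    d, V = np.linalg.eigh(np.cov(X))
    Z = (V / np.sqrt(d)).T @ X         # whitened observations
    W = []
    for _ in range(n_sources):
        W.append(extract_one(Z, W))
    return np.array(W) @ Z             # estimated source waveforms
```

Each extracted component is confined to the orthogonal complement of the previously found directions in the whitened domain, which is where the error accumulation discussed above originates: a poor early extraction contaminates all subsequent ones.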
Indeed, it does not seem difficult to envisage a very simple, quite natural deflation algorithm that would outperform FastICA. The goal of this work is to put forward such a method, which we refer to as RobustICA, and to compare it with FastICA. The new method simply consists of carrying out exact line search of the contrast function, the normalized kurtosis [12]. Exact line search is achieved at low cost, since the optimal step size (OS) leading to the global maximum along the search direction can be found algebraically at each iteration among the roots of a low-degree polynomial. The OS methodology, which has already been proposed in the time equalization context [13, 14, 15, 16], can be used in conjunction with a variety of alternative criteria such as the constant modulus [17] and the constant power [14, 18].

As part of our experimental study, we evaluate the computational complexity required to reach a given source extraction performance. The algorithms’ speed and efficiency can thus be compared objectively. It is now generally acknowledged that adaptive (also known as on-line, recursive or sample-by-sample) algorithms are not always computationally cheaper than block (off-line, windowed) algorithms, and that they are rarely better in terms of precision. On this account, block implementations are the focus of this paper.

2. MODEL AND NOTATION

Let an L-dimensional random vector x denote the observation, which is assumed to stem from the linear sta-

14th European Signal Processing Conference (EUSIPCO 2006), Florence, Italy, September 4-8, 2006, copyright by EURASIP
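The optimal step-size idea described in the introduction can be illustrated with a short numerical sketch (Python/NumPy, real-valued signals). Along the search line w + μg, the extractor output is y(μ) = y_w + μ y_g, so the normalized kurtosis is a rational function of μ whose stationary points are the roots of a fourth-degree polynomial; the globally optimal step is then selected by direct evaluation. This is a reconstruction under these assumptions, not the authors' implementation, and all function names are hypothetical.

```python
import numpy as np

def kurtosis_contrast(y):
    """Normalized kurtosis of a zero-mean real signal."""
    m2 = np.mean(y ** 2)
    return np.mean(y ** 4) / m2 ** 2 - 3.0

def optimal_step(w, g, X):
    """Exact line search for the kurtosis contrast: the best step mu
    along direction g is found among the roots of a polynomial."""
    yw = w @ X                     # current extractor output
    yg = g @ X                     # output of the search direction
    # E[(yw + mu*yg)^4]: degree-4 polynomial in mu (binomial expansion)
    p = [np.mean(yg ** 4),
         4 * np.mean(yw * yg ** 3),
         6 * np.mean(yw ** 2 * yg ** 2),
         4 * np.mean(yw ** 3 * yg),
         np.mean(yw ** 4)]
    # E[(yw + mu*yg)^2]: degree-2 polynomial in mu
    q = [np.mean(yg ** 2), 2 * np.mean(yw * yg), np.mean(yw ** 2)]
    # d/dmu of p/q^2 vanishes where p'q - 2pq' = 0; the degree-5
    # leading terms cancel exactly, leaving a degree-4 polynomial
    r = np.polysub(np.polymul(np.polyder(p), q),
                   2 * np.polymul(p, np.polyder(q)))
    mus = np.roots(r)
    mus = mus[np.abs(mus.imag) < 1e-6].real
    if mus.size == 0:              # degenerate line: keep the current point
        return 0.0
    # retain the root maximizing the absolute normalized kurtosis
    return float(max(mus, key=lambda m: abs(kurtosis_contrast(yw + m * yg))))
```

A deflation unit built on this step would update w ← (w + μ*g)/‖w + μ*g‖, with g a suitable search direction such as the contrast gradient; because the best root is selected by evaluating the contrast itself, the step reaches the global maximum along the line regardless of how the iteration was initialized.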