IEEE SIGNAL PROCESSING LETTERS, VOL. 6, NO. 4, APRIL 1999

Blind Source Separation of More Sources Than Mixtures Using Overcomplete Representations

Te-Won Lee, Member, IEEE, Michael S. Lewicki, Mark Girolami, Member, IEEE, and Terrence J. Sejnowski, Senior Member, IEEE

Abstract—Empirical results were obtained for the blind source separation of more sources than mixtures using a recently proposed framework for learning overcomplete representations. This technique assumes a linear mixing model with additive noise and involves two steps: 1) learning an overcomplete representation for the observed data and 2) inferring sources given a sparse prior on the coefficients. We demonstrate that three speech signals can be separated with good fidelity given only two mixtures of the three signals. Similar results were obtained with mixtures of two speech signals and one music signal.

Index Terms—Blind source separation, independent component analysis, overcomplete dictionary, overcomplete representation, speech signal separation.

I. INTRODUCTION

RECENT advances in blind source separation by independent component analysis (ICA) have many potential applications including speech recognition systems, telecommunications, and medical signal processing. The goal of ICA is to recover independent sources given only sensor observations that are unknown linear mixtures of the unobserved independent source signals [3]–[5], [8]. The standard formulation of ICA requires at least as many sensors as sources. Lewicki and Sejnowski [9], [11] have proposed a generalized ICA method for learning overcomplete representations of the data that allows for more basis vectors than dimensions in the input.

The goal of this method is illustrated in Fig. 1. In a two-dimensional (2-D) data space, the observations in Fig. 1(a) and (b) were generated by a linear mixture of two independent random sparse sources. In this space, Fig.
1(a) shows orthogonal basis vectors (principal component analysis, PCA) and Fig. 1(b) shows independent basis vectors. If the 2-D observed data are generated by three sparse sources, as shown in Fig. 1(c) and (d), the complete ICA representation (c) cannot model the data adequately, but the overcomplete ICA representation (d) finds three basis vectors that fit the underlying distribution of the data.

Fig. 1. Illustration of basis vectors in a 2-D data space with two sparse sources (top) or three sparse sources (bottom). (a) PCA finds orthogonal basis vectors. (b) ICA representation finds independent basis vectors. (c) ICA cannot model the data distribution adequately with three sources, but (d) the overcomplete ICA representation finds three basis vectors that match the underlying data distribution (see [11]).

Manuscript received August 31, 1998. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. K. Buckley.
T.-W. Lee is with Howard Hughes Medical Institute, Computational Neurobiology Laboratory, The Salk Institute, La Jolla, CA 92037 USA (e-mail: tewon@salk.edu; lewicki@salk.edu).
M. S. Lewicki was with Howard Hughes Medical Institute, Computational Neurobiology Laboratory, The Salk Institute, La Jolla, CA 92037 USA. He is now with Carnegie Mellon University, Pittsburgh, PA 15213 USA.
M. Girolami is with the Department of Computing and Information Systems, University of Paisley, Paisley PA1 2BE, U.K. (e-mail: giro0ci@paisley.ac.uk).
T. J. Sejnowski is with Howard Hughes Medical Institute, Computational Neurobiology Laboratory, The Salk Institute, La Jolla, CA 92037 USA, and also with the Department of Biology, University of California, San Diego, La Jolla, CA 92093 USA.
Publisher Item Identifier S 1070-9908(99)02461-X.

In this letter, the learning rules for overcomplete ICA, as derived by Lewicki and Sejnowski [11], are briefly summarized in Section II.
In Section III, simulation results are presented for speech signals and music signals. The discussion in Section IV covers related work and future research issues.

II. LEARNING OVERCOMPLETE REPRESENTATIONS

The observed $m$-dimensional data $\mathbf{x}$ may be modeled as a linear mixture of $n$ sources $\mathbf{s}$ through an overcomplete mixing matrix $\mathbf{A}$ ($m \times n$, $n > m$)$^1$ with additive noise $\boldsymbol{\epsilon}$:

$$\mathbf{x} = \mathbf{A}\mathbf{s} + \boldsymbol{\epsilon} \tag{1}$$

$^1$ In most ICA formulations, the matrix $\mathbf{A}$ is restricted to $n \le m$, which is not imposed here.

1070–9908/99$10.00 © 1999 IEEE
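To make the linear mixing model with additive noise concrete, the following minimal sketch generates two noisy mixtures of three sparse sources, the setting studied in this letter. The symbols (`A` for the mixing matrix, `s` for the sources, `x` for the mixtures), the Laplacian source distribution, the particular 2 x 3 matrix, and the noise level are all illustrative assumptions, not values taken from the letter, and the sketch covers only data generation, not the learning or inference steps.

```python
import math
import random


def laplace_sample(rng, scale=1.0):
    """Draw one sample from a zero-mean Laplacian (assumed sparse prior)."""
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    # Inverse-CDF sampling; max() guards against log(0) at the interval edge.
    tail = max(1.0 - 2.0 * abs(u), 1e-300)
    return -scale * math.copysign(1.0, u) * math.log(tail)


def mix(A, s, rng, noise_std=0.01):
    """Return x = A s + eps for one source vector s (eps is Gaussian noise)."""
    return [
        sum(a_ij * s_j for a_ij, s_j in zip(row, s)) + rng.gauss(0.0, noise_std)
        for row in A
    ]


rng = random.Random(0)

# Hypothetical 2 x 3 mixing matrix: its columns are unit-length basis
# vectors at 0, 60, and 120 degrees in the 2-D sensor space.
angles = [0.0, math.pi / 3.0, 2.0 * math.pi / 3.0]
A = [[math.cos(t) for t in angles],
     [math.sin(t) for t in angles]]

num_samples = 1000
sources = [[laplace_sample(rng) for _ in angles] for _ in range(num_samples)]
mixtures = [mix(A, s, rng) for s in sources]

print(len(mixtures), len(mixtures[0]))  # number of samples, sensors per sample
```

Because the sources are sparse, a scatter plot of `mixtures` would be expected to concentrate along the three column directions of `A`, which is the kind of structure sketched in Fig. 1(c) and (d).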