GLIMPSING INDEPENDENT VECTOR ANALYSIS: SEPARATING MORE SOURCES THAN SENSORS USING ACTIVE AND INACTIVE STATES

Alireza Masnadi-Shirazi, Wenyi Zhang and Bhaskar D. Rao

Department of Electrical and Computer Engineering, University of California, San Diego
{amasnadi, w3zhang, brao}@ucsd.edu

ABSTRACT

In this paper, we explore the problem of separating convolutedly mixed signals in the overcomplete (degenerate) case of having more sources than sensors. We exploit a common form of nonstationarity, especially present in speech, wherein the signals have intermittent silence periods, so that the set of active sources varies with time. A novel approach is proposed that takes advantage of the different combinations of silence gaps in the source signals at each time period. This enables the algorithm to "glimpse", or listen in the gaps, compensating for the global degeneracy by learning the mixing matrices during periods where the mixture is locally less degenerate. Experiments on simulated and real room recordings yielded good separation results.

Index Terms— Overcomplete systems, independent component analysis, convolutive mixtures, blind source separation

1. INTRODUCTION

Frequency domain blind source separation (BSS) methods have been extensively studied for separating convolutedly mixed signals. By computing the short time Fourier transform (STFT), convolution in the time domain translates to linear mixing in the frequency domain, enabling the use of independent component analysis (ICA) on each frequency bin. However, because ICA is indeterminate up to permutation, further post-processing methods have to be used to resolve the permutation problem. Independent vector analysis (IVA) is a method that avoids such permutation issues by utilizing the inner dependencies between the frequency bins.
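The step from time-domain convolution to per-bin linear mixing can be checked numerically. The sketch below is an illustration only, not the authors' code: it assumes a single frame with circular convolution, for which the identity is exact, whereas with the STFT it holds only approximately when the analysis window is long relative to the mixing filters. All names and sizes (`mix_in_time`, `mix_per_bin`, 3 sources, 2 sensors) are hypothetical.

```python
import cmath
import random

def dft(x):
    """Discrete Fourier transform of a sequence (direct O(n^2) evaluation)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def circular_convolve(h, s):
    """Circular convolution of a short filter h with a frame s."""
    n = len(s)
    return [sum(h[j] * s[(t - j) % n] for j in range(len(h))) for t in range(n)]

def mix_in_time(filters, sources):
    """filters[l][m] is the filter from source m to sensor l; returns sensor frames."""
    n = len(sources[0])
    sensors = []
    for row in filters:
        x = [0.0] * n
        for h, s in zip(row, sources):
            x = [a + b for a, b in zip(x, circular_convolve(h, s))]
        sensors.append(x)
    return sensors

def mix_per_bin(filters, sources):
    """Same mixture computed bin by bin as X[l][k] = sum_m A[l][m][k] * S[m][k]."""
    n = len(sources[0])
    S = [dft(s) for s in sources]
    # Zero-pad each filter to the frame length before transforming it.
    A = [[dft(list(h) + [0.0] * (n - len(h))) for h in row] for row in filters]
    return [[sum(A[l][m][k] * S[m][k] for m in range(len(sources))) for k in range(n)]
            for l in range(len(filters))]

random.seed(0)
M, L, n, taps = 3, 2, 16, 4   # hypothetical sizes: 3 sources, 2 sensors
sources = [[random.gauss(0, 1) for _ in range(n)] for _ in range(M)]
filters = [[[random.gauss(0, 1) for _ in range(taps)] for _ in range(M)]
           for _ in range(L)]

# Transform the time-domain mixture and compare it with the per-bin mixture.
X_time = [dft(x) for x in mix_in_time(filters, sources)]
X_bins = mix_per_bin(filters, sources)
max_error = max(abs(a - b) for xt, xb in zip(X_time, X_bins) for a, b in zip(xt, xb))
```

In this toy setting each frequency bin k has its own L x M mixing matrix `A[.][.][k]`, which is the per-bin linear mixing that ICA operates on, and `max_error` stays at the level of floating-point rounding.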
IVA models each individual source as a dependent multivariate symmetric distribution while still maintaining the fundamental assumption of BSS that each source is independent of the others. Therefore, one can say that IVA is the multi-dimensional extension of ICA [1].

For standard ICA-based methods, when the number of sources M exceeds the number of sensors L (M > L), estimating the mixing matrix and the sources is not straightforward. Various methods with different underlying assumptions have been proposed to deal with overcompleteness (degeneracy) in ICA linear mixing. One method uses a maximum likelihood approximation framework for learning the overcomplete mixing matrix and a Laplacian prior for inferring the sources [2]. Other methods incorporate geometric/probabilistic clustering approaches while relying heavily on sparsity assumptions (at each time, mainly one source is active) [3, 4, 5]. None of these methods, however, takes into consideration the temporal dynamic structure of the signals.

This research was supported by UC MICRO grants 07-034 and 08-65, sponsored by Qualcomm Inc.

Most signals of interest in BSS, such as speech, music and EEG, are nonstationary. One common type of nonstationarity, especially present in speech, is that the signals can have intermittent silence periods, so that the set of active sources varies with time. Such a feature can be used to deal with degenerate BSS. As the set of active sources in a given time period shrinks, the degree of degeneracy (M − L) decreases locally. Hence, by exploiting silence gaps, one is actually compensating for the global degeneracy by making use of segments where the mixture is locally less degenerate. An approach to modeling active and inactive intervals for the overcomplete case of instantaneous linear mixing has been proposed.
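The local reduction in degeneracy described above can be illustrated with a toy on/off pattern. This is a minimal sketch under stated assumptions, not part of the proposed algorithm: the activity pattern is invented, and defining the local degeneracy as the number of active sources minus L, clipped at zero, is an assumption made here for illustration.

```python
def local_degeneracy(activity, num_sensors):
    """Per-frame degeneracy: active sources minus sensors, clipped at zero.

    activity[m][t] is True when source m is active in frame t.
    """
    num_frames = len(activity[0])
    active_counts = [sum(1 for src in activity if src[t]) for t in range(num_frames)]
    return [max(count - num_sensors, 0) for count in active_counts]

# Hypothetical on/off pattern for M = 3 sources over 6 frames, with L = 2 sensors.
activity = [
    [True,  True,  False, False, True,  True],
    [True,  False, True,  False, True,  False],
    [True,  True,  True,  False, False, False],
]
degeneracy = local_degeneracy(activity, num_sensors=2)
```

Here the global degeneracy is M − L = 1, but only the first frame, where all three sources are active, is locally degenerate; the remaining frames have at most two active sources.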
This method models the sources as a mixture of two Gaussians with zero means and unknown variances, similar to that of independent factor analysis (IFA) [6], and incorporates a Markov model on a hidden variable that controls the state of activity or inactivity for each source. It implements a complicated and inefficient estimation algorithm based on variational Bayes, with three layers of hidden variables (one for the Markov state of activity and two as in normal IFA) [7]. Extending this to IVA for convoluted mixtures proves to be even more complicated.

In our previous work, we proposed a simple and efficient algorithm that models the states of activity and inactivity in the presence of noise for the non-degenerate case of convoluted mixing using a simple mixture model [8]. Unlike the method in [7], where the on/off states were embedded in the sources themselves, in our approach they were modeled more naturally as controllers that turn the columns of the mixing matrices on and off. In this paper, we build upon our previous work to handle the degenerate case in convoluted mixing. Moreover, various studies have confirmed that human listeners use similar strategies of exploit- 2010 978-1-4244-4296-6/10/$25.00 ©2010 IEEE ICASSP 2010