International Journal of Neural Systems, Vol. 17, No. 6 (2007) 431–446
© World Scientific Publishing Company

MULTILAYER NONNEGATIVE MATRIX FACTORIZATION USING PROJECTED GRADIENT APPROACHES

ANDRZEJ CICHOCKI and RAFAL ZDUNEK
Laboratory for Advanced Brain Signal Processing
RIKEN Brain Science Institute, Wako-shi, Saitama 351-0198, Japan
cia@brain.riken.jp
zdunek@brain.riken.jp

The most popular algorithms for Nonnegative Matrix Factorization (NMF) belong to a class of multiplicative Lee-Seung algorithms, which usually have relatively low complexity but are characterized by slow convergence and the risk of getting stuck in local minima. In this paper, we present and compare the performance of additive algorithms based on three different variations of a projected gradient approach. Additionally, we discuss a novel multilayer approach to NMF algorithms combined with a multistart initialization procedure, which, in general, considerably improves the performance of all the NMF algorithms. We demonstrate that this approach (the multilayer system with projected gradient algorithms) can usually give much better performance than standard multiplicative algorithms, especially if the data are ill-conditioned, badly scaled, and/or the number of observations is only slightly greater than the number of nonnegative hidden components. Our new implementations of NMF are demonstrated with simulations performed for Blind Source Separation (BSS) data.

Keywords: Nonnegative matrix factorization; NMF; multilayer NMF; projected gradient algorithms; BSS.

1.
Introduction

NMF and its extended versions, nonnegative matrix deconvolution (NMD) and nonnegative tensor factorization (NTF), are relatively new and promising techniques with many potential scientific and engineering applications, including classification [1–5], clustering and segmentation of patterns [6–10], dimensionality reduction [11,12], face or object recognition [12–15], spectra recovery [16–18], language modeling, speech processing, data mining and data analysis, e.g., text analysis [19,20] and music transcription [5,17,21].

NMF is often able to recover hidden structures in the data and to provide biological insight. Depending on the application, the hidden structures may have different interpretations. For example, Lee and Seung in Ref. 6 introduced NMF as a method to decompose an image (face) into parts-based representations (parts reminiscent of features such as lips, eyes, nose, etc.). In blind source separation [22,23], the recovered components are unknown hidden (latent) nonnegative components that cannot be observed directly. In many cases, NMF performs dimensionality reduction, and the components retrieved in a low-dimensional space have an interpretation (pattern analysis) similar to that of, e.g., the components obtained with PCA.

The simplest linear model used in NMF is of the form:

Y = AX + V,    (1)

On leave from Warsaw University of Technology, IBS PAN, Polish Academy of Science, Warsaw, Poland.
On leave from Institute of Telecommunications, Teleinformatics, and Acoustics, Wroclaw University of Technology, Poland.