A New Kernel Non-Negative Matrix Factorization
and Its Application in Microarray Data Analysis
Yifeng Li and Alioune Ngom
Abstract— Non-negative matrix factorization (NMF) has been a
popular machine learning method for analyzing microarray data.
Kernel approaches can capture more non-linear discriminative
features than linear ones. In this paper, we propose a novel
kernel NMF (KNMF) approach for feature extraction and
classification of microarray data. Our approach is also generalized
to kernel high-order NMF (KHONMF). Extensive experiments
on eight microarray datasets show that our approach generally
outperforms the traditional NMF and existing KNMFs. A preliminary
experiment on a high-order microarray dataset shows that
our KHONMF is a promising approach given a suitable kernel
function.
Index Terms— Kernel Non-Negative Matrix Factorization,
Microarray Data, Classification, Feature Extraction.
I. INTRODUCTION
NON-NEGATIVE matrix factorization (NMF) has been
an important machine learning approach since the work
of Lee and Seung [1]. It generally decomposes a non-negative
matrix $X \in \mathbb{R}^{m \times n}$ into two $k$-rank ($k \le m, n$) non-negative
factors $A \in \mathbb{R}^{m \times k}$ and $Y \in \mathbb{R}^{k \times n}$, as formulated in
Equation 1:
$$X_{+} \approx A_{+} Y_{+}, \qquad (1)$$
where $M_{+}$ indicates that matrix $M$ is non-negative. Each column
of $X$ is approximated by a non-negative linear combination of the
columns of $A$, where the coefficients are given by the corresponding
column of $Y$; therefore $A$ is called the basis matrix, and $Y$ is called the
coefficient matrix. NMF sometimes generates sparse factors
which are very useful for interpretation. Optimization algorithms,
such as multiplicative update rules [2] and non-negative
least squares [3], have been devised to solve the non-convex
problem in Equation 1. Many variants, including sparse-NMF
[4], semi-NMF [5], convex-NMF [5], orthogonal-NMF [6],
and weighted-NMF [7], have been proposed in the literature.
Two kernel NMF (KNMF) extensions have been proposed
in [17] and [5]. We shall introduce these two approaches in
Section II. NMF can be applied to clustering [8], feature
extraction [9], feature selection [10], and classification [11].
NMF has also been generalized to high-order
NMF (HONMF) to factorize tensor data in [12]. The definition
of a tensor will be given later.
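As an illustration of the factorization in Equation 1, the following is a minimal Python sketch of NMF solved with Lee and Seung's multiplicative update rules for the Frobenius-norm objective. It is not the kernel method proposed in this paper. The symbol names `X`, `A`, and `Y` (data, basis, and coefficient matrices), the function name `nmf_multiplicative`, and the parameters `n_iter`, `eps`, and `seed` are assumptions made for this example only.

```python
import numpy as np

def nmf_multiplicative(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate a non-negative X (m x n) by A @ Y, with A (m x k) and Y (k x n).

    Hypothetical helper for illustration: plain multiplicative updates that
    minimize ||X - A Y||_F^2 while keeping both factors non-negative.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    # Random non-negative initialization; eps avoids exact zeros.
    A = rng.random((m, k)) + eps
    Y = rng.random((k, n)) + eps
    for _ in range(n_iter):
        # Update the coefficient matrix with the basis matrix held fixed.
        Y *= (A.T @ X) / (A.T @ A @ Y + eps)
        # Update the basis matrix with the coefficient matrix held fixed.
        A *= (X @ Y.T) / (A @ Y @ Y.T + eps)
    return A, Y

# Synthetic non-negative data of rank at most 3 (made up for the example).
data_rng = np.random.default_rng(1)
X = data_rng.random((20, 3)) @ data_rng.random((3, 12))

A, Y = nmf_multiplicative(X, k=3)
rel_err = np.linalg.norm(X - A @ Y) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")
```

Each column of `Y` holds the coefficients that combine the columns of `A` to approximate the matching column of `X`. Because the updates only multiply by non-negative ratios, the factors remain non-negative without an explicit projection step.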
Yifeng Li and Alioune Ngom are with the School of Computer Science,
University of Windsor, Windsor, Ontario, Canada (email: {li11112c,
angom}@uwindsor.ca).
This research has been supported by IEEE CIS Walter Karplus Summer
Research Grant 2010, Ontario Graduate Scholarship 2011-2012, and
Canadian NSERC Grants #RGPIN228117-2011.
Microarray technology has been under development for over a
decade [13]. It can conveniently monitor the activities of
thousands of genes by measuring the abundance of
the corresponding mRNA. Numerous microarray datasets
have been produced from diverse tissues and species under
different conditions for various purposes. We categorize them
into three types. If the gene expression levels of different
samples are measured once, the result is static gene-sample
data. If snapshots of the gene activities of one
or multiple similar samples are taken at a sequence of time
points, a gene-time-series dataset is produced. The third type,
high-order tensor data, is much more complicated.
In tensor/multilinear algebra, a tensor
is the generalization of the matrix and vector of matrix/linear
algebra [14]. The order of a tensor is the number of axes
needed to hold it. A vector is a 1-order tensor. A matrix
is a 2-order tensor. The aforementioned gene-sample and
gene-time data are hence 2-order tensors. A gene-sample-time
(GST) dataset is a 3-order tensor. GST data are the
combination of gene-sample and gene-time data: they arise
when the gene expression levels of different samples
are measured across time, so that each sample forms
a gene-time matrix. Microarray technology has been widely
applied in laboratories for genomic studies and medical
diagnosis. Machine learning is the main computational tool
to analyze microarray data. Clustering samples or genes
can discover subtypes of a disease and genomic patterns.
Feature selection can be applied to biomarker identification.
New discriminative features as the combination of existing
features can be generated by feature extraction. Classification
approaches coupled with feature selection or feature extraction
are applied to predict diseases. However, microarray data pose
many issues, including high noise,
missing values, high dimensionality, and sparse and few sampling
time points, to name a few. These issues lead to many
challenging computational problems such as low accuracy,
expensive computational cost, mathematical difficulty, and poor
scalability. NMF has been applied as an important
machine learning tool for clustering [8], feature
extraction [10], feature selection [10], and classification
[15] in microarray data analysis. HONMF has also been
used as a novel feature extraction method for GST data
in drug/dose response prediction [16]. Generally speaking,
kernel approaches can capture more nonlinear information
than their linear counterparts, and therefore might improve
the performance of applications. In this paper, we propose
a new kernel approach, which is an extension of semi-NMF,
and apply it to feature extraction and classification for
gene-sample data. We also propose a kernel HONMF approach
and use it as a feature extraction method for GST
data.
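The tensor terminology used in this section for gene-sample, gene-time, and GST data can be made concrete with a small NumPy sketch. The array sizes, the random values, and the genes × time points × samples axis ordering are assumptions chosen only for illustration; they are not taken from the datasets analyzed in this paper.

```python
import numpy as np

# Hypothetical sizes: 5 genes, 4 samples, 3 time points.
n_genes, n_samples, n_times = 5, 4, 3
rng = np.random.default_rng(0)

# Expression of all genes in one sample at one time point: a vector.
expression_vector = rng.random(n_genes)
# Static gene-sample data: each sample is measured once.
gene_sample = rng.random((n_genes, n_samples))
# Gene-time-series data: one sample followed over time.
gene_time = rng.random((n_genes, n_times))
# Gene-sample-time (GST) data; the axis order here is an assumption.
gst = rng.random((n_genes, n_times, n_samples))

# The order of a tensor is the number of axes needed to hold it.
orders = {
    "vector": expression_vector.ndim,
    "gene-sample": gene_sample.ndim,
    "gene-time": gene_time.ndim,
    "GST": gst.ndim,
}
print(orders)

# Fixing the sample index yields that sample's gene-time matrix.
sample_0 = gst[:, :, 0]
print(sample_0.shape)
```

Slicing the 3-order array along its sample axis recovers one 2-order gene-time matrix per sample, which matches the description of GST data as a combination of gene-sample and gene-time data.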
978-1-4673-1191-5/12/$31.00 ©2012 IEEE