Manifold Clustering via Energy Minimization
Qiyong Guo¹, Hongyu Li¹,³, Wenbin Chen², I-Fan Shen¹ and Jussi Parkkinen³
¹ Department of Computer Science and Engineering, Fudan University, Shanghai, China
² Department of Mathematics, Fudan University, Shanghai, China
³ Department of Computer Science and Statistics, University of Joensuu, Joensuu, Finland
Abstract
Manifold clustering aims to partition a set of input data
into several clusters, each of which contains data points from
a separate, simple low-dimensional manifold. This paper
presents a novel solution to this problem. The proposed
algorithm begins by randomly selecting some neighboring
orders of the input data and defining an energy function
described by geometric features of the underlying manifolds.
Minimizing this energy with the tabu search method yields an
approximately optimal sequence, and the different manifolds
are then separated by detecting crucial points, the boundaries
between manifolds, along this sequence. We have applied the
proposed method to both synthetic data and real image data,
and the experimental results show that the method is feasible
and promising for manifold clustering.
1. Introduction
Manifold learning from unorganized data has received
a lot of attention in the machine learning and pattern
recognition communities due to its potential in practical
applications. Popular methods for this problem include
isometric feature mapping (Isomap) [11], locally linear
embedding (LLE) [8, 7], Hessian LLE [4], Laplacian Eigenmaps [1],
semi-definite embedding (SDE) [12], local tangent space
alignment (LTSA) [15] and some others [2]. Most of them
share a basic framework consisting of three steps: (i) computing
the neighborhood of each data point in the input space,
(ii) constructing a correlation matrix for the input data, and
(iii) finding a low-dimensional manifold embedding via the top
or bottom eigenvectors of this matrix. If the manifold embedding
is further grouped with a traditional clustering algorithm
such as K-means, one obtains the so-called spectral clustering
method, as in [9, 6].
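The three-step framework followed by K-means can be sketched as below. This is a minimal illustration under stated assumptions, not the procedure of any cited paper: it uses an unweighted k-nearest-neighbor graph, takes the graph Laplacian as the matrix of step (ii) (in the spirit of Laplacian Eigenmaps [1]), and uses its bottom eigenvectors in step (iii). The function names, the parameter `k`, and the farthest-point initialization of K-means are all illustrative choices.

```python
import numpy as np

def knn_graph(X, k):
    """Step (i): symmetric k-nearest-neighbor adjacency matrix with 0/1 weights."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    W = np.zeros_like(d)
    for i in range(len(X)):
        nbrs = np.argsort(d[i])[1:k + 1]  # position 0 is the point itself
        W[i, nbrs] = 1.0
    return np.maximum(W, W.T)  # make the graph undirected

def spectral_embedding(W, dim):
    """Steps (ii)-(iii): graph Laplacian and its bottom `dim` eigenvectors."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    return vecs[:, :dim]

def kmeans(Y, n_clusters, n_iter=50):
    """Plain K-means with deterministic farthest-point initialization."""
    centers = [Y[0]]
    for _ in range(1, n_clusters):
        d = np.min([np.linalg.norm(Y - c, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(Y[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dist, axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = Y[labels == c].mean(axis=0)
    return labels

def spectral_clustering(X, n_clusters, k):
    """Embed the data with the three steps above, then group the embedding."""
    Y = spectral_embedding(knn_graph(X, k), n_clusters)
    return kmeans(Y, n_clusters)
```

For example, calling `spectral_clustering(X, n_clusters=2, k=5)` on two well-separated point sets produces one label per set, because the neighborhood graph then splits into two connected components and the bottom eigenvectors are constant on each component.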
For broader application, manifold clustering was
proposed in [10] as an extension of manifold learning to
classify unorganized data lying near multiple low-dimensional
underlying manifolds. The algorithm in [10]
first computes geodesic distances as in Isomap, and then
uses the expectation-maximization (EM) approach to
cluster points in terms of these geodesic distances. Wu and Chan
[14] also suggested an extended Isomap algorithm for manifold
clustering, which computes within-class and between-class
geodesic distances separately and obtains the final
clustering from the augmented geodesic distance matrix by
multidimensional scaling (MDS). Yankov and Keogh [13] also
introduced a modified Isomap algorithm to group shape data into
several underlying manifolds and discover the intrinsic
nonlinearity in shape data. In [3], Cao and Haralick described
a nonlinear manifold clustering algorithm based on the
geometrical invariance of data. So far, however, none
of these techniques can be considered a perfect solution to
the problem of manifold clustering, and the field therefore
remains an active area of research.
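Several of the methods above start from Isomap-style geodesic distances. A minimal sketch of that shared step follows, assuming the usual approximation: build a k-nearest-neighbor graph with Euclidean edge weights and take shortest-path lengths in that graph as geodesic distances. The function names and the parameter `k` are illustrative and do not come from the cited papers.

```python
import heapq
import math

def knn_edges(points, k):
    """Undirected k-nearest-neighbor graph with Euclidean edge weights."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        dists = sorted((math.dist(points[i], points[j]), j)
                       for j in range(n) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d
    return graph

def geodesic_distances(graph, source):
    """Dijkstra shortest-path lengths from `source`; unreachable nodes get inf."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

On points sampled along a unit half circle, the graph distance between the two end points comes out close to the arc length π rather than the straight-line distance 2, which is the property that makes geodesic distances useful for data lying on curved manifolds.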
Unlike manifold learning, which, like dimension reduction
techniques, primarily discovers a manifold embedding of the
data, manifold clustering aims to partition a set of data
into several clusters, each of which contains data
points originating from a separate, simple low-dimensional
manifold. As a clustering approach by nature, manifold
clustering can be carried out without prior
dimension reduction or manifold learning. In our work, it
is accomplished by solving an energy optimization problem
via the tabu search method. At the same time, to embody the
characteristics of the underlying manifolds in the clustering
process, we use geometrical properties of the data, such
as discrete curvature, to describe the energy function.
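To make the idea of minimizing a curvature-based energy with tabu search concrete, the sketch below works on 2-D points. It is an illustration only: the energy used here (path length plus total turning angle along an ordering of the points) is a hypothetical stand-in, since the actual energy function is not defined in this section, and the swap neighborhood, `tenure`, and `weight` parameters are likewise assumptions.

```python
import itertools
import math

def turning_angle(a, b, c):
    """Discrete curvature proxy at b: angle between segments a->b and b->c (2-D)."""
    u = (b[0] - a[0], b[1] - a[1])
    v = (c[0] - b[0], c[1] - b[1])
    nu, nv = math.hypot(*u), math.hypot(*v)
    if nu == 0.0 or nv == 0.0:
        return 0.0
    cos_t = (u[0] * v[0] + u[1] * v[1]) / (nu * nv)
    return math.acos(max(-1.0, min(1.0, cos_t)))

def energy(points, order, weight=1.0):
    """Hypothetical energy: path length plus weighted total turning angle."""
    seq = [points[i] for i in order]
    length = sum(math.dist(p, q) for p, q in zip(seq, seq[1:]))
    bend = sum(turning_angle(a, b, c) for a, b, c in zip(seq, seq[1:], seq[2:]))
    return length + weight * bend

def tabu_search(points, order, n_iter=100, tenure=5):
    """Minimize `energy` over orderings using pairwise swaps and a short tabu list."""
    current = list(order)
    best, best_e = list(current), energy(points, current)
    tabu = []  # recently applied swaps, forbidden for `tenure` iterations
    for _ in range(n_iter):
        candidates = []
        for i, j in itertools.combinations(range(len(current)), 2):
            trial = list(current)
            trial[i], trial[j] = trial[j], trial[i]
            e = energy(points, trial)
            # Aspiration rule: a tabu swap is allowed if it beats the best so far.
            if (i, j) not in tabu or e < best_e:
                candidates.append((e, (i, j), trial))
        if not candidates:
            break
        # Take the best admissible move even if it increases the energy.
        e, move, current = min(candidates, key=lambda c: c[0])
        tabu.append(move)
        if len(tabu) > tenure:
            tabu.pop(0)
        if e < best_e:
            best, best_e = list(current), e
    return best, best_e
```

Accepting the best admissible move even when it raises the energy, while forbidding recently used swaps, is what lets tabu search leave local minima that a purely greedy descent would get stuck in.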
Our main contribution in this paper is a new
framework for clustering multiple manifolds via energy
minimization. At present, this work considers only
simple cases in which multiple 1-D or 2-D manifolds are put
together in a 2-D or 3-D space. More complex cases can be
handled by slightly modifying the definition of the energy
function, which is the topic of our future work.
Sixth International Conference on Machine Learning and Applications
0-7695-3069-9/07 $25.00 © 2007 IEEE
DOI 10.1109/ICMLA.2007.43
375