Manifold Clustering via Energy Minimization

Qiyong Guo 1, Hongyu Li 1,3, Wenbin Chen 2, I-Fan Shen 1 and Jussi Parkkinen 3

1 Department of Computer Science and Engineering
2 Department of Mathematics
Fudan University, Shanghai, China
3 Department of Computer Science and Statistics
University of Joensuu, Joensuu, Finland

Abstract

Manifold clustering aims to partition a set of input data into several clusters, each of which contains data points from a separate, simple low-dimensional manifold. This paper presents a novel solution to this problem. The proposed algorithm begins by randomly selecting some neighboring orders of the input data and defining an energy function that is described by geometric features of the underlying manifolds. By minimizing this energy with the tabu search method, an approximately optimal sequence can be found with ease, and the different manifolds are then separated by detecting crucial points, the boundaries between manifolds, along the optimal sequence. We have applied the proposed method to both synthetic data and real image data, and experimental results show that the method is feasible and promising for manifold clustering.

1. Introduction

Manifold learning from unorganized data has received a lot of attention in the machine learning and pattern recognition communities due to its potential in practical applications. Popular methods for this problem include isometric feature mapping (Isomap) [11], locally linear embedding (LLE) [8, 7], Hessian LLE [4], Laplacian Eigenmaps [1], semi-definite embedding (SDE) [12], local tangent space alignment (LTSA) [15] and some others [2]. Most of them share a basic framework consisting of three steps: (i) computing the neighborhood of each data point in the input space, (ii) constructing a correlation matrix for the input data, and (iii) finding a low-dimensional manifold embedding via the top or bottom eigenvectors of this matrix.
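The three steps above can be sketched as follows. This is only a minimal illustration in the style of Laplacian Eigenmaps, simplified to an unnormalized graph Laplacian; the function name `spectral_embedding` and its parameters are hypothetical and are not taken from any of the cited methods.

```python
import numpy as np

def spectral_embedding(X, n_neighbors=5, n_components=1):
    """Three-step spectral embedding sketch (assumes distinct input points)."""
    n = X.shape[0]
    # (i) Neighborhoods: k nearest neighbors under Euclidean distance.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    idx = np.argsort(d, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), n_neighbors), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)  # symmetrize the adjacency matrix
    # (ii) Matrix built from the neighborhoods: graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W
    # (iii) Bottom eigenvectors, skipping the constant one (eigenvalue 0).
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1]

# Usage: embed 30 points sampled along a half circle into one dimension.
t = np.linspace(0.0, np.pi, 30)
X = np.column_stack([np.cos(t), np.sin(t)])
Y = spectral_embedding(X, n_neighbors=4, n_components=1)
```

The sketch assumes the neighborhood graph is connected, so that only one eigenvector has eigenvalue zero. Laplacian Eigenmaps proper solves a generalized eigenproblem that also involves the degree matrix; the unnormalized form is used here only to keep the example short.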
Grouping the resulting embedding with a traditional clustering algorithm such as K-means then yields the so-called spectral clustering method, as in [9, 6].

For broader application, manifold clustering was proposed in [10] as an extension of manifold learning to classify unorganized data lying near multiple low-dimensional underlying manifolds. The algorithm in [10] first computes geodesic distances as in Isomap, and then uses the expectation-maximization (EM) approach to cluster points in terms of geodesic distances. Wu and Chan [14] also suggested an extended Isomap algorithm for manifold clustering, which computes within-class and between-class geodesic distances separately and obtains the final clustering from the augmented geodesic distance matrix by the MDS algorithm. Yankov and Keogh [13] introduced a modified Isomap algorithm to group shape data into several underlying manifolds and discover the intrinsic nonlinearity in shape data. In [3], Cao and Haralick described a nonlinear manifold clustering algorithm based on the geometrical invariance of data. Until now, however, none of these techniques can be considered a perfect solution to the problem of manifold clustering, and the field therefore remains an active area of research.

Unlike manifold learning, which primarily discovers a manifold embedding of data in the manner of dimension reduction techniques, manifold clustering aims to partition a set of data into several different clusters, each of which contains data points originating from a separate, simple low-dimensional manifold. As a clustering approach, manifold clustering can by nature be carried out without prior dimension reduction or manifold learning. In our work, it is accomplished by solving an energy optimization problem via the tabu search method.
At the same time, to embody characteristics of the underlying manifolds in the clustering process, we use geometric properties of the data, such as discrete curvature, to describe the energy function.

Our main contribution in this paper is a new framework for clustering multiple manifolds via energy minimization. At present, this work only considers simple cases in which multiple 1-D or 2-D manifolds are put together in a 2-D or 3-D space. More complex cases can be handled by slightly modifying the definition of the energy function, which is the topic of our future work.

Sixth International Conference on Machine Learning and Applications 0-7695-3069-9/07 $25.00 © 2007 IEEE DOI 10.1109/ICMLA.2007.43 375
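To make the idea of minimizing an energy over orderings of the data more concrete, a rough sketch follows. Everything in it is an assumption made for illustration: the energy (path length plus a weighted sum of turning angles, used here as a simple discrete curvature proxy), the pairwise-swap neighborhood, and the names `energy`, `tabu_search`, `lam` and `tenure`. The energy function actually used in this work is not given in the introduction and may differ.

```python
import itertools
import numpy as np

def energy(points, order, lam=1.0):
    """Hypothetical energy: path length plus lam times total turning angle.

    Assumes consecutive points in the ordering are distinct.
    """
    p = points[list(order)]
    seg = np.diff(p, axis=0)
    length = np.linalg.norm(seg, axis=1)
    # Discrete curvature proxy: angle between consecutive segments.
    cosang = np.sum(seg[:-1] * seg[1:], axis=1) / (length[:-1] * length[1:])
    turning = np.arccos(np.clip(cosang, -1.0, 1.0))
    return length.sum() + lam * turning.sum()

def tabu_search(points, order, n_iter=200, tenure=7, lam=1.0):
    """Minimize the energy over orderings with swap moves and a tabu list."""
    current = list(order)
    best, best_e = current[:], energy(points, current, lam)
    tabu = {}  # swap (i, j) -> last iteration at which it is still forbidden
    for it in range(n_iter):
        candidates = []
        for i, j in itertools.combinations(range(len(current)), 2):
            trial = current[:]
            trial[i], trial[j] = trial[j], trial[i]
            e = energy(points, trial, lam)
            # Aspiration: a tabu move is allowed if it beats the best so far.
            if tabu.get((i, j), -1) < it or e < best_e:
                candidates.append((e, i, j, trial))
        if not candidates:
            break  # every move is tabu and none improves on the best
        # Take the best admissible move even if it worsens the energy.
        e, i, j, current = min(candidates, key=lambda c: c[0])
        tabu[(i, j)] = it + tenure
        if e < best_e:
            best, best_e = current[:], e
    return best, best_e

# Usage: recover a smooth ordering of four collinear points.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
best, best_e = tabu_search(pts, [0, 2, 1, 3])
```

Accepting the best non-tabu move even when it increases the energy is what lets tabu search leave local minima, while the tabu list keeps it from immediately undoing recent swaps. The exhaustive swap neighborhood is only practical for small inputs.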