JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 26, 649-658 (2010)

A Novel Spectral Clustering Method Based on Pairwise Distance Matrix

CHI-FANG CHIN¹, ARTHUR CHUN-CHIEH SHIH² AND KUO-CHIN FAN¹,³

¹ Institute of Computer Science and Information Engineering
National Central University
Chungli, 320 Taiwan
E-mail: annking@iis.sinica.edu.tw

² Institute of Information Science
Academia Sinica
Taipei, 115 Taiwan
E-mail: arthur@iis.sinica.edu.tw

³ Department of Informatics
Fo Guang University
Ilan, 262 Taiwan
E-mail: kcfan@csie.ncu.edu.tw

In general, a similarity measure is indispensable for most traditional spectral clustering algorithms, since these algorithms typically begin with the pairwise similarity matrix of a given dataset. However, a common type of input for most clustering applications is the pairwise distance matrix. In this paper, we propose a distance-based spectral clustering method that makes no assumption about either a suitable similarity measure or prior knowledge of the number of clusters. The kernel of distance-based spectral clustering is the symmetric LoG weighted matrix, constructed by applying the Laplace operator to the pairwise distance matrix. The main difference from traditional spectral clustering is that the pairwise distance matrix can be employed directly, without first being transformed into a pairwise similarity matrix. Moreover, the inter-cluster structure is embedded and the intra-cluster pairwise relationships are maximized in the proposed method to increase its discrimination capability in extracting clusters. Experiments were conducted on different types of test datasets, and the results demonstrate the correctness of the extracted clusters. Furthermore, the proposed method is also verified to be robust to noisy datasets.

Keywords: spectral clustering, Laplace operator, LoG weighted matrix, pairwise distance matrix, PoD histogram

Received March 14, 2008; revised May 23, 2008; accepted June 5, 2008. Communicated by H. Y. Mark Liao.

1. INTRODUCTION

Clustering techniques play an important role in data analysis in many fields. An abundance of clustering algorithms has been reported in the literature. However, the dependence of clustering results on the data distribution and the need for prior knowledge of the number of clusters remain two open problems [1-3]. In recent years, research on spectral clustering has received tremendous attention [4-8]. The most recent survey is given in [8]. The prototype of spectral clustering was first presented by Donath and Hoffman [9] and Fiedler [10, 11]. They introduced the original idea of using eigenvalue decomposition for graph partitioning. The kernel of spectral clustering is a graph Laplacian matrix, which is derived from a similarity graph constructed from a set of data points. Fiedler [10] had