Improved MinMax Cut Graph Clustering with Nonnegative Relaxation

Feiping Nie, Chris Ding, Dijun Luo, and Heng Huang

Department of Computer Science and Engineering, University of Texas, Arlington, America
{feipingnie,dijun.luo}@gmail.com, {chqding,heng}@uta.edu

Abstract. Among graph clustering methods, MinMax Cut tends to produce more balanced clusters than Ratio Cut and Normalized Cut. The traditional approach solves the graph cut problem by spectral relaxation. Its main disadvantage is that the spectral solution has mixed signs, so it can deviate severely from the true solution, and a further clustering method such as K-means is needed to obtain the final clusters. In this paper, we propose adding a nonnegative constraint to MinMax Cut graph clustering and introduce novel algorithms to optimize the new objective. With the explicit nonnegative constraint, our solutions are very close to the ideal class indicator matrix and can be used to assign clusters to data points directly. We present efficient algorithms that solve the new problem rigorously under the nonnegative constraint. Experimental results show that our new algorithm always converges and significantly outperforms the traditional spectral relaxation approach on ratio cut and normalized cut.

Keywords: Spectral clustering, Normalized cut, MinMax cut, Nonnegative relaxation, cluster balance, random graphs.

1 Introduction

Clustering is an important task in machine learning and data mining. In the past decades, many clustering algorithms have been proposed, such as K-means clustering, spectral clustering and its variants [1,2,3], support vector clustering [4], and maximum margin clustering [5,6,7].
Among them, graph cut clustering, which exploits manifold information, has shown state-of-the-art clustering performance and has been widely applied in areas such as image segmentation [8], white matter fiber tracking in biomedical images [9], and protein sequence clustering [10].

J.L. Balcázar et al. (Eds.): ECML PKDD 2010, Part II, LNAI 6322, pp. 451–466, 2010. © Springer-Verlag Berlin Heidelberg 2010

MinMax Cut was proposed in [11] and showed more compact and balanced clustering results than Ratio Cut [12] and Normalized Cut [8], because the MinMax Cut method explicitly maximizes the within-cluster similarities. Solving the graph cut clustering problem is a nontrivial task. The main difficulty lies in the constraints on the solution. To make the problem tractable, the constraints should be relaxed. Traditional