E. Corchado et al. (Eds.): HAIS 2012, Part I, LNCS 7208, pp. 255–266, 2012.
© Springer-Verlag Berlin Heidelberg 2012

A Max Metric to Evaluate a Cluster

Hosein Alizadeh 1, Hamid Parvin 2, Sajad Parvin 2, Zahra Rezaei 2, and Moslem Mohamadi 2

1 Islamic Azad University, Mahdi Shahr Branch, Mahdi Shahr, Iran
halizadeh@iust.ac.ir
2 Islamic Azad University, Nourabad Mamasani Branch, Mamasani Nourabad, Iran
hamidparvin@mamasaniiau.ac.ir, {s.parvin,rezaei,mohamadi}@iust.ac.ir

Abstract. This paper proposes a new criterion for cluster validation, which is used to estimate the goodness of a cluster. The clusters that satisfy a threshold on the proposed measure are selected to participate in the clustering ensemble. Several methods are employed as aggregators to combine the chosen clusters. The ensemble obtained with this new criterion is evaluated on several well-known standard datasets. The empirical studies show promising results for the ensemble obtained using the proposed criterion compared with the ensemble obtained using the standard cluster validation criterion. In addition to reaching the best results, the method provides an algorithm for selecting the best subset of clusters from a pool of clusters.

Keywords: Clustering Ensemble, Stability Measure, Extended EAC, Co-association Matrix, Cluster Evaluation.

1 Introduction

Data clustering, or unsupervised learning, is an important and very difficult problem. The objective of clustering is to partition a set of unlabeled objects into homogeneous groups or clusters [3], [4], [10]. Many applications use clustering techniques to discover latent structures in data, such as data mining [11], information retrieval [2], image segmentation [9], linkage learning [15], and machine learning. In real-world problems, clusters can appear with different shapes, sizes, degrees of data sparseness, and degrees of separation.
Clustering techniques require the definition of a similarity measure between patterns. Since there is no prior knowledge about cluster shapes, choosing a specific clustering method is not easy [16]. Studies in recent years have therefore tended toward combinational methods. Cluster ensemble methods attempt to find better and more robust clustering solutions by fusing information from several primary data partitions [8]. Fern and Lin [8] have suggested a clustering ensemble approach that selects a subset of the primary solutions to form a smaller ensemble that performs better than one using all primary solutions. The ensemble selection method is designed based on quality and diversity, the two factors that have been shown to influence cluster