International Journal of Neural Systems, Vol. 0, No. 0 (April, 2000) 00–00
© World Scientific Publishing Company

ADAPTIVE K-MEANS ALGORITHM FOR OVERLAPPED GRAPH CLUSTERING

Gema Bello-Orgaz†, Héctor D. Menéndez‡ and David Camacho*
Computer Science Department, Escuela Politecnica Superior, Universidad Autónoma de Madrid
28049, Madrid, Spain
†gema.bello@uam.es
‡hector.menendez@uam.es
*david.camacho@uam.es

Received (to be inserted Revised by Publisher)

The graph clustering problem has become highly relevant due to the growing interest of several research communities in social networks and their possible applications. Overlapped graph clustering algorithms try to find subsets of nodes that can belong to different clusters. In social-based applications it is quite usual for a node of the network to belong to different groups, or communities, in the graph. Therefore, algorithms trying to discover, or analyse, the behaviour of these networks need to handle this feature, detecting and identifying the overlapped nodes. This paper shows a soft clustering approach based on a genetic algorithm where a new encoding is designed to achieve two main goals. First, the automatic adaptation of the number of communities that can be detected. Second, the definition of several fitness functions that guide the searching process using some measures extracted from graph theory. Finally, our approach has been experimentally tested using the Eurovision contest dataset, a well-known social-based data network, to show how overlapped communities can be found using our method.

Keywords: graph clustering, overlapped clustering, genetic algorithms, clustering coefficient, community finding, social networks.

1. Introduction

The clustering problem can be described as a blind search on a collection of unlabelled data, where elements with similar features are grouped together in sets.
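As a concrete reading of "grouped together in sets", and of the overlapped nodes this work targets, the following is a minimal sketch, assuming the groups are already given as plain sets of node labels. The function name `overlapped_nodes`, the community names and the node labels are all hypothetical and are not taken from the paper or its dataset.

```python
from collections import defaultdict


def overlapped_nodes(communities):
    """Map each node that belongs to more than one community to its communities.

    `communities` is a dict {community_name: set_of_nodes}. This layout is an
    assumption made for illustration, not the encoding used in the paper.
    """
    membership = defaultdict(set)
    for name, nodes in communities.items():
        for node in nodes:
            membership[node].add(name)
    # Keep only nodes with two or more memberships; sort for a stable result.
    return {
        node: sorted(membership[node])
        for node in sorted(membership)
        if len(membership[node]) > 1
    }


# Hypothetical toy grouping: "c" and "e" each sit in two communities.
communities = {
    "c1": {"a", "b", "c"},
    "c2": {"c", "d", "e"},
    "c3": {"e", "f"},
}
overlaps = overlapped_nodes(communities)
print(overlaps)  # {'c': ['c1', 'c2'], 'e': ['c2', 'c3']}
```

In a disjoint grouping the function returns an empty dictionary, since no node appears in more than one set.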
There are three main techniques to deal with the clustering problem [32]: overlapping [12] (or non-exclusive), partitional [42] and hierarchical [37]. Overlapping clustering allows each element to belong to multiple clusters; partitional clustering consists of a disjoint division of the data, where each element belongs to a single cluster; and hierarchical clustering nests the clusters produced by a partitional clustering method into bigger partitions, grouping the clusters by hierarchical levels. This work focuses on overlapping clustering techniques, trying to "relax" a well-known classical partitional technique named K-means using a genetic algorithm approach. K-means is a clustering algorithm that uses a fixed number (K) of clusters and looks for the best division of the dataset (through a predefined metric or distance) into this number of groups.

Several clustering algorithms, such as K-means, have been improved using genetic algorithms [32]. A genetic algorithm is inspired by biological evolution [38]: the possible problem solutions are represented as individuals belonging to a population. The individuals are encoded using a set of chromosomes (called the genotype of the genome). Later these individuals are evolved, during a number of genera-
* Corresponding author.