Evaluation Metrics for Overlapping Community Detection

Safa El Ayeb*†, Baptiste Hemery*, Fabrice Jeanne*, Estelle Cherrier and Christophe Charrier
* Orange, Caen, France
Email: {safa.elayeb, baptiste.hemery, fabrice.jeanne}@orange.com
Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, 14000 Caen, France
Email: {estelle.cherrier, christophe.charrier}@ensicaen.fr

Abstract—Networks provide a representation for a wide range of real systems, including communication flows, money transfers and biological systems, to mention just a few. Communities are fundamental structures for understanding the organization of real-world networks. Uncovering coherent groups in these networks is the goal of community detection. A community is a mesoscopic structure whose nodes are more heavily connected to each other than to the nodes in other groups. Communities may also overlap, sharing one or multiple nodes. Evaluating the results of a community detection algorithm is an equally important task. This paper introduces metrics for evaluating overlapping community detection. The new metrics are motivated by the lack of efficiency and adequacy of state-of-the-art metrics for overlapping communities. They are tested on both simulated data and standard datasets, and are compared with existing metrics.

Index Terms—Social Network Analysis, Overlapping community detection, evaluation metric.

I. INTRODUCTION

Social network analysis has received tremendous attention over the past decade. Its main objective is to understand the behavior of individuals based on their interactions. Network analysis has attracted significant interest due to its potential to handle many real-world case studies [1], [2]. In particular, community detection has become a fundamental and highly relevant research area in network science [3].
Therefore, a substantial number of community detection algorithms have been developed across varied disciplines such as statistics, physics, biology and sociology. The result of community detection is a partition with disjoint, overlapping, fuzzy, or hierarchical communities. To evaluate and compare community detection algorithms, the literature has given much attention to evaluation metrics [4], [5]. Evaluation metrics can be either quality metrics, which assess the structural quality of communities, or information recovery metrics, which compare the result to a gold standard, also called ground truth. Despite the number of evaluation metrics in the literature, very few are applicable to overlapping communities. A simple, easy-to-interpret metric is important when dealing with community detection algorithms.

In this paper, we propose four information recovery metrics for overlapping community detection results. Each of the proposed metrics considers a specific aspect of the network and is designed to provide a clear explanation. Our goal is to overcome the classical drawback of standard information recovery metrics, namely the difficulty of interpreting their results.

This paper is organized as follows. Section II presents preliminary definitions about community detection and evaluation metrics. In Section III, we illustrate the proposed metrics and their properties. Finally, Section IV analyses several tests of the performance of the proposed metrics on both synthetic and real-world networks.

The authors would like to thank Orange and the ANRT for funding this work.

II. BACKGROUND

A. Overlapping communities detection

One of the most important applications of network analysis is the search for dense groups, also called communities. Community detection in networks has attracted a lot of interest during the last decade [2], [3], [5].
Although the notion of community is not precisely defined, the general consensus is that a community is a group of densely connected vertices that either share some properties or play similar roles inside the network, as stated by the authors of [5]. Depending on the characteristics of the network, community detection may yield disjoint communities, overlapping communities, dynamic communities, etc. Although most of the work in the literature focuses on disjoint communities, more efforts are now oriented toward overlapping communities. In this paper, we focus on overlapping community detection. Unlike crisp communities, overlapping communities may share one or more nodes. A node can simultaneously be part of multiple communities of different scopes and levels, such as family, friends, work, city, etc. [6]. Overlapping communities have been studied in the literature in various contexts such as biology [7], e-commerce [8], mobile networks [2], etc. For a complete study of overlapping community detection, we refer the reader to [9].

B. Evaluation Measures

One of the biggest challenges related to community detection is the ability to evaluate the generated results. Evaluation is a real issue for real networks, where only little data is available. Evaluation metrics in this area can be employed either to assess the performance of a community detection

2022 IEEE 47th Conference on Local Computer Networks (LCN) | 978-1-6654-8001-7/22/$31.00 ©2022 IEEE | DOI: 10.1109/LCN53696.2022.9843473