Received 22 February 2023, accepted 12 March 2023, date of publication 22 March 2023, date of current version 29 March 2023.

Digital Object Identifier 10.1109/ACCESS.2023.3260652

Community Detection Algorithms in Healthcare Applications: A Systematic Review

MEHRDAD ROSTAMI 1, MOURAD OUSSALAH 1,2 (Senior Member, IEEE), KAMAL BERAHMAND 3, AND VAHID FARRAHI 1,2

1 Center for Machine Vision and Signal Analysis (CMVS), Faculty of Information Technology and Electrical Engineering, University of Oulu, 90570 Oulu, Finland
2 Research Unit of Health Sciences and Technology, Faculty of Medicine, University of Oulu, 90570 Oulu, Finland
3 School of Computer Sciences, Science and Engineering Faculty, Queensland University of Technology (QUT), Brisbane, QLD 4000, Australia

Corresponding author: Mehrdad Rostami (Mehrdad.Rostami@oulu.fi)

This work was supported in part by the Academy of Finland Profi5 DigiHealth-Project, a Strategic Profiling Program at the University of Oulu under Project 326291; and in part by the Ministry of Education and Culture under Grant OKM/20/626/2022 and Grant OKM/76/626/2022.

ABSTRACT Over the past few years, the number and volume of data sources in healthcare databases have grown exponentially. Analyzing these voluminous medical data is both an opportunity and a challenge for knowledge discovery in health informatics. In the last decade, social network analysis techniques and community detection algorithms have been used increasingly in scientific fields, including healthcare and medicine. While community detection algorithms have been widely used for social network analysis, a comprehensive review of their applications in healthcare that benefits both health practitioners and the health informatics community is still missing. This paper helps fill this gap by providing a comprehensive and up-to-date literature review. In particular, categorizations of existing community detection algorithms are presented and discussed.
Moreover, most applications of social network analysis and community detection algorithms in healthcare are reviewed and categorized. Finally, publicly available healthcare datasets, key challenges, and knowledge gaps in the field are examined.

INDEX TERMS Social network analysis, community detection, graph theory, healthcare application, medical data analysis.

I. INTRODUCTION

A. SOCIAL NETWORKS AND HEALTHCARE

The amount of data accumulated in healthcare databases has increased substantially in the last decade [1], [2]. This is mainly due to the trend toward increased digitalization and the growing enforcement of policies for maintaining patient data, either in dynamic databases or in offline repositories, for a limited time period. Typically, large-scale healthcare data arise from at least four primary sources: electronic health records [3], medical imaging data [4], unstructured clinical notes [5], and genetic data [6]. Intuitively, analyzing such a huge amount of data creates new opportunities for knowledge discovery, enabling us to gain new insights and generate new hypotheses that can eventually lead to improvements in health services. The use of e-Health has also been promoted by the World Health Organization (WHO) as an efficient way to achieve the 3rd United Nations Sustainable Development Goal of Good Health and Well-being [7]. Electronic Health Records, Patient Health Records, and Mobile Health are all subdomains of e-Health. The goal of e-Health is to provide affordable, secure, and efficient health services through digital and smart technologies. The use of e-Health ultimately benefits both individuals and organizations. For example, e-Health data analysis facilitates the delivery of health care services in developing countries [8], [9].

The associate editor coordinating the review of this manuscript and approving it for publication was Kostas Kolomvatsos.
VOLUME 11, 2023. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/

Although the massive amount of healthcare data provides valuable opportunities to improve patients' health experience, it also introduces various sources of uncertainty (e.g., missing data, spelling errors, inefficient updates, anomalous patterns), which may lead to abusive activities and access restrictions [17], [18], [19], [20]. The discrepancies among, and the variety of, the data sources used in healthcare create additional challenges