Machine Learning Based Unified Framework for Diabetes Prediction

S M Hasan Mahmud
Daffodil International University, Dhaka-1207, Bangladesh
UESTC, Chengdu-611731, China
hasan.swe@daffodilvarsity.edu.bd

Sheak Rashed Haider Noori
Department of Computer Science, Daffodil International University, Dhaka-1207, Bangladesh
drnoori@daffodilvarsity.edu.bd

Md Altab Hossin
Dept. of Management Science & Eng., University of Electronic Science and Technology of China, Chengdu, China
altabbd@163.com

Md Nazirul Islam Sarkar
School of Public Administration, Sichuan University, Chengdu-610065
sarker.scu@yahoo.com

Md. Razu Ahmed
Department of Software Engineering, Daffodil International University, Dhaka-1207, Bangladesh
razu35-1072@diu.edu.bd

ABSTRACT
Machine learning has gained a significant position in healthcare services (HCS) because of its ability to improve disease prediction. Machine learning techniques and artificial intelligence have already been applied in the HCS area. Diabetes has recently become a notable chronic public health problem worldwide. Its prevalence is growing rapidly because of unhealthy lifestyles, greater consumption of junk food, and a lack of health awareness. There is therefore a need for a framework that can effectively track and monitor people's diabetes and health condition through an application. In this study, we propose a framework for real-time diabetes prediction, monitoring and application (DPMA). Our objective is to develop an optimized and efficient machine learning (ML) application that can effectively recognize and predict the condition of diabetes. In this work, five important machine learning classification techniques are considered for predicting diabetes. We use several evaluation criteria to investigate the performance of these classification techniques. In addition, the performance of the classification techniques is evaluated using 10-fold cross-validation.
The results show that Naïve Bayes outperformed the other classifiers, obtaining an F1 measure of 0.74.

CCS Concepts
• Computing methodologies ➝ Machine learning technology; Artificial intelligence; Disease prediction in healthcare services

Keywords
Machine Learning; Classification; Supervised Learning; Diabetes Prediction; Disease Prediction.

1. INTRODUCTION
Diabetes mellitus (DM) is defined as a group of metabolic diseases in which a person has elevated blood sugar levels. Diabetes is a chronic disease that occurs when the body cannot effectively use the insulin it produces. As a result, the disease increases the risk of malfunction of different organs, especially the eyes, kidneys, nerves, heart, and blood vessels [1]. According to a World Health Organization (WHO) report, diabetes will be the seventh leading cause of death by 2030 [2]. About 642 million adults (1 in 10 adults) are projected to have diabetes by 2040 [3]. Diabetes directly caused around 1.6 million deaths in 2015, and a further 2.2 million deaths were attributable to high blood glucose in 2012 [4]. Diabetes mellitus does not depend on age; it can occur at any time of life. There are three types of diabetes [4]: i) juvenile or childhood diabetes (type 1 diabetes), ii) type 2 or adult diabetes, and iii) gestational diabetes, referred to here as type 3. Gestational diabetes is hyperglycemia that occurs because of hormonal changes during pregnancy. Type 1 diabetes generally results from a lack of insulin production and is usually diagnosed at a young age [4]. Type 2 is a very common form of diabetes and affects a large number of people around the world [5]. Type 2 diabetes is largely the result of excess body weight and physical inactivity. Neither type 1 nor type 2 diabetes can be completely cured, but early diagnosis and simple lifestyle measures can help prevent it.
Moreover, many new cases of diabetes arise in developing countries [5]. Bangladesh has an alarming number of people affected by diabetes, a number projected to climb to more than 16 million by 2020 [6].

Over the last few decades, the volume of data has grown on a vast scale in diverse fields [7] [8], including medicine. Machine learning is a discipline that aims to solve a range of important biomedical problems [9]. Machine learning based classification techniques are among the most effective methods for both real-life and scientific problems [10]. The use of these classification-based approaches in the diagnosis and treatment of diseases can significantly decrease medical errors and human costs. As described in [11], machine learning based classification techniques show promising prediction accuracy compared with other algorithms for data classification. Classification accuracy may vary depending on the machine learning technique used.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Conference’18, Month 8, 2018, Chengdu, Sichuan, China. Copyright 2018 ACM 1-58113-000-0/00/0010 …$15.00. DOI: http://dx.doi.org/

Many researchers have studied diabetes from various perspectives, and most of these studies discuss classification techniques for diabetes prediction and their accuracy [11-13]. However, to the best of the authors' knowledge, no