Acta Scientiarum
http://periodicos.uem.br/ojs
ISSN on-line: 1807-8664
Doi: 10.4025/actascitechnol.v46i1.64783
COMPUTER SCIENCE
Acta Scientiarum. Technology, v. 46, e64783, 2024

The diabacare cloud: predicting diabetes using machine learning

Mehtab Alam*, Ihtiram Raza Khan, Mohammad Afshar Alam, Farheen Siddiqui and Safdar Tanweer

Department of Computer Science and Engineering, School of Engineering Sciences & Technology, Jamia Hamdard, New Delhi, India. *Author for correspondence. E-mail: mahiealam@gmail.com

ABSTRACT. Machine learning (ML) is now applied across nearly every sector of human life, including healthcare, finance, bioinformatics, data science, mechanical engineering, agriculture, and smart cities. ML consists of supervised and unsupervised techniques. Because data are available in abundance, supervised ML has been the preferred method in the field of data mining. In this research paper, a publicly available dataset for diabetes detection is used to compare the classification performance of several supervised ML algorithms and to identify the most accurate model. The dataset comprises records of 768 persons, of whom 500 were controls and 268 were patients. The Random Forest algorithm outperformed the other six classification algorithms. In the first iteration, the Random Forest algorithm reached 78.44% accuracy. The tweaks performed in this paper improved on the original Random Forest algorithm by 1.08 percentage points, reaching a score of 79.52%. Further, iteration I gave 171 and iteration II gave 173 correct predictions out of the 218 test records.

Keywords: machine learning; artificial intelligence; diabetes; ML; AI; random forest.

Received on August 25, 2022. Accepted on March 14, 2023.

Introduction

Diabetes is one of the most common chronic diseases and can prove life-threatening if not treated in time.
The defining attribute of the disease is a high level of glucose in the blood. The increased glucose level is due to inadequate insulin secretion by the pancreas and/or the diminished biological effects of insulin (Lonappan et al., 2007). In extreme cases diabetes can lead to the death of the patient; in less severe cases it can cause chronic damage to, and impaired functioning of, vital organs such as the blood vessels, heart, eyes, kidneys, and nerves (Krasteva, Panov, Krasteva, Kisselova, & Krastev, 2011; Parveen, Sehar, Bajpai, & Agarwal, 2020).

Diabetes is classified into two distinct types, type 1 diabetes and type 2 diabetes. In type 1 diabetes, the pancreas produces little or no insulin, the hormone that helps blood sugar enter the cells, where it is used to produce energy. It is usually diagnosed in people below the age of 30 years, but it can develop at any stage of life. Symptoms of type 1 diabetes include increased thirst and more frequent urination (Dua, Doyle, & Pistikopoulos, 2006). Type 2 diabetes is very common in middle-aged and elderly people. Obesity, dyslipidemia, hypertension and arteriosclerosis are closely associated with the onset of this chronic disease (Islam, Qaraqe, Belhaouari, & Abdul-Ghani, 2020). Type 1 is less common than type 2 diabetes; approximately 5-10% of diabetes patients have type 1. No cure for diabetes has yet been found. It can only be controlled and regulated through healthy habits.

Type 1 Diabetes Mellitus (T1DM), also known as insulin-dependent diabetes mellitus, accounts for only 5-10% of all diabetes mellitus cases. T1DM is an autoimmune disorder that results in a deficiency of insulin in the body and, in due time, hyperglycemia. T1DM is also strongly influenced by environmental as well as genetic factors (Banday, Sameer, & Nissar, 2020).
Type 2 Diabetes Mellitus (T2DM), also called non-insulin-dependent diabetes mellitus, constitutes around 90-95% of all diabetes mellitus cases. It is characterized by insulin resistance and β-cell dysfunction. T2DM is linked to increasing age, family history of diabetes, physical inactivity, obesity, adoption of modern lifestyles, and conditions such as hypertension and dyslipidemia (Genuth, Palmer, & Nathan, 2018).

ML and Artificial Intelligence (AI) have been around for a while now. ML is a very efficient approach for analyzing data in scientific as well as clinical studies. ML techniques are being used to classify individuals