International Journal of Engineering and Advanced Technology (IJEAT), ISSN: 2249-8958, Volume-9 Issue-2, December 2019. Published By: Blue Eyes Intelligence Engineering & Sciences Publication. Retrieval Number: B3473129219/2019©BEIESP. DOI: 10.35940/ijeat.B3473.129219

Abstract: The World Health Organization's (WHO) 2018 report on diabetes states that the number of people with diabetes has risen from 108 million in 1980 to 422 million. The fact sheet shows that the prevalence of diabetes among adults (over 18 years of age) has risen from 4.7% to 8.5%. Major health hazards caused by diabetes include kidney failure, heart disease, blindness, stroke, and lower limb amputation. This article applies supervised machine learning algorithms to the Pima Indian Diabetic dataset to explore risk patterns using predictive models. The predictive models are built with the following supervised machine learning algorithms: Naïve Bayes, Decision Tree, Random Forest, Gradient Boosted Tree, and Tree Ensemble. The analytical patterns obtained from these predictive models are then presented in terms of performance parameters including accuracy, precision, recall, and F-measure.

Keywords: Machine Learning, Supervised Learning, Classification, Bio-informatics, Data Mining

I. INTRODUCTION

Diabetes has become one of the most common diseases. Type 2 diabetes is usually reported in middle-aged or elderly people; in the recent past, however, cases have also been reported in children. The pancreas is responsible for producing insulin in the body. Diabetes occurs when the pancreas does not produce the required amount of insulin or when the body cannot use the insulin it produces effectively. The resulting hyperglycemia, and the severe health hazards it leads to, make diabetes a major global concern [1].
Hyperglycemia is one of the major causes of diabetic retinopathy, cardiac stroke, foot ulcers, nephropathy, and neuropathy. Hence, early or timely detection of diabetes through analytics has become of utmost importance for improving patients' quality of life and life expectancy [2-3].

Revised Manuscript Received on December 15, 2019.
* Correspondence Author
Kalpna Guleria, Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India. kalpna@chitkara.edu.in
Devendra Prasad*, Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India. devendra.prasad@chitkara.edu.in
Virender Kadyan, Chitkara University Institute of Engineering and Technology, Chitkara University, Punjab, India. varinder.kadyan@chitkara.edu.in

Recent technological developments in engineering and the sciences relate to various machine learning applications, including speech recognition and natural language processing (NLP), computer vision (facial recognition, pattern recognition, character recognition), Google's self-driving cars, recommender systems (Amazon's product recommendations, Netflix, YouTube), stock market, housing, finance, and real estate prediction, web search engine optimization, photo tagging, spam classification, and the biomedical and healthcare sector. Major applications of machine learning in bioinformatics include risk assessment and prediction of cardiac attack, cancer classification, nephropathic analytics, and neuropathic risk assessment [4-5].

Machine learning is a science of experiential learning: it draws analytics from past experience and improves the performance of a system through predictive modelling [6]. The main aim of bioinformatics in medical science is to draw correct and concise analytics from medical information. However, a large number of unnecessary tests may complicate the diagnostic process and its results.
Hence, machine learning can be used to resolve this difficulty by means of various classification algorithms [7]. Machine learning is a branch of Artificial Intelligence that builds predictive models to draw various statistical analytics. Fig. 1 shows the steps involved in developing a predictive model.

Fig. 1. Experiential learning and Predictive Model Building Process

Detection of Diabetic Patterns using Supervised Learning
Kalpna Guleria, Devendra Prasad, Virender Kadyan

The process of learning and predictive model building starts with raw data collection. Data preprocessing focuses on data cleaning (removal of inconsistent and noisy data) and data integration (combining different sources of data). A data set may contain objects whose values do not relate to the other values in the data set or that are dissimilar to the general behavioral characteristics of the data.
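The preprocessing steps described here (data integration, data cleaning, and spotting objects that deviate from the rest of the data) can be illustrated with a minimal sketch. It is not the authors' pipeline: the field names `patient_id`, `glucose`, and `bmi`, the sample values, the treatment of zero or missing readings as invalid, and the interquartile-range rule with `k = 1.5` are all assumptions made for illustration.

```python
from statistics import quantiles

def integrate(primary, secondary, key="patient_id"):
    """Data integration: combine two sources on a shared key (inner join)."""
    lookup = {row[key]: row for row in secondary}
    return [{**row, **lookup[row[key]]} for row in primary if row[key] in lookup]

def clean(records, required=("glucose", "bmi")):
    """Data cleaning: drop records whose required values are missing or not positive.

    Treating a zero reading as invalid is an assumption of this sketch.
    """
    return [row for row in records
            if all(row.get(f) is not None and row[f] > 0 for f in required)]

def flag_outliers(records, field, k=1.5):
    """Return records whose value lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles([row[field] for row in records], n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [row for row in records if not low <= row[field] <= high]

# Two hypothetical raw sources describing the same patients.
lab = [{"patient_id": i, "glucose": g}
       for i, g in enumerate([148, 85, 0, 89, None, 116, 137, 600], start=1)]
clinic = [{"patient_id": i, "bmi": b}
          for i, b in enumerate([33.6, 26.6, 23.3, 28.1, 43.1, 25.6, 31.0, 35.3, 30.5],
                                start=1)]

merged = integrate(lab, clinic)            # patient 9 has no lab record and is dropped
cleaned = clean(merged)                    # patients 3 and 5 have invalid glucose values
outliers = flag_outliers(cleaned, "glucose")

print([row["patient_id"] for row in cleaned])
print([row["patient_id"] for row in outliers])
```

In this sketch a flagged record is only reported, not deleted, since an unusual value may be a genuine measurement that deserves review rather than noise.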