I.J. Information Technology and Computer Science, 2016, 11, 26-32
Published Online November 2016 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijitcs.2016.11.04
Copyright © 2016 MECS

A Tool for Diabetes Prediction and Monitoring Using Data Mining Technique

S. R. Priyanka Shetty
Nitte Meenakshi Institute of Technology/Department of CSE, Bangalore, 560064, India
E-mail: siddamshettypriya@gmail.com

Sujata Joshi
Nitte Meenakshi Institute of Technology/Department of CSE, Bangalore, 560064, India
E-mail: sujata_msrp@yahoo.com

Abstract: Data mining is the process of analyzing different aspects of data and aggregating them into useful information. Classification is a data mining task commonly used in medical data mining. The goal here is to discover new and useful patterns that give users meaningful information about diabetes. A diabetes prediction and monitoring system is designed and implemented using the ID3 classification algorithm. The symptoms associated with diabetes are identified and supplied to the prediction model, which makes the prediction on that basis. The monitoring module analyzes the laboratory test reports of the patient's blood sugar levels and provides appropriate awareness messages to the patient through mail and a bar chart.

Index Terms: Data mining, Classification, Decision tree, ID3, Diabetes dataset, Prediction.

I. INTRODUCTION

A. Data mining

Data mining is the process of extracting hidden knowledge from large volumes of raw data. It is an analytical process designed to explore data in search of consistent patterns and to find systematic relationships between variables. Its application areas include education, market basket analysis, customer relationship management, banking, sports and health care.
In recent years medical data mining has become prominent, since an enormous amount of medical data is available for discovering useful patterns. Data mining techniques such as classification, clustering, association and outlier analysis help in finding useful patterns in this data. Data mining has great potential for the healthcare industry, since it helps health systems to analyze medical data and to offer improved healthcare at reduced cost. When applied to health care, data mining techniques play a significant role in the prediction and diagnosis of health problems such as heart disease, diabetes, cancer, skin disease and many more.

B. Classification

Classification is one of the fundamental tasks of data mining. It is used to predict the group membership of a data instance. Classification is applied in areas such as weather prediction, medical diagnosis and scientific experiments, and it is commonly used in medical data mining. The classification techniques generally used are decision trees, Bayesian classifiers, random forest, random tree, classification by back-propagation and rule-based classifiers.

Classification is performed in two steps:

Model construction: In this step the prediction model is built using an appropriate algorithm.
Model usage: In this step the prediction model is applied to actual data and the prediction is made accordingly.

C. Decision Tree

A decision tree is a commonly used classification technique in data mining. The decision tree classifier is built in a top-down manner starting from the root node, and involves partitioning the data into subsets that contain instances with similar values. Decision analysis helps to visualize and explicitly represent decisions, and the classification tree helps in decision making. The algorithm creates a model that predicts the value of a target variable based on several input variables.
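The top-down construction described above, together with the two classification steps (model construction and model usage), can be sketched in Python as follows. This is a minimal illustration of an ID3-style tree that selects each split by information gain; it is not the implementation used in this work. The attribute names (thirst, fatigue), the class labels and the four records are invented toy data, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by partitioning the rows on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(p) / len(labels) * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


def build_tree(rows, labels, attributes):
    """Model construction: grow the tree top-down, starting at the root."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attributes:
        return majority  # leaf: pure subset, or no attribute left to split on
    # ID3 splits on the attribute with the highest information gain.
    best = max(attributes, key=lambda a: information_gain(rows, labels, a))
    remaining = [a for a in attributes if a != best]
    branches = {}
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        branches[value] = build_tree(
            [r for r, _ in subset], [l for _, l in subset], remaining
        )
    return {"attribute": best, "branches": branches, "default": majority}


def predict(tree, row):
    """Model usage: follow the matching branches until a leaf is reached."""
    while isinstance(tree, dict):
        tree = tree["branches"].get(row[tree["attribute"]], tree["default"])
    return tree


# Invented toy records, for illustration only (not the paper's dataset).
rows = [
    {"thirst": "yes", "fatigue": "yes"},
    {"thirst": "yes", "fatigue": "no"},
    {"thirst": "no", "fatigue": "yes"},
    {"thirst": "no", "fatigue": "no"},
]
labels = ["positive", "positive", "negative", "negative"]

tree = build_tree(rows, labels, ["thirst", "fatigue"])
result = predict(tree, {"thirst": "yes", "fatigue": "no"})
print(result)  # prints: positive
```

In this toy data the thirst attribute separates the two classes perfectly, so it yields the larger information gain and becomes the root node. An unseen attribute value at prediction time falls back to the majority class stored at that node, which is one simple design choice among several.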
Real-world applications of decision trees are found in the fields of medicine, agriculture, financial analysis, biometric engineering, plant disease and software development. The commonly used decision tree algorithms are ID3, C4.5 and CART. The decision tree algorithm is widely used because it is simple to understand and can handle both numeric and categorical data. It is also robust and performs well with large datasets.

D. Diabetes

Diabetes mellitus (DM) is a chronic disease in which the person has high blood sugar levels. It is a lifelong condition that affects the body's ability to use the energy found in food. Once the body absorbs simple sugar (sucrose) it