International Journal of Information Technology (IJIT), Volume 3 Issue 3, May - Jun 2017
ISSN: 2454-5414    www.ijitjournal.org

Comparison of Classification Algorithms in Diabetic Dataset

J. Anitha [1], Dr. A. Pethalakshmi [2]
M.Phil Scholar [1], Associate Professor and Head [2]
Department of Computer Science, M.V. Muthiah Government Arts College for Women, Dindigul, Tamil Nadu - India

I. INTRODUCTION

Data mining: Data mining is the process of discovering interesting patterns and knowledge from large amounts of data [3]. It is a knowledge discovery process for analyzing large datasets that automatically extracts previously unknown, hidden and meaningful patterns from large-scale databases [9]. A physician has to analyze many factors before diagnosing diabetes, which makes the physician's job difficult. Recently, many methods and algorithms have been used to mine biomedical datasets for hidden information, including neural networks (NNs), decision trees (DT), fuzzy logic systems, Naive Bayes, SVM and so on. These algorithms reduce the time spent processing symptoms and producing diagnoses, while also making the diagnoses more precise.

Diabetes: Diabetes is a major health problem in most countries. India ranks third among all countries in this regard. Diabetes is a condition in which the body is unable to produce the amount of insulin needed to regulate the amount of sugar in the body. Insulin is the principal hormone that regulates the uptake of glucose from the blood into most cells (muscle, fat cells). If the amount of insulin available is insufficient, glucose cannot have its usual effect because it is not absorbed by the body cells that require it. WHO reports state that almost one-third of the women who suffer from diabetes are unaware of it [1]. The common symptoms in diabetic patients are frequent urination, increased thirst, weight loss, slow wound healing, giddiness, increased hunger, etc.
Types of Diabetes

Type I: Also called insulin-dependent diabetes, it usually appears before the age of 30 and is caused by a lack or deficiency of insulin. The majority of these cases occur in children. In persons with type I diabetes, the beta cells of the pancreas, which are responsible for insulin production, are destroyed by an autoimmune response.

Type II: Also called non-insulin-dependent diabetes, it usually occurs after 40 years of age. Its causes include overweight, obesity, lack of physical activity, poor diet and family history.

Gestational diabetes: This is the third main form and occurs when pregnant women without a previous history of diabetes develop a high blood glucose level [7].

Diabetes affects human organs such as the kidneys, eyes, heart, nerves and feet. Type I and Type II diabetes cannot be cured, but they can be controlled and treated with special diets, regular exercise and insulin injections.

The paper is organized as follows: Section II describes the related works. Section III deals with the methodology of the two algorithms. Section IV discusses the results of the two algorithms, and Section V concludes the paper.

II. RELATED WORKS

Aiswarya Iyer et al. [1] employed the decision tree (J48) and Naive Bayes algorithms for predicting diabetes. They used the Pima Indian Diabetes dataset, and the work was implemented using the WEKA tool. They found that the Naive Bayes algorithm gave 79.56% accuracy, higher than the other algorithm, for predicting diabetes. V. Anuja Kumari and R. Chitra [2] used SVM with a Radial Basis Function kernel for classification of diabetes

ABSTRACT

Data mining techniques have proved useful for the early prediction of disease with higher accuracy in order to save human life. Diabetes is one of the most common and rapidly increasing diseases in the world. Diabetes has affected over 246 million people worldwide, with the majority of them being women. According to the World Health Organization (WHO) report, this number is expected to rise to over 380 million by 2025.
In this paper, two classification algorithms, namely Naive Bayes and J48, are studied and applied to the diabetic dataset. The two algorithms are tested using the WEKA tool to compare their accuracy, running time and error rate.

Keywords:- Data mining, Diabetes, Dataset, Naive Bayes, J48.

RESEARCH ARTICLE    OPEN ACCESS
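
The abstract compares the two classifiers by accuracy, time and error rate. As a minimal sketch of how these three quantities can be computed for any classifier, the Python fragment below may help. It does not use WEKA and is not the authors' procedure: the function names, the four toy records and the glucose threshold rule are all hypothetical and serve only as illustration.

```python
import time
from typing import Callable, Dict, List, Sequence, Tuple


def accuracy(actual: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of instances whose predicted label equals the actual label."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def error_rate(actual: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of misclassified instances, i.e. 1 - accuracy."""
    return 1.0 - accuracy(actual, predicted)


def evaluate(
    classify: Callable[[Dict[str, float]], str],
    rows: List[Dict[str, float]],
    actual: Sequence[str],
) -> Tuple[float, float, float]:
    """Return (accuracy, error rate, seconds spent predicting)."""
    start = time.perf_counter()
    predicted = [classify(row) for row in rows]
    elapsed = time.perf_counter() - start
    return accuracy(actual, predicted), error_rate(actual, predicted), elapsed


# Hypothetical toy records (not taken from the paper's dataset).
rows = [{"glucose": 148.0}, {"glucose": 85.0}, {"glucose": 183.0}, {"glucose": 89.0}]
actual = ["positive", "negative", "positive", "positive"]


def glucose_rule(row: Dict[str, float]) -> str:
    """Hypothetical stand-in classifier: a single assumed glucose threshold."""
    return "positive" if row["glucose"] >= 140.0 else "negative"


acc, err, seconds = evaluate(glucose_rule, rows, actual)
print(f"accuracy={acc:.2f} error_rate={err:.2f} time={seconds:.6f}s")
```

On these toy records the stand-in rule classifies three of four instances correctly, giving an accuracy of 0.75 and an error rate of 0.25. Passing a different classifier function to `evaluate` with the same records yields a side-by-side comparison of the same three measures.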