International Journal of Computer Applications (0975 – 8887) Volume 60 – No. 12, December 2012

Comparative of Data Mining Classification Algorithm (CDMCA) in Diabetes Disease Prediction

V. Karthikeyani, PhD., Assistant Professor, Department of Computer Science, Thiruvalluvar Government Arts College, Rasipuram, India
I. Parvin Begum, Assistant Professor, Department of Computer Application, Soka Ikeda College of Arts and Science, Chennai-99, Tamilnadu, India
K. Tajudin, Assistant Professor, Department of Computer Science, The New College, Royapettah, Chennai-600014, Tamilnadu, India
I. Shahina Begam, Assistant Professor, Department of Computer Science, Ratankanwar Bhawarlal Gothi Jain College for Women, Chennai-600052, India

ABSTRACT
Data mining is an iterative process in which progress is defined by discovery, through either automatic or manual methods. This paper applies data mining concepts in CDMCA, which distinguishes two types of classification: supervised and unsupervised. It illustrates supervised data mining classification algorithms on a diabetes disease dataset, which covers cases in which plasma glucose reaches at least a stated value. The paper discusses the algorithms C4.5, SVM, K-NN, PNN, BLR, MLR, CRT, CS-CRT, PLS-DA and PLS-LDA. Their performance is compared in terms of computing time, precision value and the error rate obtained with 10-fold cross-validation; the error analysis considers true positives, true negatives, false positives, false negatives and accuracy. The CS-CRT algorithm gives the best outcome. The results were obtained with the Tanagra tool, a data mining software suite. Accuracy is calculated as the sum of true positives and true negatives divided by the total number of cases across all four outcomes.

Keywords
C4.5, SVM, K-NN, PNN, BLR, MLR, CRT, CS-CRT, PLS-DA, PLS-LDA, Classification based on CT, Precision value, CV error rate and Accuracy.

1.
INTRODUCTION
Data mining is significant and useful in medicine: despite differences and conflicts in approach, the health sector has a greater need for data mining today [1][15]. Several arguments can be advanced to support the use of data mining in the health sector: data overload; early detection and/or avoidance of diseases; evidence-based medicine and prevention of hospital errors; non-invasive diagnosis and decision support; policy-making in public health; and greater value for money and cost savings.

Tanagra is a powerful tool: besides supervised learning, it covers other paradigms such as clustering, meta supervised learning, supervised learning assessment, data visualization, statistics, and feature selection and construction algorithms. The main purpose of Tanagra is to give researchers and students easy-to-use data mining software that conforms to the present norms of software development in this domain and allows the analysis of either real or synthetic data. Tanagra can also be considered an educational tool for learning programming techniques [14].

Data overload: there is a wealth of knowledge to be gained from computerized health records [15], yet the vast bulk of data stored in these databases makes it exceptionally difficult to extract useful information.

Diabetes is not a new disease; it has been with the human race for a very long time, and the earliest known record of it dates to 1552 B.C. Since then, many Greek as well as French physicians have worked on it and made us aware of the nature of the disease, the organs responsible for it, and so on. In the 1870s, a French physician discovered a link between diabetes and dietary intake, which led to the idea of preparing individual diet plans. A diabetic diet including milk, oats and other fiber-containing foods was formulated in 1900-1915. The function and nature of insulin were discovered, and its use began, in 1920-1923 through the work of Dr. Banting, Prof.
Macleod and Dr. Collip; Banting and Macleod were awarded the Nobel Prize for this work. In the 1940s it was discovered that other organs, such as the kidneys and skin, are also affected when diabetes persists over a long period.

The main technical objective in KDD (knowledge discovery in databases) development is to design an architecture for data mining. In addition to the architecture, it is also intended to address process-related issues. It is assumed that executing the data mining technology is a processing-, memory- and data-intensive task, as opposed to one that requires continuous interaction with the database.

2. DATA ANALYSIS
The main methodology used for this paper was the analysis of journals and publications in the field of medicine, with a focus on more recent publications. The data studied consist of a diabetes dataset, which includes the name of each attribute along with its explanation.

The Indian Council of Medical Research–Indian Diabetes (ICMR-INDIAB) study provides data from three states and one Union Territory, representing nearly 18.1 percent of the nation's population. Extrapolating from these four units, the study concludes that 62.4 million people live with diabetes in India and that a further 77.2 million are on the threshold, with prediabetes. The study factored in anthropometric parameters such as body weight, BMI (Body Mass Index), and height and weight limits, and tested blood sugar, fasting and after a glucose load (participants with known diabetes exempted), as well as cholesterol for all participants.
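
The accuracy measure described in the abstract (true positives plus true negatives, divided by all four outcomes) can be illustrated with a short sketch. This is not the Tanagra implementation used in the paper; the function names `confusion_counts` and `accuracy`, the 1/0 label encoding, and the example labels are assumptions made here for illustration only.

```python
from typing import Sequence, Tuple


def confusion_counts(actual: Sequence[int],
                     predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, TN, FP, FN) for binary labels.

    Assumed encoding (not from the paper): 1 = diabetic, 0 = non-diabetic.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no cases to evaluate")
    return (tp + tn) / total


# Hypothetical labels for illustration; they are not taken from the paper's dataset.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

tp, tn, fp, fn = confusion_counts(actual, predicted)
acc = accuracy(tp, tn, fp, fn)
error_rate = 1.0 - acc  # share of misclassified cases
print(f"TP={tp} TN={tn} FP={fp} FN={fn} accuracy={acc:.2f} error={error_rate:.2f}")
```

Under 10-fold cross-validation, the same counts would be gathered on each held-out fold and the resulting error rates averaged. How Tanagra aggregates them internally is not stated in the text.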