Science Journal of Public Health 2013; 1(1): 39-43
Published online March 10, 2013 (http://www.sciencepublishinggroup.com/j/sjph)
doi: 10.11648/j.sjph.20130101.16

Application of Artificial Neural Network and Binary Logistic Regression in Detection of Diabetes Status

Azizur Rahman 1, Karimon Nesha 2, Mariam Akter 2, Md. Sheikh Giash Uddin 1

1 Department of Statistics, Jagannath University, Dhaka-1100, Bangladesh
2 Department of Disaster Management, University of Dhaka
2 School of Business, United International University, Dhaka, Bangladesh

Email address: rahman.aziz83@gmail.com (A. Rahman)

To cite this article: Azizur Rahman, Karimon Nesha, Mariam Akter, Md. Sheikh Giash Uddin. Application of Artificial Neural Network and Binary Logistic Regression in Detection of Diabetes Status, Science Journal of Public Health, Vol. 1, No. 1, 2013, pp. 39-43. doi: 10.11648/j.sjph.20130101.16

Abstract: Various methods can be applied to build predictive models for clinical data with binary outcome variables. This research aims to compare and explore the process of constructing common predictive models. Models based on an artificial neural network (the multilayer perceptron) and binary logistic regression were applied and compared in their ability to classify disease-free subjects and those with diabetes mellitus (DM) diagnosed by glucose level. Demographic, anthropometric and clinical data were collected from a total of 460 participants aged over 30 years from six villages in Bangladesh that were identified as mainly dependent on wells contaminated with arsenic. Of the 460 participants, 133 (28.91%) suffered from DM, 116 (25.27%) had impaired glucose tolerance (IGT) and the remaining 211 (45.86%) were disease free. Among other factors, family history of diabetes and arsenic exposure were found to be significant risk factors for developing DM, with higher odds ratios.
This study shows that binary logistic regression correctly classified 73.79% of cases with IGT or DM in the training datasets, 70.96% in the testing datasets and 70.4% of all subjects. In contrast, the sensitivities of the artificial neural network architecture for the training and testing datasets and for all subjects were 83.4%, 82.25% and 84.33% respectively, indicating better performance than the binary logistic regression model.

Keywords: Artificial Neural Network (ANN), Binary Logistic Regression (LR), Classification, Diabetes Mellitus (DM)

1. Introduction

Diabetes mellitus is a heterogeneous syndrome characterized by elevated blood glucose level. Most of the causes of diabetes mellitus are still unknown. However, impaired insulin secretion from the pancreas, or impaired insulin action as a result of insulin resistance in the skeletal muscle, liver and adipose tissue, has been noted in diabetic patients [21]. Genetic disposition and environmental factors are important in the development of diabetes mellitus [22]. Among the environmental factors, recent studies identify arsenic contamination of well water as one of the most important.

Statistical methods such as discriminant analysis and logistic regression have commonly been used to develop models for clinical diagnosis and treatment [5]. However, studies published in recent years have reported that the artificial neural network approach improves prediction in several situations, including prognosis of breast cancer in women after surgery [17], modeling of surgical decision-making for patients with traumatic brain injury [5] and survival of alcoholic patients with severe liver disease [16]. In contrast, others have reported that artificial neural networks and statistical models yielded similar results [9, 18].
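Comparisons of this kind are usually reported through classification measures such as sensitivity, the share of true cases that a model flags correctly. The following is a minimal Python sketch of that calculation, not the study's own procedure: the labels are made up for illustration, and the names `sensitivity`, `model_a` and `model_b` are hypothetical.

```python
def sensitivity(y_true, y_pred):
    """Share of actual positives (label 1) that were predicted positive."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    if actual_pos == 0:
        raise ValueError("no positive cases in y_true")
    return true_pos / actual_pos


# Made-up labels for illustration only: 1 = IGT or DM, 0 = disease free.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
model_a = [1, 0, 1, 1, 0, 1, 0, 0]  # hypothetical predictions from one model
model_b = [1, 1, 1, 1, 0, 0, 1, 0]  # hypothetical predictions from another

sens_a = sensitivity(y_true, model_a)
sens_b = sensitivity(y_true, model_b)
print(f"model A sensitivity: {sens_a:.2f}")
print(f"model B sensitivity: {sens_b:.2f}")
```

Sensitivity counts only the subjects who truly have the condition, so a model can score well on it while still misclassifying many disease-free subjects. It is therefore normally read alongside specificity or overall accuracy.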
Artificial intelligence has been proposed as a reasoning tool to support clinical decision-making since the earliest days of computing [3, 4, 5, 6, 7]. An artificial neural network is a computer modeling technique based on the observed behaviors of biological neurons [8]. It is a non-parametric pattern recognition method that can recognize hidden patterns between independent and dependent variables [9]. This approach is discussed in detail in the methods and materials section.

In Bangladesh, an estimated 30-70 million people living in 41 of the 64 districts are probably exposed to arsenic from drinking water containing >50 µg/L of arsenic over a long period [19]. The exposure probably started in the late 1960s, when drilling of tube wells began as part of a wide irrigation plan [20]. In another study, reference [19]