FOREX Publication
International Journal of Business & Management Research (IJBMR)
Multi-disciplinary | Open Access | ISSN: 2347-4696

Detecting Cardiovascular Disease using NBTree Algorithm

Oladunmoye EO 1* and Kemi Akute 2
1 Department of Guidance and Counselling, University of Ibadan, Nigeria
2 Department of Computer Science, University of Ibadan, Nigeria
*Corresponding Author: Oladunmoye EO, Department of Guidance and Counselling, University of Ibadan, Nigeria, E-mail: oladunmoyetomenoch@gmail.com

Abstract
Empirical reports have confirmed cardiovascular disease as the leading cause of mortality in the world. In an attempt to avert this, researchers have adopted various data mining algorithms to help health care professionals detect heart disease. Decision Tree (C4.5) is one of the successful data mining techniques, owing to its classification power. Discretization techniques, voting methods, and reduced error pruning are also known to produce more accurate decision trees. Naïve Bayes has likewise been used frequently because of its predictive power. This research, however, adopts a hybrid of decision tree and Naïve Bayes (NBTree) to achieve a more accurate and time-efficient diagnosis of cardiovascular disease. A widely used benchmark data set is used in this research. To evaluate the performance of the hybrid model, the sensitivity, specificity, and accuracy are calculated. The research proposes a model that outperforms C4.5 and Naïve Bayes in the diagnosis of cardiovascular disease among patients.

Keywords: Data Mining; Hybrid; Naïve Bayes; Decision Tree; Discretization; Cardiovascular Disease

Introduction
Heart disease has been the leading cause of death in the world over the past 10 years [1]. The term covers the class of diseases that involve the heart or blood vessels (arteries and veins).
Today most countries face high and growing rates of heart disease (cardiovascular disease), which is fast becoming a leading cause of debilitation and death worldwide in men and women over the age of sixty-five. In many countries heart disease is now viewed as a "second epidemic," replacing infectious diseases as the leading cause of death [2]. Even though modern medicine generates huge amounts of data every day, little has been done to use these data to address the challenges of interpreting heart disease examination results.

Cardiovascular disease (CVD) is a major health problem across the world. It is estimated that by 2030, deaths from CVD will rise from 17.5 million to 23.4 million, an approximate 37% increase from 2004 rates [3]. Given the increasing burden of CVD globally, and in developing countries in particular [1], it seems sensible to focus on preventing the development of risk factors among black adolescents in developing countries. This is particularly important because many developing countries are still battling infectious diseases and HIV [4] and cannot afford the widespread occurrence of CVD. Moreover, effective acute-care intervention for CVD is not readily available or affordable in developing countries [5]. Failure to prevent CVD risk factors among adolescents, which are promoted by urbanization and the adoption of westernized lifestyles [6], may result in a future adult CVD epidemic in developing countries that mirrors the current situation in many developed countries [7].

Motivated by the worldwide increase in mortality among heart disease patients each year, and by the availability of large amounts of patient data from which to extract useful knowledge, researchers have been using data mining techniques to help health care professionals in the diagnosis of heart disease [8].
In the past, many hospital information systems were designed to support patient billing, inventory management, and the generation of simple statistics. Some hospitals use decision support systems, but these are largely limited. They can answer simple queries such as “What is the average age of patients who have heart disease?”, “How many surgeries resulted in hospital stays longer than 10 days?”, and “Identify the female patients who are single, above 30 years old, and who have been treated for cancer.” However, they cannot answer complex queries such as “Given patient records, predict the probability of patients getting a heart disease.” Clinical decisions are often made on the basis of doctors’ intuition and experience rather than on the knowledge-rich data hidden in the database [9]. This practice leads to unwanted biases, errors, and excessive medical costs, which affect the quality of service provided to patients. It has been proposed that integrating clinical decision support with computer-based patient records could reduce medical errors, enhance patient safety, decrease unwanted practice variation, and improve patient outcomes [10]. Data mining is therefore used to generate knowledge-rich data, which improves the quality of clinical decisions.

Knowledge of the risk factors associated with heart disease helps health care professionals identify patients at high risk of having heart disease. Statistical analysis has identified the risk factors associated with heart disease as age, blood pressure, smoking habit [11], total cholesterol [9], diabetes [12], hypertension, family history of heart disease [13], obesity, and lack of physical activity [14].

Data mining in healthcare is an emerging field of high importance for providing prognosis and a deeper understanding of medical data. Data mining applications in healthcare include analysis of health care centers for better health policy-making and prevention of hospital errors, early detection, prevention of