International Journal of Scientific and Research Publications, Volume 11, Issue 1, January 2021 339
ISSN 2250-3153
This publication is licensed under Creative Commons Attribution CC BY. http://dx.doi.org/10.29322/IJSRP.11.01.2021.p10936 www.ijsrp.org

Comparative Analysis of Machine Learning Algorithms for Heart Disease Prediction

Isreal Ufumaka *
* Computer Science, University of Benin.

DOI: 10.29322/IJSRP.11.01.2021.p10936

Abstract- Machine learning has become popular, and many of its algorithms are now commonly used in data science projects across industries, especially in the health care sector. Machine learning methods can assist researchers and medical professionals in the early detection of diseases such as heart disease, which is a major cause of death worldwide. Machine learning algorithms are excellent at learning from data, and since healthcare providers generate huge amounts of data on a daily basis, these algorithms can thrive in this field. In this study, a comparative analytical approach was taken to determine which algorithm performs best under the given conditions. Experiments were carried out using 5-fold and 10-fold cross validation to ensure that the resulting models generalize well. The study uses data from the University of California, Irvine (UCI) machine learning repository, containing 303 instances with 14 attributes. The collected data is scaled using the Min-Max normalization technique. Models are built on the scaled data using popular supervised machine learning classification algorithms: Support Vector Machine (SVM), K-Nearest Neighbor (KNN), Logistic Regression (LR), Naïve Bayes (NB), Random Forest (RF), and the Gradient Boosting ensemble method. These algorithms are evaluated using standard performance metrics such as precision, recall, and F1-score.
From the experiments carried out, it can be concluded that SVM outperforms the other algorithms.

Index Terms- Classification Algorithms, Gradient Boosting, Logistic Regression (LR), Machine Learning, Support Vector Machine (SVM).

I. INTRODUCTION

The human heart is a vital organ of the human body. It can be seen as a mechanical device that circulates oxygen-rich blood to other organs such as the brain, kidneys, and lungs. The heart works day and night to ensure that other organs receive their fair share of oxygen-rich blood, and a disruption in its activity affects the proper functioning of those organs, which can be fatal. Heart disease, also known as cardiovascular disease, is in most cases a life-threatening medical condition that arises when the heart cannot perform its circulatory duties well enough. Examples include coronary, rheumatic, and congenital heart disease, among a host of others that plague both the developed and the developing world. The World Health Organization (WHO) estimated that heart disease was the leading cause of death, with 7 million lives lost in 2015, more than 75% of them in developing countries. The same estimate projects that about 23.6 million people will die of heart disease in 2030 [12]. Many people have unhealthy habits such as poor diet, tobacco smoking, and heavy drinking, which, together with stress and anxiety, can lead to the development of heart disease. Early detection is key to reducing the risk of heart disease, although heart disease has been difficult to diagnose [7]. Because hospitals and clinics generate a large pool of biomedical data, better decision making based on the information available from these health care providers could help improve disease prediction.
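As a rough illustration of the workflow outlined in the abstract (Min-Max scaling, k-fold cross validation, and scoring with precision, recall, and F1), the following pure-Python sketch may help. It is not the study's code: the toy records, the helper names, and the choice of a small K-Nearest Neighbor classifier as the example model are all assumptions made for illustration.

```python
"""Sketch: Min-Max scaling, k-fold cross validation, precision/recall/F1.

Pure-Python illustration on made-up data; not the study's code or dataset.
"""
import random


def min_max_scale(rows):
    """Rescale every column of `rows` to [0, 1] via (x - min) / (max - min)."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    # A constant column has zero range; divide by 1.0 so it maps to 0.0.
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    return [[(v - lo) / sp for v, lo, sp in zip(r, lows, spans)] for r in rows]


def k_fold_indices(n, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k roughly equal shuffled folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    nearest = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)),
    )[:k]
    votes = sum(train_y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def precision_recall_f1(y_true, y_pred):
    """Binary metrics with class 1 (disease present) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cross_validate(rows, labels, folds):
    """Pool out-of-fold KNN predictions and score them once."""
    scaled = min_max_scale(rows)  # scaled up front, as the abstract describes
    y_pred = [None] * len(rows)
    for train, test in k_fold_indices(len(rows), folds):
        train_x = [scaled[i] for i in train]
        train_y = [labels[i] for i in train]
        for i in test:
            y_pred[i] = knn_predict(train_x, train_y, scaled[i])
    return precision_recall_f1(labels, y_pred)


if __name__ == "__main__":
    # Hypothetical toy records: [age, resting blood pressure] -> 0/1 label.
    rng = random.Random(42)
    rows, labels = [], []
    for _ in range(60):
        sick = rng.random() < 0.5
        age = rng.gauss(60 if sick else 45, 5)
        pressure = rng.gauss(150 if sick else 120, 10)
        rows.append([age, pressure])
        labels.append(int(sick))
    for folds in (5, 10):
        p, r, f = cross_validate(rows, labels, folds)
        print(f"{folds}-fold  precision={p:.2f}  recall={r:.2f}  F1={f:.2f}")
```

The sketch scales the whole data set before splitting, following the order in which the abstract lists the steps. Fitting the scaling on the training folds only is a common alternative that keeps test-fold information out of the preprocessing. Any of the other classifiers named in the abstract could take the place of `knn_predict` in `cross_validate`.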
Machine learning provides various computer-aided means of predicting the likelihood of heart disease from heterogeneous medical data. Machine learning is the modern science of getting computers to act without being explicitly programmed. Various machine learning techniques have proven useful in the prediction and treatment of diseases such as Alzheimer's disease, hepatitis, and diabetes, with a high level of accuracy. Machine learning also offers a range of architectural approaches for manipulating data in search of insight. Various machine learning algorithms, such as Support Vector Machine (SVM), Naïve Bayes (NB), Decision Tree (DT), and Neural Network (NN), can be used for classification and regression problems, either individually or in combination. T