(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 12, No. 3, 2021

Using Machine Learning Technologies to Classify and Predict Heart Disease

Mohammed F. Alrifaie 1, Zakir Hussain Ahmed 2, Asaad Shakir Hameed 3, Modhi Lafta Mutar 4

Computer Engineering Department, Faculty of Engineering, Karabuk University, Karabuk, Turkey 1
Department of Information and Communications, Basra University College of Science and Technology, Basrah, Iraq 1
Department of Mathematics and Statistics, College of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Kingdom of Saudi Arabia 2
Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, Durian Tunggal, Melaka, Malaysia 3
Department of Mathematics, General Directorate of Thi-Qar Education, Ministry of Education, Thi-Qar, Iraq 3
Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Hang Tuah Jaya, Durian Tunggal, Melaka, Malaysia 4
Department of Mathematics, General Directorate of Thi-Qar Education, Ministry of Education, Thi-Qar, Iraq 4

Abstract—Data mining techniques are widely used in the healthcare sector to predict and diagnose various diseases, and the diagnosis of heart disease is one of their most important applications. Data is now collected in such large amounts that people need to rely on machines to process it. In recent years, heart disease has increased sharply and has become one of the deadliest diseases in many countries. Most datasets suffer from extreme values that reduce classification accuracy. Here, extreme values cover irrelevant data, missing values, and incorrect values in the dataset.
Data transformation is another important preprocessing step. It converts the data into a form suitable for mining through aggregation and filtering methods, such as eliminating redundant features by using correlation and one of the wrapper methods, and applying recursive feature elimination. Missing values are handled through the "Remove with values" methods and through estimation methods. Classification methods such as Naïve Bayes (NB) and Random Forest (RF) are applied both to the original datasets and to the datasets after feature selection. All of these operations are carried out on three different heart disease datasets to analyze the effect of preprocessing on accuracy.

Keywords—Classification; Naive Bayes (NB); Support Vector Machine (SVM); Random Forest; machine learning

I. INTRODUCTION

Heart disease is now one of the major causes of death. A heart disease prediction system can support healthcare specialists in predicting a heart condition from the clinical data of patients that has been entered into the system beforehand. Several healthcare companies and hospitals gather massive amounts of patient data that are hard to handle with current systems [5]. Many tools that use prediction algorithms are available; nonetheless, they have several weaknesses [15,16], and many of them cannot deal with large data. Many algorithms can be used to detect and predict heart disease, such as the discrete differential evolution (DDE) algorithm [17]. Machine learning algorithms play an important role in extracting hidden knowledge and information from these datasets and analyzing it, and they improve speed and accuracy. Data mining techniques have been used in many areas, including healthcare. This paper aims to check whether the prediction of heart disease can rely on data mining and machine learning [9].
By using data mining techniques, prediction helps detect whether a patient suffers from heart disease. In addition, prediction helps specialists reach the appropriate diagnosis more quickly and increases the accuracy of the diagnosis, leading to better results that may help to reduce heart attacks. With the help of data mining and soft computing techniques, hidden relationships can be untangled and diseases can be diagnosed efficiently [7,8].

The datasets are collected from the UCI Machine Learning Repository, which currently maintains 394 datasets as a service to the machine learning community. The heart disease data has 14 attributes, which include sex, age, chest pain type, resting blood pressure, resting electrocardiographic results, fasting blood sugar > 120 mg/dl, serum cholesterol in mg/dl, exercise induced angina, maximum heart rate achieved, the slope of the peak exercise ST segment, oldpeak (ST depression induced by exercise relative to rest), number of major vessels (0-3) colored by fluoroscopy, and thal (3 = normal; 6 = fixed defect; 7 = reversible defect). The heart disease data set contains three databases: Cleveland, Hungary, and Switzerland.

In this paper, we analyze the heart disease data by using correlation and one of the wrapper methods, and by applying recursive feature elimination. Missing values are handled through the "Remove with values" methods and through estimation methods.

The rest of this paper is organized as follows. We first review the literature to analyze previous studies on classification and the algorithms used in this area. We then discuss our methodology, elaborating the procedure of the work and the application of the algorithms. In the results section, we present and discuss the obtained results. Finally, we summarize our work and outline future work in the conclusion section.
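To make the preprocessing steps named in this introduction more concrete, the following is a minimal Python sketch under stated assumptions, not the implementation used in this paper. It shows two ways of handling missing values (dropping incomplete records, and filling gaps with an estimate) and a simple correlation filter that discards a duplicate feature. The toy records, the column names, the helper function names, the use of the column mean as the estimate, and the 0.95 correlation threshold are all assumptions made for illustration.

```python
from statistics import mean

# Toy records invented for illustration only (not taken from the UCI data).
# None marks a missing value; "chol_scaled" is a deliberately redundant
# copy of "chol" in different units.
NAMES = ["age", "trestbps", "chol", "chol_scaled"]
ROWS = [
    [63, 145, 233, 2.33],
    [67, 160, 286, 2.86],
    [67, 120, None, None],
    [37, 130, 250, 2.50],
    [41, None, 204, 2.04],
    [56, 120, 236, 2.36],
]


def drop_incomplete(rows):
    """Strategy 1: discard every record that contains a missing value."""
    return [row for row in rows if None not in row]


def impute_mean(rows):
    """Strategy 2 (assumed estimator): fill each gap with its column mean."""
    columns = list(zip(*rows))
    means = [mean(v for v in col if v is not None) for col in columns]
    return [
        [means[j] if v is None else v for j, v in enumerate(row)]
        for row in rows
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear relationship
    return cov / (sx * sy)


def select_uncorrelated(rows, names, threshold=0.95):
    """Keep a column only if it is not strongly correlated with a kept one."""
    columns = list(zip(*rows))
    kept = []
    for j in range(len(columns)):
        if all(abs(pearson(columns[j], columns[k])) < threshold for k in kept):
            kept.append(j)
    return [names[j] for j in kept]


complete = drop_incomplete(ROWS)   # fewer records, no estimated values
imputed = impute_mean(ROWS)        # all records kept, gaps estimated
selected = select_uncorrelated(imputed, NAMES)

print(len(ROWS), len(complete), len(imputed))
print(selected)
```

Dropping records keeps only observed values but shrinks the data, while imputation keeps every record at the cost of introducing estimates. Comparing classifier accuracy on both variants is one way to study the effect of preprocessing.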