(IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 14, No. 10, 2023

A Comprehensive Comparative Study of Machine Learning Methods for Chronic Kidney Disease Classification: Decision Tree, Support Vector Machine, and Naive Bayes

Admi Syarif, Olivia Desti Riana, Dewi Asiah Shofiana, Akmal Junaidi
Department of Computer Science-Faculty of Mathematics and Natural Sciences, University of Lampung, Indonesia

Abstract—According to the 2010 Global Burden of Disease analysis, Chronic Kidney Disease (CKD) rose in the global ranking of causes of death from 27th place in 1990 to 18th. Approximately 10 percent of the global population has CKD, and millions of lives are lost every year because of limited access to adequate treatment. CKD is a substantial global health concern that greatly affects both the well-being and the life span of those who have it. This study evaluates the performance of three major classification algorithms in CKD diagnosis: Decision Tree, Support Vector Machine (SVM), and Naïve Bayes. It differs from previous studies through an innovative data preprocessing approach. Preprocessing involved transforming categorical values into numerical form using label encoding and applying Exploratory Data Analysis (EDA) to identify outliers and test data assumptions. In addition, missing values were handled with appropriate strategies to maintain the integrity of the dataset. The classification methods were evaluated using a dataset of 400 samples with 24 attributes obtained from Kaggle. Through careful experimentation, the accuracy of each algorithm is presented and compared. The results of this study can support the development of a more efficient and accurate decision support system for the early diagnosis of CKD.
Keywords—Chronic kidney disease (CKD); classification; decision tree; machine learning; naïve bayes; support vector machine (SVM)

I. INTRODUCTION

The kidneys, a pair of bean-shaped organs, are located in the posterior part of the abdomen and play a major role in maintaining the body's internal balance. Their functions include filtering and purifying the blood, eliminating excess fluid and metabolic waste through the formation of urine, and regulating electrolyte balance, blood pressure, and the production of hormones that influence the formation of red blood cells. The central role of the kidneys in maintaining this balance also supports the optimal performance of other organs [1].

The prevalence of CKD continues to increase globally and has become a serious health problem. According to the 2010 Global Burden of Disease study, CKD rose to 18th among the world's leading causes of death, up from 27th in 1990. More than two million individuals worldwide undergo dialysis therapy or kidney transplantation, although this number represents only about 10% of the population requiring such treatment. About ten percent of the global population suffers from CKD, and millions of lives are lost each year due to limited access to adequate treatment [2].

Chronic Kidney Disease (CKD) refers to a decline in kidney function that occurs slowly over months or even years [3]. Decreased kidney function can result in the accumulation of fluids, electrolytes, and metabolic waste in the body, which in turn causes various health problems. In the early stages, CKD often causes no noticeable symptoms, but patients may experience kidney pain once the disease reaches an advanced stage [4]. Chronic kidney failure is progressive and cannot be cured, resulting in a high mortality rate. One of the problems faced by patients with CKD is the high cost of treatment and medication.
Therefore, early detection is crucial to identify kidney disease at an early stage and prevent the development of chronic kidney disorders [5].

Machine learning has become popular in healthcare because of the demand for efficient analytical methods that can uncover important yet undiscovered information in health data [6]. Medical data mining is used to gain insights by reviewing information obtained from medical reports, evidence tables, flowcharts, research papers, and other sources. This data is then transformed into relevant information to support decision-making [7]. Machine learning is a field concerned with the creation of statistical models and algorithms that enable computer systems to perform tasks without explicit instructions, relying instead on patterns and inference. With machine learning algorithms, computer systems can process large amounts of historical data and recognize patterns within that data, which allows them to make more accurate predictions from input data.

In this research, three machine learning classification methods are employed: Decision Tree, Support Vector Machine, and Naïve Bayes. The difference from previous studies lies in the preprocessing stage, where several processing techniques are applied to the dataset. One of them is data transformation, in which invalid values in categorical data are replaced and categorical values are converted to integers using label encoding. Furthermore, Exploratory Data Analysis (EDA) is conducted, employing descriptive statistics and visual tools to gain a deeper understanding of the data. The