DOI: https://doi.org/10.53350/pjmhs2022161177
ORIGINAL ARTICLE
P J M H S Vol. 16, No. 11, November, 2022

Prediction of the Outcome of Pakistani Heart Failure Patients by Various Supervised Machine Learning Methods

SANA SAEED¹, MAHAM FAHEEM², KANWAL SALEEM³, NIMRA ISHAQ⁴
¹Assistant Professor, College of Statistical and Actuarial Sciences, University of the Punjab
²Lecturer, College of Statistical and Actuarial Sciences, University of the Punjab
³,⁴College of Statistical and Actuarial Sciences, University of the Punjab
Correspondence to Sana Saeed, Email: sana.stat@pu.edu.pk

ABSTRACT
Aim: To predict the outcome of heart failure (HF) in Pakistani patients from potential predictors using various machine learning (ML) methods.
Study design: Secondary data on Pakistani patients, collected in a cross-sectional analytical study, were taken from the UCI repository.
Place and duration: The data were collected from April to December 2015 at the Institute of Cardiology and Allied Hospital, Faisalabad, Pakistan.
Methodology: The data set consisted of 299 patients, 194 male and 105 female. Age, serum sodium (SS), serum creatinine (SC), gender, smoking, high blood pressure (HBP), ejection fraction (EF), anemia, platelets, creatinine phosphokinase (CPK), and diabetes were considered as potential predictors of the outcome of HF. The data set was analyzed with various ML predictive models, including logistic regression (LR), k-nearest neighbor (KNN), and decision trees (DT).
Results: The mean age of the patients was 60.833 ± 11.894 years. Out of 299 patients, 129 were anemic, 105 had HBP, and 96 had a smoking history. A statistical model was estimated by applying LR, which helped identify the significant predictors.
The sensitivity of LR was 92.1%, and this model correctly predicted the outcome for 85.6% of HF patients, whereas DT achieved a prediction accuracy of 89.6%.
Conclusion: HF is a substantial cause of death in Pakistan, so the identification of its potential risk factors and its accurate prediction with modern tools are in high demand. This study applied ML tools to this task and concluded that, among all the fitted ML models, DT predicted the outcome of HF patients most accurately.
Keywords: Heart failure, machine learning, logistic regression, k-nearest neighbor, decision trees

INTRODUCTION
Heart attacks, strokes, and heart failure (HF) are types of cardiovascular diseases (CADs)¹. HF, or congestive HF, is a condition that occurs when the heart muscle cannot pump enough blood. Shortness of breath occurs in this condition because fluid accumulates in the lungs. HF does not mean that the heart stops working completely; rather, the heart becomes stiff and thickened². Mortality due to this ailment is approximately 25% in advanced nations and 80% in emerging nations³. People in the Asian subcontinent are highly prone to CADs, which cause many deaths⁴. However, in emerging nations, females are more at risk than males⁵. Several factors can add to the complications of HF, such as aging, smoking, and high blood pressure. Research suggests that certain characteristics such as gender, age, and spousal relationship may be associated with a higher risk of CHD⁶. The interaction of these modifiable factors strongly increases the risk of HF with negative outcomes. In the UK, around £980 million per year is spent on the management of HF, and the World Bank estimated its global cost at $108 billion per annum⁷. HF is generally regarded as an ailment of the elderly. However, recent studies indicate that the burden of HF among adults may be growing⁸.
The prominent role of behavioral risk factors in the development of heart disease is also evident. For example, cigarette smoking, diabetes, hyperlipidemia, and hypertension play a vital part in the development of heart ailments⁹. The medical history of patients also plays an important part in predicting the outcome of HF. Likewise, HF affects patients psychologically once it develops. Bivol and Grib (2019) suggested that patients with renal damage experienced this ailment differently than controls¹⁰. This could give rise to sadness, anxiety, and reduced vitality related to heart disorders and renal dysfunction as well. Clinically, HF can be divided into two groups depending on the EF value¹¹⁻¹⁵.

Received on 25-06-2022
Accepted on 15-10-2022

In most developing nations, low education, unemployment, and many other factors contribute to a low quality of life. Hence, certain ailments, including heart disease, are much more common among aged patients with poor mental health¹¹. The Pakistan Demographic Survey reported that cardiac ailments, comprising heart attacks and HF, alone were responsible for 14.74% (221,100) of deaths in the country, whereas strokes accounted for 6.45% (96,750). However, according to the WHO, 29% of all deaths in Pakistan were due to cardiovascular disease, which comprises both heart disease and stroke. The WHO also noted that, in Pakistan, preventable diseases now cause more deaths than infectious diseases such as COVID-19, pneumonia, and others¹⁶. A study conducted in Pakistan¹⁷ showed that the overall incidence of cardiac disease was 6.2% and that women older than 30 years had a considerably higher risk of heart attack than men. The occurrence of stroke was also higher among women than among men.
These findings suggest that the prevalence of heart disease in the country is higher in women than in men.

HF data can be analyzed quantitatively by numerous methods. Among them, ML procedures are at the forefront because of their appeal and effectiveness. Given the significant role of HF in causing deaths nationwide, accurate prediction of HF based on significant predictors is highly desirable. Hence, the significant predictors are screened first, and the prediction is then carried out with ML methods comprising LR, KNN, and DT.

METHODOLOGY
The data set for this study was obtained from the UCI ML repository¹⁸.
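
The fitted models in this study are compared by their sensitivity and overall prediction accuracy. As a minimal sketch of how these two measures follow from a confusion matrix, the Python snippet below may help. It assumes that the positive class is the adverse outcome; the function names and the counts are hypothetical illustrations, not the study's code or results.

```python
def sensitivity(tp, fn):
    """Share of actual positive cases that the model predicts as positive."""
    return tp / (tp + fn)


def accuracy(tp, tn, fp, fn):
    """Share of all cases (positive and negative) predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical confusion-matrix counts, chosen only for illustration.
tp, fn = 35, 5    # positives predicted correctly / missed
tn, fp = 50, 10   # negatives predicted correctly / falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.3f}")       # 35 / 40  = 0.875
print(f"accuracy    = {accuracy(tp, tn, fp, fn):.3f}")  # 85 / 100 = 0.850
```

A model can reach high sensitivity while its accuracy stays lower if it produces many false positives, which is why the two measures are reported separately.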