Research Article
A Comprehensive Investigation of the Performances of Different
Machine Learning Classifiers with SMOTE-ENN Oversampling
Technique and Hyperparameter Optimization for Imbalanced
Heart Failure Dataset
Mirza Muntasir Nishat,1 Fahim Faisal,1 Ishrak Jahan Ratul,1 Abdullah Al-Monsur,1
Abrar Mohammad Ar-Rafi,1 Sarker Mohammad Nasrullah,2 Md Taslim Reza,1
and Md Rezaul Hoque Khan1
1 Islamic University of Technology, Gazipur, Bangladesh
2 North South University, Dhaka, Bangladesh
Correspondence should be addressed to Fahim Faisal; faisaleee@iut-dhaka.edu
Received 5 September 2021; Revised 17 December 2021; Accepted 26 January 2022; Published 9 March 2022
Academic Editor: Qianchuan Zhao
Copyright © 2022 Mirza Muntasir Nishat et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
Heart failure is a chronic cardiac condition characterized by reduced supply of blood to the body due to impaired contractile
properties of the muscles of the heart. Like other cardiac disorders, heart failure is a serious ailment that limits the activities and
shortens the lifespan of the patient and is often fatal. Predicting the survival of patients with heart
failure is the path to effective intervention and good prognosis in terms of both treatment and quality of life of the patient. Machine
learning techniques can be critical in this regard since they can be used to predict the survival of patients with heart failure in
advance, allowing patients to receive appropriate treatment. Hence, six supervised machine learning algorithms have been studied
and applied to analyze a dataset of 299 individuals from the UCI Machine Learning Repository and predict their survivability from
heart failure. Three distinct approaches have been followed using Decision Tree Classifier, Logistic Regression, Gaussian Naïve
Bayes, Random Forest Classifier, K-Nearest Neighbors, and Support Vector Machine algorithms. Data scaling has been performed
as a preprocessing step utilizing the standard and min–max scaling methods. Grid search cross-validation and random
search cross-validation techniques have been employed to optimize the hyperparameters. Additionally, the synthetic minority
oversampling technique and edited nearest neighbor (SMOTE-ENN) data resampling technique are utilized, and the performances
of all the approaches have been compared extensively. The experimental results clearly indicate that Random Forest
Classifier (RFC) surpasses all other approaches with a test accuracy of 90% when used in combination with SMOTE-ENN and
standard scaling technique. Therefore, this comprehensive investigation portrays a vivid visualization of the applicability and
compatibility of different machine learning algorithms in such an imbalanced dataset and presents the role of the SMOTE-ENN
algorithm and hyperparameter optimization for enhancing the performances of the machine learning algorithms.
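The resampling idea named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names, the neighbor count k = 3, the binary-label restriction, the choice to scale before resampling, and the toy data are all assumptions made for illustration. The sketch standard-scales the features, applies SMOTE (each synthetic minority point is interpolated between a minority sample and one of its nearest minority neighbors), and then applies edited nearest neighbors (a sample is dropped when its label disagrees with the majority of its nearest neighbors).

```python
import math
import random
from collections import Counter


def standard_scale(rows):
    """Scale each column to zero mean and unit variance (population std)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding idx itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(X, y, minority, k=3, rng=None):
    """Add synthetic minority rows until the two classes have equal counts.

    Assumes binary labels and at least two minority rows.
    """
    if rng is None:
        rng = random.Random(0)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_new = (len(y) - len(minority_rows)) - len(minority_rows)
    X_out, y_out = list(X), list(y)
    for _ in range(n_new):
        i = rng.randrange(len(minority_rows))
        j = rng.choice(nearest(i, minority_rows, k))
        gap = rng.random()  # position along the segment between the two rows
        X_out.append([a + gap * (b - a)
                      for a, b in zip(minority_rows[i], minority_rows[j])])
        y_out.append(minority)
    return X_out, y_out


def edited_nearest_neighbours(X, y, k=3):
    """Drop every row whose label disagrees with most of its k neighbours."""
    keep = []
    for i in range(len(X)):
        votes = Counter(y[j] for j in nearest(i, X, k))
        if votes.most_common(1)[0][0] == y[i]:
            keep.append(i)
    return [X[i] for i in keep], [y[i] for i in keep]


# Toy data, made up for illustration; this is not the paper's dataset.
rng = random.Random(42)
X = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
     + [[rng.gauss(3, 1), rng.gauss(3, 1)] for _ in range(6)])
y = [0] * 20 + [1] * 6

X_scaled = standard_scale(X)
X_bal, y_bal = smote(X_scaled, y, minority=1, rng=rng)
X_clean, y_clean = edited_nearest_neighbours(X_bal, y_bal)
print(Counter(y), Counter(y_bal), Counter(y_clean))
```

In practice an off-the-shelf implementation such as the `SMOTEENN` class in the imbalanced-learn package would normally be preferred over hand-written code, and resampling is usually applied to the training split only so that the test set keeps its original class distribution.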
1. Introduction
Heart failure (HF) refers to the condition when the heart
cannot pump adequate blood throughout the body. According
to the WHO, it has emerged as one of the most lethal and
debilitating diseases, claiming approximately 18 million lives
each year [1]. Chronic conditions such as weak or damaged
heart muscles result in a decreased ejection fraction, which
eventually results in heart failure. Heart failure can also cause
severe damage to the body’s other vital organs and can strike
both children and adults. Age, family history, genetics, lifestyle
habits, cardiovascular diseases (CVD), and race or ethnic origin
are the major risk factors for heart failure. It is equally prevalent
in men and women, but women develop it at a later age [2].
Hindawi
Scientific Programming
Volume 2022, Article ID 3649406, 17 pages
https://doi.org/10.1155/2022/3649406