Advanced Machine Learning Techniques for Optimizing the Prognosis of
Diabetes Mellitus: a Detailed Examination of Hospital Data
Data and Metadata. 2024; 3:.363
doi: 10.56294/dm2024.363
ORIGINAL
Advanced Ensemble Machine Learning Techniques for Optimizing Diabetes Mellitus
Prognostication: a Detailed Examination of Hospital Data
Najah Al-shanableh¹, Mazen Alzyoud¹, Raya Yousef Al-husban², Nail M. Alshanableh³, Ashraf Al-Oun¹, Mohammad Subhi Al-Batah⁴, Mowafaq Salem Alzboono⁴
ABSTRACT
Diabetes is a chronic disease that affects millions of people worldwide. Early diagnosis and effective
management are crucial for reducing its complications. Diabetes is the fourth-leading cause of mortality, owing
to its association with various comorbidities, including heart disease, nerve damage, blood vessel damage,
and blindness. Machine learning algorithms show significant potential for predicting diabetes and related
conditions, and mining diabetes data is an efficient method for extracting new insights.
The primary objective of this study is to develop an enhanced ensemble model that predicts diabetes with
improved accuracy by leveraging several machine learning algorithms.
This study tested several popular machine learning algorithms commonly used in diabetes prediction,
including Naive Bayes (NB), Generalized Linear Model (GLM), Logistic Regression (LR), Fast Large Margin
(FLM), Deep Learning (DL), Decision Tree (DT), Random Forest (RF), Gradient Boosted Trees (GBT), and
Support Vector Machine (SVM). The performance of these algorithms was compared, and two different
ensemble techniques—stacking and voting—were used to build a more accurate predictive model.
The top three algorithms by accuracy were Deep Learning, Naive Bayes, and Gradient Boosted Trees.
The models indicated that the number of chronic conditions, gender, and age are significant factors for
individuals with diabetes. The ensemble models, particularly the stacking method, provided higher accuracy
than the individual algorithms. The stacking ensemble model achieved a slightly better accuracy of 99.94%,
compared with 99.34% for the voting method.
Building an ensemble model significantly increased the accuracy of predicting diabetes and related
conditions. The stacking ensemble model, in particular, demonstrated superior performance, highlighting
the importance of combining multiple machine learning approaches to enhance predictive accuracy.
Keywords: Machine Learning Algorithms; Ensemble Models; Diabetes Prediction; Data Mining; Predictive
Accuracy; Health Informatics.
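The two ensemble strategies named in the abstract, voting and stacking, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which software the authors used, the data are synthetic (the features age, number of chronic conditions, and gender are borrowed from the abstract, but the labelling rule is invented), and three one-feature threshold classifiers stand in for the nine algorithms the study compared. The stacking variant shown trains the meta-learner on a separate holdout split, which is sometimes called blending, rather than on cross-validated predictions.

```python
import math
import random
from collections import Counter

def make_synthetic(n, seed=0):
    """Toy rows of (age, chronic_conditions, gender) with an invented label rule."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        age, chronic, gender = rng.uniform(20, 90), rng.randint(0, 6), rng.randint(0, 1)
        score = 0.04 * (age - 55) + 0.6 * (chronic - 3) + 0.3 * gender + rng.gauss(0, 0.5)
        rows.append((age, chronic, gender))
        labels.append(1 if score > 0 else 0)
    return rows, labels

def fit_stump(rows, labels, feature):
    """Base learner: one-feature threshold rule with the best training accuracy."""
    best = (-1.0, None, None)  # (accuracy, threshold, label predicted at or above it)
    for t in sorted({r[feature] for r in rows}):
        for pol in (1, 0):
            preds = [pol if r[feature] >= t else 1 - pol for r in rows]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if acc > best[0]:
                best = (acc, t, pol)
    _, t, pol = best
    return lambda r: pol if r[feature] >= t else 1 - pol

def hard_vote(models, row):
    """Voting ensemble: return the label predicted by most base models."""
    return Counter(m(row) for m in models).most_common(1)[0][0]

def fit_meta(inputs, labels, epochs=200, lr=0.1):
    """Logistic-regression meta-learner trained by SGD; the last weight is the bias."""
    w = [0.0] * (len(inputs[0]) + 1)
    for _ in range(epochs):
        for x, y in zip(inputs, labels):
            z = w[-1] + sum(wi * xi for wi, xi in zip(w, x))
            p = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                w[i] += lr * (y - p) * xi
            w[-1] += lr * (y - p)
    return w

def stack_predict(models, weights, row):
    """Stacking ensemble: the meta-learner combines the base models' predictions."""
    z = weights[-1] + sum(w * m(row) for w, m in zip(weights, models))
    return 1 if z > 0 else 0

def accuracy(predict, rows, labels):
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(labels)

rows, labels = make_synthetic(600)
base_rows, base_y = rows[:300], labels[:300]        # trains the base learners
meta_rows, meta_y = rows[300:450], labels[300:450]  # trains the meta-learner
test_rows, test_y = rows[450:], labels[450:]        # held out for evaluation

models = [fit_stump(base_rows, base_y, f) for f in range(3)]
meta_w = fit_meta([[m(r) for m in models] for r in meta_rows], meta_y)

vote_acc = accuracy(lambda r: hard_vote(models, r), test_rows, test_y)
stack_acc = accuracy(lambda r: stack_predict(models, meta_w, r), test_rows, test_y)
print(f"voting accuracy:   {vote_acc:.3f}")
print(f"stacking accuracy: {stack_acc:.3f}")
```

The structural difference is that voting weights every base model equally, whereas stacking learns from data how much to trust each one. The accuracies printed for this toy data say nothing about the figures reported in the study.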
RESUMEN
Diabetes is a chronic disease that affects millions of people worldwide. Early diagnosis
and effective treatment are crucial for reducing its complications. Diabetes is the fourth
© 2024; The authors. This is an open access article distributed under the terms of a Creative Commons license
(https://creativecommons.org/licenses/by/4.0) that permits use, distribution, and reproduction in any medium, provided the original work
is properly cited.
¹ Computer Science Department, Al al-Bayt University. Mafraq, Jordan.
² Faculty of Nursing, Zarqa University. Zarqa, Jordan.
³ Vascular Surgery Unit, Jordanian Royal Medical Services.
⁴ Department of Computer Science, Faculty of Science and Information Technology, Jadara University. 21110 Irbid, Jordan.
Cite as: Al-shanableh N, Alzyoud M, Al-husban RY, Alshanableh NM, Al-Oun A, Al-Batah MS, et al. Advanced Ensemble Machine Learning
Techniques for Optimizing Diabetes Mellitus Prognostication: a Detailed Examination of Hospital Data. Data and Metadata. 2024; 3:.363.
https://doi.org/10.56294/dm2024.363
Submitted: 21-01-2024 Revised: 29-05-2024 Accepted: 09-09-2024 Published: 10-09-2024
Editor: Adrián Alejandro Vitón Castillo
Corresponding Author: Raya Yousef Al-husban