International Journal of Engineering and Advanced Technology (IJEAT)
ISSN: 2249 – 8958, Volume-9 Issue-2, December, 2019
Published By:
Blue Eyes Intelligence Engineering
& Sciences Publication
Retrieval Number: B3473129219/2019©BEIESP
DOI: 10.35940/ijeat.B3473.129219
Abstract: The World Health Organization’s (WHO) 2018 report on
diabetes states that the number of diabetic cases has increased
from 108 million in 1980 to 422 million. The fact sheet shows a
major increase in the prevalence of diabetes, from 4.7% to 8.5%,
among adults (over 18 years of age). Major health complications
caused by diabetes include kidney failure, heart disease,
blindness, stroke, and lower limb amputation. This article
applies supervised machine learning algorithms to the Pima
Indian Diabetic dataset to explore risk patterns using
predictive models. The predictive models are built with five
supervised machine learning algorithms: Naïve Bayes, Decision
Tree, Random Forest, Gradient Boosted Tree, and Tree Ensemble.
Further, analytical patterns for these predictive models are
presented based on performance parameters that include accuracy,
precision, recall, and F-measure.
Keywords: Machine Learning, Supervised Learning,
Classification, Bio-informatics, Data Mining
I. INTRODUCTION
Nowadays, diabetes has become one of the most common
diseases. Cases of type 2 diabetes are usually reported in
middle-aged or elderly people. However, in the recent past,
cases of diabetes have also been reported in children. The
pancreas is responsible for the production of insulin in the
body. Diabetes occurs when the body is unable to use the insulin
it produces effectively or when the pancreas does not produce
the required amount of insulin. Either condition may lead to
hyperglycemia, and diabetes is therefore considered a major
global concern owing to its severe health hazards [1].
Hyperglycemia is one of the major causes of diabetic
retinopathy, cardiac stroke, foot ulcer, nephropathy, and
neuropathy. Hence, it has become of utmost importance to draw
analytics for the early or on-time detection of diabetes, in
order to improve patients’ quality of life and life expectancy
[2-3].
Recent technological developments in engineering and the
sciences relate to various machine learning applications, which
include speech recognition or natural language processing
(NLP), computer vision (facial recognition, pattern
recognition, character recognition), Google’s self-driving
cars, recommender systems (Amazon’s product recommendations,
Netflix, YouTube), stock market/housing/finance/real estate
predictions, web search engine optimization, photo tagging,
spam classification, and the biomedical/healthcare sector. Major
applications of machine learning in bioinformatics include risk
assessment and prediction of cardiac attack, cancer
classification, nephropathic analytics, and neuropathic risk
assessment [4-5].
Revised Manuscript Received on December 15, 2019.
* Correspondence Author
Kalpna Guleria, Chitkara University Institute of Engineering and
Technology, Chitkara University, Punjab, India. kalpna@chitkara.edu.in
Devendra Prasad*, Chitkara University Institute of Engineering and
Technology, Chitkara University, Punjab, India.
devendra.prasad@chitkara.edu.in
Virender Kadyan, Chitkara University Institute of Engineering and
Technology, Chitkara University, Punjab, India.
varinder.kadyan@chitkara.edu.in
Machine learning is a science of experiential learning: it
draws analytics from past experience and improves the
performance of a system through predictive modelling [6]. The
main aim of bioinformatics in medical science is to draw correct
and concise analytics from medical information. However, a
large number of unnecessary tests may complicate the diagnosis
process and its results. Machine learning can help resolve this
difficulty through various classification algorithms [7].
Machine learning is a branch of Artificial Intelligence that
builds predictive models to draw various statistical
analytics. Fig. 1 exhibits the steps involved in developing a
predictive model.
Fig. 1. Experiential learning and Predictive Model
Building Process
The process of learning and predictive model building starts
with raw data collection. Data preprocessing focuses on data
cleaning (removal of inconsistent and noisy data) and data
integration (combining different sources of data). A data set
may contain objects whose values do not relate to the other
values in the data set or that show dissimilarity with the
general behavioral characteristics of the data.
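The preprocessing steps described above can be illustrated with a minimal Python sketch. It is not the procedure used in this article: the two record sources, the field names, the rule that a non-positive measurement is inconsistent, and the median-absolute-deviation rule with a cutoff of 3.5 are all assumptions chosen only to show data integration, data cleaning, and the flagging of dissimilar values on made-up records.

```python
from statistics import median

# Hypothetical records from two sources, keyed by a made-up patient id.
clinic_a = [
    {"id": 1, "glucose": 150, "bmi": 33.0},
    {"id": 2, "glucose": 85, "bmi": 26.5},
    {"id": 3, "glucose": 0, "bmi": 23.0},    # inconsistent: glucose of 0
]
clinic_b = [
    {"id": 4, "glucose": 90, "bmi": 28.0},
    {"id": 5, "glucose": 135, "bmi": 41.0},
    {"id": 2, "glucose": 85, "bmi": 26.5},   # duplicate of a clinic_a record
    {"id": 6, "glucose": 600, "bmi": 31.0},  # far from the other values
]


def integrate(*sources):
    """Data integration: combine sources, keeping the first record per id."""
    merged = {}
    for source in sources:
        for record in source:
            merged.setdefault(record["id"], record)
    return list(merged.values())


def clean(records, fields=("glucose", "bmi")):
    """Data cleaning: drop records with missing or non-positive values
    (an assumed rule for what counts as inconsistent)."""
    return [
        r for r in records
        if all(r.get(f) is not None and r[f] > 0 for f in fields)
    ]


def flag_dissimilar(records, field, threshold=3.5):
    """Flag records whose value lies far from the median, measured with
    the median absolute deviation (an assumed rule, not the paper's)."""
    values = [r[field] for r in records]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []
    return [r for r in records if 0.6745 * abs(r[field] - med) / mad > threshold]


integrated = integrate(clinic_a, clinic_b)
cleaned = clean(integrated)
dissimilar = flag_dissimilar(cleaned, "glucose")
print([r["id"] for r in dissimilar])
```

On these made-up records, integration removes the duplicate entry, cleaning drops the record with a glucose value of 0, and the remaining record with a glucose value of 600 is flagged as dissimilar to the rest.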
Detection of Diabetic Patterns using Supervised
Learning
Kalpna Guleria, Devendra Prasad, Virender Kadyan