© 2018, AJCSE. All Rights Reserved 27
RESEARCH ARTICLE
Performance Evaluation of Data Mining Algorithm on Electronic Health Record of
Diabetic Patients
Prakash Kuppuswamy¹, Rajan John², Shanmugasundaram Marappan³
¹Department of Computer Networks and Engineering, College of Computer Science and Information System,
Jazan University, Saudi Arabia, ²Department of Computer Science, College of Computer Science and
Information System, Jazan University, Saudi Arabia, ³Department of Computer Science, College of Computer
Science and Information System, Jazan University, Saudi Arabia
Received on: 10-09-2018; Revised on: 10-10-2018; Accepted on: 10-11-2018
ABSTRACT
Data mining is the process of finding interesting patterns in large databases. One of its important
application areas is the health-care sector, where it provides useful information not only to
health-care professionals but also to health insurance companies. Using these techniques, many
diseases can be predicted at an earlier stage, which improves patients' quality of life. In our
research work, we have collected an electronic health record database of diabetic patients. The
volume of health-care data increases every day, and data mining techniques can extract knowledge
from this enormous database efficiently. Many algorithms are available in data mining. We have used
classification algorithms such as One R, Zero R, J48, random forest, and linear discriminant
analysis. The performance of the classifiers is evaluated through the confusion matrix and in terms
of precision, recall, and error rate.
Key words: Classification algorithms, data mining, decision stump, diabetes, electronic health record,
linear discriminant analysis, One R, J48, Zero R
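
The evaluation measures named in the abstract (precision, recall, and error rate) are all derived from the counts in a confusion matrix. The following is a minimal sketch for the binary case, not the authors' implementation; the function name `binary_metrics` and the counts passed to it are hypothetical and are not results from this study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (precision, recall, error_rate) from binary confusion-matrix counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives, and true negatives. Ratios with a zero denominator
    are reported as 0.0 (an assumption made for this sketch).
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    error_rate = (fp + fn) / total if total else 0.0
    return precision, recall, error_rate


# Hypothetical counts for illustration only (not taken from the paper).
precision, recall, error_rate = binary_metrics(tp=40, fp=10, fn=5, tn=45)
print(f"precision={precision:.3f} recall={recall:.3f} error_rate={error_rate:.3f}")
```

Precision is the share of predicted positives that are correct, recall is the share of actual positives that are found, and the error rate is the share of all instances that are misclassified.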
INTRODUCTION
The health-care industry generates large amounts
of complex data such as patient histories, hospital
resources, electronic records, and information
about medical devices. These data serve as a key
resource to process and analyze for knowledge
extraction that enables decision-making and
saves cost. Research using data mining techniques
has been applied in the diagnosis of various
diseases such as cardiovascular diseases, AIDS,
asthma, and diabetes.[1] Diabetes is one of the
major health problems all over the world.[2]
Diabetes is a disease that occurs when insulin
production in the body is inadequate, or the body
is unable to use the insulin it produces properly;
as a result, blood glucose rises. The body breaks
food down into glucose, and this glucose needs to
be transported to all the cells of the body. Insulin
is the hormone that directs the glucose produced
by breaking down food into the body cells.
Any change in the production of insulin leads
to an increase in blood sugar levels, and this
can lead to damage to the tissues and failure of
the organs[3] such as kidney, eye, heart, nerves,
and foot.[5] In general, a person is considered to be
suffering from diabetes when blood sugar levels
are above normal (4.4–6.1 mmol/L).[3]

Address for correspondence:
Prakash Kuppuswamy
E-mail: prakashcnet@gmail.com

Diabetes mellitus is classified into four broad
categories: type 1, type 2, gestational diabetes,
and other specific types. All forms of diabetes
increase the risk of long-term complications.
These typically develop after many years but
may be the first symptom in those who have
otherwise not received a diagnosis before that
time.[2] The causes of diabetes are not yet entirely
understood; scientists believe that both genetic
factors and environmental triggers are involved.[4]
Diabetes can be controlled using
different measures such as insulin and diet. For
this, it should be identified as early as possible so
that appropriate treatment can be provided. Most
of the classifying, identifying and diagnosing
Available Online at www.ajcse.info
Asian Journal of Computer Science Engineering 2018;3(4):27-34
ISSN 2581 – 3781