Research Article
Poisson Mixture Regression Models for Heart Disease Prediction
Chipo Mufudza and Hamza Erol
Statistics Department, Cukurova University, 01330 Adana, Turkey
Correspondence should be addressed to Chipo Mufudza; chipmuf@gmail.com
Received 20 April 2016; Accepted 20 October 2016
Academic Editor: David A. Winkler
Copyright © 2016 C. Mufudza and H. Erol. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
Early heart disease control can be achieved through high efficiency in disease prediction and diagnosis. This paper focuses on the use
of model-based clustering techniques to predict and diagnose heart disease via Poisson mixture regression models. The analysis and
application of Poisson mixture regression models are addressed here under two different classes: standard and concomitant variable
mixture regression models. Results show that a two-component concomitant variable Poisson mixture regression model predicts
heart disease better than both the standard Poisson mixture regression model and the ordinary general linear Poisson regression
model, owing to its lower Bayesian Information Criterion value. Furthermore, a Zero-Inflated Poisson mixture regression model turned
out to be the best of all the models for heart disease prediction, as it both clusters individuals into a high- or low-risk category and predicts
the heart disease rate componentwise given the available clusters. It is deduced that heart disease prediction can be done effectively by
identifying the major risks componentwise using a Poisson mixture regression model.
1. Introduction
Heart disease encompasses a number of conditions that affect
the heart, not just heart attacks. These may include
functional problems of the heart, such as heart-valve abnormalities
or irregular heart rhythms, together with risk factors such as high
blood pressure (BP), smoking, diet, and cholesterol. Problems like these can lead to
heart failure, arrhythmia, and a host of other problems. The
work in [1] claims that heart disease has become the leading
cause of death globally, accounting for 17.3 million deaths per year and
killing 1 in 7 people in the US alone. Therefore, effective and efficient
automated heart disease prediction systems can be beneficial
to both the patient and the cardiologist. Although there has been
increasing interest in heart disease problems, especially with
the use of data mining techniques and algorithms, most
studies have concentrated on a supervised classification approach
using different classifiers and algorithms.
In a comparative study, Bagirov et al. [2]
showed that it is possible to classify heart disease problems
using either supervised or unsupervised classification.
Supervised classification of patient status by data
mining algorithms to predict heart disease has been explored
by various authors in machine learning. H. D. Masethe and
M. A. Masethe [3] found that the predictive accuracy of
J48, compared with REPTREE and CART, was reliable for heart disease
prediction in South Africa. Weighted fuzzy models based on
supporting have also been studied and analysed, showing an
improvement over the network-based models [4]. Supervised
classification algorithms can improve the efficiency of
cardiologists, as shown by Taneja [5] on PGI data, where over
7300 observations were classified using J48 and Naive Bayes
in WEKA. In a general review of heart disease prediction using
data mining techniques, Kaur and Singh [6]
summarised that most studies identify cholesterol, lack of
exercise, obesity, and high blood pressure as the main risk
factors, whilst the best algorithms seem to be dominated by
decision trees.
Supervised classification predictions can also be
improved in some cases by incorporating unsupervised
classification techniques like clustering as a preprocessing
procedure. However, this may not always be the case: Soni et al. [7]
showed that decision trees can still outperform
Bayesian classification even when clustering is incorporated,
although they mentioned that both algorithms can be
improved by genetic algorithms. They also implemented
a combination of associative classification and genetic
algorithms in an effort to come up with the best system
Hindawi Publishing Corporation
Computational and Mathematical Methods in Medicine
Volume 2016, Article ID 4083089, 10 pages
http://dx.doi.org/10.1155/2016/4083089