Personal health indexing based on medical examinations: A data
mining approach
Ling Chen a,⁎, Xue Li a, Yi Yang f, Hanna Kurniawati a, Quan Z. Sheng b, Hsiao-Yun Hu c,e, Nicole Huang d,c
a School of Information Technology and Electrical Engineering, The University of Queensland, Brisbane, Australia
b School of Computer Science, The University of Adelaide, Adelaide, Australia
c Department of Education and Research, Taipei City Hospital, Taipei, Taiwan
d Institute of Hospital and Health Care Administration, National Yang-Ming University, Taipei, Taiwan
e Institute of Public Health and Department of Public Health, National Yang-Ming University, Taipei, Taiwan
f The Centre for Quantum Computation & Intelligent Systems, University of Technology Sydney, Australia
Article info
Article history:
Received 5 February 2015
Received in revised form 23 June 2015
Accepted 29 October 2015
Available online 6 November 2015
Keywords: Personal health index; Geriatric medical examination; Label uncertainty; Data mining; Feature extraction
Abstract
We design a method called MyPHI that predicts the personal health index (PHI), a new evidence-based health
indicator to explore the underlying patterns of a large collection of geriatric medical examination (GME) records
using data mining techniques. We define PHI as a vector of scores, each reflecting the health risk in a particular
disease category. PHI prediction is formulated as an optimization problem that finds the optimal soft labels
as health scores based on medical records that are infrequent, incomplete, and sparse. Our method is compared
with classification models commonly used in medical applications. The experimental evaluation demonstrates
the effectiveness of our method on a real-world GME data set collected from 102,258 participants.
© 2015 Elsevier B.V. All rights reserved.
1. Introduction
Modern societies have experienced dramatic growth in their elderly
populations since the beginning of this century. This implies increasing
healthcare needs and government expenditure. For example, the U.S.
government spent $414.3 billion on elderly health care in 2011, $100 billion
more than the inflation-adjusted expenses in 2001 [1]. Annual geriatric
medical examination (GME) is now an integral part of elderly healthcare
in many developed countries. For instance, Australia [2], the United Kingdom
[3], and Taiwan [4] have GME programs to periodically monitor the health
status of senior residents. However, it is difficult for healthcare
professionals to provide an overall report on personal health after a
comprehensive medical check-up involving hundreds of parameters.
Moreover, the richness of GME records, such as correlations amongst test
results, their longitudinal progression, and their relationships to other
participants that have similar patterns of health development, is often
left unexplored. In fact, such exploration is infeasible to perform
manually, because the complexity of the combined effects grows
exponentially with the number of different test results, the number of
available longitudinal records, and the total number of participants.
We design a method called MyPHI that predicts the personal health
index (PHI), a new health indicator to explore underlying patterns of
a large collection of GME records using data mining techniques. We
define PHI as a vector of scores, each of which is a complement probability
defined based on the health-related risks associated with a particular
disease category. Since the highest health risk is health-related death,
we explore the health-related main Cause of Death (COD) information
linked to the GME participants. Based on this definition, the higher the
scores, the healthier the person. It is our belief that medical decision
support systems are used to support clinical professionals rather than
to replace them. The primary goal of the proposed MyPHI is therefore to
draw their attention to participants with high risks.
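As a minimal illustration of the definition above, the sketch below treats each PHI score as one minus a risk probability for a disease category. The category names, the risk values, and the helper `phi_from_risks` are hypothetical and are used only to show the complement relationship; the way MyPHI actually estimates these risks is not represented here.

```python
from typing import Dict


def phi_from_risks(risks: Dict[str, float]) -> Dict[str, float]:
    """Map per-category risk probabilities to PHI scores (1 - risk)."""
    phi = {}
    for category, risk in risks.items():
        # A risk is a probability, so it must lie in [0, 1].
        if not 0.0 <= risk <= 1.0:
            raise ValueError(f"risk for {category!r} must lie in [0, 1]")
        # Complement probability: higher score means lower risk.
        phi[category] = 1.0 - risk
    return phi


# Hypothetical risk estimates for one participant (illustrative values only).
example_risks = {"cardiovascular": 0.25, "cancer": 0.5, "respiratory": 0.0}
example_phi = phi_from_risks(example_risks)
```

Under this reading, the category with the lowest score is the one in which the participant is at greatest risk, which is consistent with the stated goal of directing clinicians' attention to high-risk participants.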
To the best of our knowledge, this work is the first of its kind in
predicting personal health scores by mining large medical examination
data. PHI provides an important benchmark for understanding the health
status of elderly people. In particular, the following parties can
benefit from PHI:
• Governments: Public health policies are often made and revised based
on scientific evidence from statistical analysis and research outputs
[5]. For example, a community health index can help in understanding
regional health status [6]. Public health authorities can use PHI to
gauge their decisions on population health policies by utilizing the
aggregated PHI of individuals. Particularly, the impact of a policy on
Decision Support Systems 81 (2016) 54–65
⁎ Corresponding author.
E-mail addresses: l.chen5@uq.edu.au (L. Chen), xueli@itee.uq.edu.au (X. Li),
yi.yang@uts.edu.au (Y. Yang), hannakur@uq.edu.au (H. Kurniawati),
qsheng@cs.adelaide.edu.au (Q.Z. Sheng), A3547@tpech.gov.tw (H.-Y. Hu),
syhuang@ym.edu.tw (N. Huang).
http://dx.doi.org/10.1016/j.dss.2015.10.008
0167-9236/© 2015 Elsevier B.V. All rights reserved.