Knowledge-Based Systems 210 (2020) 106455
A clinical coding recommender system
Mani Suleiman a,b,*, Haydar Demirhan a, Leanne Boyd c,d, Federico Girosi e,f, Vural Aksakalli a
a School of Science, Mathematical Sciences, RMIT University, Australia
b Rozetta Institute (formerly Capital Markets Cooperative Research Centre, CMCRC), Australia
c Cabrini Institute, Australia
d Eastern Health, Victoria, Australia
e Western Sydney University, Australia
f Digital Health CRC, Australia
article info
Article history:
Received 24 March 2020
Received in revised form 3 September 2020
Accepted 4 September 2020
Available online 17 September 2020
Keywords:
Health informatics
Bayesian networks
Clinical coding
Artificial intelligence
Data mining
Recommender systems
abstract
Clinical coding of hospital admissions can erroneously omit diagnosis and procedure codes. A consequence
of these omissions is that the condition and treatment of the patient are not fully captured
by the entered codes, which can then also impact hospital revenue. One way to prevent these errors
is through a real-time recommender system which suggests the addition of codes at the point of
coding when it appears they have been omitted. Association analysis uncovers patterns between codes,
forming a basis for coding recommendations. Combining association analysis with manual expert
validation produces more useful recommendations (we refer to this as the expert validated list), but
is labour-intensive. In this study, we propose an approach using Bayesian Networks to determine
the conditional relationships between codes. Performance is evaluated using a testing strategy which
simulates errors through the random removal of codes from episodes of patient care and counts how
many of the removed codes are recommended to coders by each recommender. Performance is also
based on how many recommended codes were not removed (superfluous recommendations) which
we seek to minimise. We develop a recommender system which generates 96% of the number of
correct recommendations produced by the expert validated list, while having 68% fewer superfluous
recommendations. Our proposed methodology provides a high performance recommender while
reducing dependence on labour-intensive effort by clinical coding experts.
© 2020 Elsevier B.V. All rights reserved.
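The testing strategy summarised in the abstract can be illustrated with a minimal sketch. It assumes each episode is a set of codes and that a recommender is any function mapping the codes still present to a set of suggested codes. The function names, the rule-based toy recommender, and the placeholder codes ("A", "B", ...) are hypothetical and are not taken from the paper.

```python
import random


def simulate_omissions(episode, n_remove, rng):
    """Randomly remove up to n_remove codes; return (remaining, removed)."""
    codes = sorted(episode)  # sorted so results are reproducible for a given seed
    removed = set(rng.sample(codes, min(n_remove, len(codes))))
    return set(codes) - removed, removed


def score_recommender(recommend, episodes, n_remove=1, seed=0):
    """Count correct and superfluous recommendations over all episodes."""
    rng = random.Random(seed)
    correct = 0
    superfluous = 0
    for episode in episodes:
        remaining, removed = simulate_omissions(episode, n_remove, rng)
        # Only codes not already present count as recommendations.
        suggested = set(recommend(remaining)) - remaining
        correct += len(suggested & removed)      # removed codes that were recovered
        superfluous += len(suggested - removed)  # suggestions that were never removed
    return correct, superfluous


def make_rule_recommender(rules):
    """Toy recommender (assumption): suggest target whenever trigger is present."""
    def recommend(present):
        return {target for trigger, target in rules
                if trigger in present and target not in present}
    return recommend


# Placeholder codes, not real ICD-10 codes.
toy_rules = [("A", "B"), ("B", "A"), ("C", "X"), ("D", "X")]
toy_episodes = [{"A", "B"}, {"C", "D"}]
toy_result = score_recommender(make_rule_recommender(toy_rules), toy_episodes)
```

In this toy setup the first episode always yields one correct recommendation and the second always yields one superfluous recommendation, whichever code happens to be removed. A better recommender is one that keeps the first count high while lowering the second.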
1. Introduction
1.1. Background
Clinical coding staff working for Health Services are responsible
for coding episodes of patient care. This involves the translation
of information from patient medical records into a series
of standardised codes which represent diagnoses and procedures.
These clinical codes are derived from the International Statistical
Classification of Diseases and Related Health Problems, currently
in its 10th revision (ICD-10) published by the World Health
Organisation [1].
* Corresponding author at: School of Science, Mathematical Sciences, RMIT University, Australia.
E-mail addresses: mani.suleiman@rmit.edu.au, manisuleiman.ds@gmail.com (M. Suleiman), haydar.demirhan@rmit.edu.au (H. Demirhan), lboyd@cabrini.com.au, leanne.boyd@easternhealth.org.au (L. Boyd), F.Girosi@westernsydney.edu.au (F. Girosi), vural.aksakalli@rmit.edu.au (V. Aksakalli).
Automated coding, also known as Computer-Assisted Coding
(CAC), is emerging as a new technology in health information
management. CAC processes clinical text from electronic health
records (EHRs) and automatically assigns codes. Studies have
shown that automated coding is not an error-free process and
its performance depends on case complexity [2]. Campbell and
Giadresco [3] found that while CAC technology can improve clinical
coding accuracy, human intervention will still be required,
particularly for quality control. Structured health data is still typically
encoded via manual coding. Almost all health providers in
Australia use manual coding from paper medical records. Hence,
our methodology is not designed to perform the role of an automated
coding system. We aim to provide a tool to support the
coding assignment carried out manually by coding professionals
through an analysis of historical patient data.
During the coding process, coders sometimes erroneously and
unintentionally omit codes. This can result, for example, from
incomplete reading of documents in the patient medical records,
such as the discharge summary. Errors can occur due to inexperience
and/or oversights caused by time pressure. According to an
https://doi.org/10.1016/j.knosys.2020.106455
0950-7051/© 2020 Elsevier B.V. All rights reserved.