Please cite this article in press as: Sarker A, et al. Automatic evidence quality prediction to support evidence-based decision making. Artif Intell Med (2015), http://dx.doi.org/10.1016/j.artmed.2015.04.001

Artificial Intelligence in Medicine xxx (2015) xxx–xxx
Journal homepage: www.elsevier.com/locate/aiim

Automatic evidence quality prediction to support evidence-based decision making

Abeed Sarker a, Diego Mollá a, Cécile Paris b

a Department of Computing, Macquarie University, Sydney, NSW 2109, Australia
b Commonwealth Scientific and Industrial Research Organisation, Crn Vimiera and Pembroke Roads, Marsfield, NSW 2122, Australia

Article info

Article history: Received 3 July 2014; Received in revised form 31 March 2015; Accepted 15 April 2015

Keywords: Automatic text classification; Automatic medical evidence classification; Decision support system; Medical natural language processing; Evidence-based medicine

Abstract

Background: Evidence-based medicine practice requires practitioners to obtain the best available medical evidence, and appraise the quality of the evidence when making clinical decisions. Primarily due to the plethora of electronically available data from the medical literature, the manual appraisal of the quality of evidence is a time-consuming process. We present a fully automatic approach for predicting the quality of medical evidence in order to aid practitioners at point-of-care.

Methods: Our approach extracts relevant information from medical article abstracts and utilises data from a specialised corpus to apply supervised machine learning for the prediction of the quality grades.
Following an in-depth analysis of the usefulness of features (e.g., publication types of articles), the features are extracted from the text via rule-based approaches and from the meta-data associated with the articles, and then applied in the supervised classification model. We propose the use of a highly scalable and portable approach using a sequence of high precision classifiers, and introduce a simple evaluation metric called average error distance (AED) that simplifies the comparison of systems. We also perform elaborate human evaluations to compare the performance of our system against human judgments.

Results: We test and evaluate our approaches on a publicly available, specialised, annotated corpus containing 1132 evidence-based recommendations. Our rule-based approach performs exceptionally well at the automatic extraction of publication types of articles, with F-scores of up to 0.99 for high-quality publication types. For evidence quality classification, our approach obtains an accuracy of 63.84% and an AED of 0.271. The human evaluations show that the performance of our system, in terms of AED and accuracy, is comparable to the performance of humans on the same data.

Conclusions: The experiments suggest that our structured text classification framework achieves evaluation results comparable to those of human performance. Our overall classification approach and evaluation technique are also highly portable and can be used for various evidence grading scales.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Evidence-based medicine (EBM) is a practice that requires medical practitioners to obtain the best quality clinical evidence from published research when answering clinical queries, in addition to using their own expertise. It has been described as the conscientious, explicit, and judicious use of current best evidence in making decisions about the care of individual patients [1].
Corresponding author at: Department of Biomedical Informatics, Arizona State University, 13212 East Shea Boulevard, Scottsdale, AZ 85259, USA. Tel.: +1 480 884 0349. E-mail address: abeed.sarker@asu.edu (A. Sarker).

To use the best available medical evidence for solving patients’ problems, practitioners are required to perform a number of steps, including searching for evidence, selecting the best available evidence, extracting relevant information, and appraising the quality of the extracted evidence in the light of the patients’ problems. Currently, evidence-based answer generation is a manual process and, primarily due to the plethora of electronically available medical documents, practitioners generally face the problem of information overload. Research has shown that practitioners often fail to pursue evidence-based answers to their clinical queries, particularly at point-of-care, due to time constraints [2]. The time associated with seeking and appraising information is largely considered to be the biggest obstacle in EBM practice [3–10]. As such, approaches that can extract relevant information from medical text, and utilise it to automatically perform some of the tasks associated with