Greg, ML: Automatic Diagnostic Suggestions
Humanity is Overrated. Or not.

Paola Lapadula 1, Giansalvatore Mecca 1, Donatello Santoro 1, Luisa Solimando 2, and Enzo Veltri 2

1 Università della Basilicata – Potenza, Italy
2 Svelto! Big Data Cleaning and Analytics – Potenza, Italy

(Discussion Paper)

Abstract. Recently, machine-learning techniques have been applied in a variety of fields. One of the most promising and challenging of these is the handling of medical records. In this paper we present Greg, ML, a machine-learning tool for generating automatic diagnostic suggestions based on patient profiles. At the core of our system there are two machine-learning classifiers: a natural-language module that handles reports of instrumental exams, and a profile classifier that outputs diagnostic suggestions to the doctor. After discussing the architecture, we present some experimental results based on the working prototype we have developed. Finally, we examine challenges and opportunities related to the use of this kind of tool in medicine, and some important lessons learned while developing the tool. In this respect, despite the ironic title of this paper, we underline that Greg should be conceived primarily as a support for expert doctors in their diagnostic decisions, and can hardly replace humans in their judgment.

1 Introduction

The growing availability of digital data related to all sectors of our everyday lives has created opportunities for data-based applications that would not have been conceivable a few years ago. One example is medicine: the push for the widespread adoption of electronic medical records [9, 5] and digital medical reports is paving the way for new applications based on these data.

Greg, ML [8] is one of these applications. It is a machine-learning tool for generating automatic diagnostic suggestions based on patient profiles.
In essence, Greg takes as input a digital profile of a patient, and suggests one or more diagnoses that, according to its internal models, fit the profile with a given probability. We assume that a doctor inspects these diagnostic suggestions and takes informed action regarding the patient.

Copyright © 2019 for the individual papers by the papers’ authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors. SEBD 2019, June 16-19, 2019, Castiglione della Pescaia, Italy.

We note that the idea of using machine learning for the purpose of examining medical data is not new [7, 11, 10]. In fact, several efforts have been made in this direction [1, 6]. To the best of our knowledge, however, all of the existing tools concentrate on rather specific learning tasks, for example, identifying a single pathology – like