Heatmap Template Generation for COVID-19 Biomarker Detection in Chest X-rays

Mirtha Lucas
College of Computing and Digital Media, DePaul University, Chicago, United States
mlucas3@mail.depaul.edu

Miguel Lerma
Department of Mathematics, Northwestern University, Evanston, United States
mlerma@math.northwestern.edu

Jacob Furst
College of Computing and Digital Media, DePaul University, Chicago, United States
jfurst@cdm.depaul.edu

Daniela Raicu
College of Computing and Digital Media, DePaul University, Chicago, United States
draicu@cdm.depaul.edu

Abstract—Detecting and identifying patterns in chest X-ray images of Covid-19 patients are important tasks for understanding the disease and for making differential diagnoses. Given the relatively small number of available Covid-19 X-ray images and the need to make progress in understanding the disease, we propose a transfer learning technique applied to a pretrained VGG19 neural network to build a deep convolutional model capable of detecting four possible conditions: normal (healthy), bacteria, virus (not Covid-19), and Covid-19. The transformation of the multi-class deep learning output into binary outputs and the detection of Covid-19 image patterns using the Grad-CAM technique show promising results. The discovered patterns are consistent across images from a given class of disease and constitute explanations of how the deep learning model makes classification decisions. In the long run, the identified patterns can serve as biomarkers for a given disease in chest X-ray images.

Index Terms—Neural Networks, Biomarkers, Covid-19, Artificial Intelligence

I. INTRODUCTION

Covid-19 is a new acute disease that can be deadly, with an estimated 2% case fatality rate [19]. Early diagnosis may be beneficial for timely decisions about the course of action to take in each case. Medical imaging plays an important role in the process of detection and diagnosis.
Computer-aided Diagnosis (CAD) systems may serve as a second opinion that complements a physician's assessment [9].

Artificial Intelligence (AI) algorithms have shown great progress in pattern recognition tasks, and in particular in medical image analysis. Over the last few years, deep learning models for image classification have developed rapidly. These models have been embedded in state-of-the-art systems to detect Covid-19 from medical images, particularly chest X-rays. However, even though these CAD systems achieve high prediction performance, many of them lack transparency about how their results were produced, and thus they deepen physicians' lack of trust in CAD [10]. An explanation of what a prediction is based on may therefore allow physicians to confirm, using their advanced domain knowledge, whether the prediction is likely to be correct. For example, in medical imaging, an explanation can take the form of showing which area of the image has the largest impact on the outcome of the model.

Given that the Covid-19 pandemic appeared very recently, the available data from Covid-19 patients is limited compared to that of other diseases. A useful technique for developing models that work with small datasets is transfer learning. This technique consists of first training a model to classify samples from a large dataset. At the end of this initial training, the model is assumed to have captured the low-level features of the samples in its first layers, while the high-level features leading to the final classification are captured in the layers closer to its output. By freezing the first layers of the model and retraining only its last layers on the new, possibly smaller dataset, the model is expected to capture the high-level features needed to classify the samples of the new dataset.
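The freeze-and-retrain step described above can be illustrated with a minimal sketch. It is a toy stand-in and not the model used in this paper: the "pretrained" layer is a small matrix of random weights rather than VGG19, the data are synthetic clusters rather than X-ray images, and the names `relu_layer`, `head_probs`, and `train_head` are hypothetical helpers introduced only for this illustration. What it shows is the mechanism itself: gradients update the final classification layer only, while the earlier layer is left untouched.

```python
import math
import random

# The four conditions named in the text.
CLASSES = ["normal", "bacteria", "virus", "covid19"]

def relu_layer(weights, x):
    """Frozen feature extractor: one dense layer followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in weights]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def head_probs(head, feats):
    """Trainable classification head: linear layer followed by softmax."""
    return softmax([sum(w * f for w, f in zip(row, feats)) for row in head])

def train_head(frozen, head, data, lr=0.1, epochs=200):
    """Stochastic gradient descent on the head only; `frozen` is never modified."""
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            feats = relu_layer(frozen, x)
            p = head_probs(head, feats)
            total += -math.log(p[y] + 1e-12)
            # The cross-entropy gradient w.r.t. the logits is (p - onehot(y)).
            for k in range(len(head)):
                g = p[k] - (1.0 if k == y else 0.0)
                for j in range(len(feats)):
                    head[k][j] -= lr * g * feats[j]
        losses.append(total / len(data))
    return losses

random.seed(0)
# Assumption: random weights standing in for layers pretrained on a large dataset.
frozen = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(8)]
frozen_before = [row[:] for row in frozen]
# Freshly initialised head, one row of weights per class.
head = [[0.0] * 8 for _ in CLASSES]
# Assumption: a small synthetic "new dataset", one noisy cluster per class.
data = []
for label in range(len(CLASSES)):
    for _ in range(5):
        x = [random.gauss(1.0 if i == label else 0.0, 0.1) for i in range(4)]
        data.append((x, label))

losses = train_head(frozen, head, data)
print("first-epoch loss:", losses[0], "last-epoch loss:", losses[-1])
```

In a deep learning framework the same effect is usually obtained by marking the early layers as non-trainable (in Keras, for instance, by setting `trainable = False` on them) before fitting the model on the smaller dataset.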
Here we propose a transfer learning technique to develop a model capable of detecting four possible conditions from chest X-ray images: normal (healthy), bacteria, virus (not Covid-19), and Covid-19. Furthermore, we address the problem of explainability, i.e., how the model has arrived at its prediction. To that end we use the state-of-the-art Gradient-weighted Class Activation Mapping (Grad-CAM) technique described in [15] to identify the location of biomarkers, i.e., measurable indicators of the medical condition. Grad-CAM is able to determine which areas of an input image have the largest impact on each of the possible outputs of the network.

2020 IEEE 20th International Conference on BioInformatics and BioEngineering (BIBE) © IEEE 2020. This article is free to access and download, along with rights for full text and data mining, re-use and analysis. DOI 10.1109/BIBE50027.2020.00077

Grad-CAM and related techniques have been used extensively to locate the areas of an image that contain detected elements; for instance, in an image containing a dog and a cat, Grad-CAM is able to highlight the areas of the image where each of them appears. In the case of chest X-rays used to detect a disease such as "Covid-19" the biomarkers may be