COVID-19 detection through X-Ray chest images

Diego Hernandez
Department of Electronics, Telecommunications and Informatics
University of Aveiro
Aveiro, Portugal
dc.hernandez@ua.pt

Rodrigo Pereira
Department of Electronics, Telecommunications and Informatics
University of Aveiro
Aveiro, Portugal
rodrigo.pereira@ua.pt

Petia Georgevia
Department of Electronics, Telecommunications and Informatics
University of Aveiro
Aveiro, Portugal
petia@ua.pt

II. RELATED WORK

SARS-CoV-2 is a new virus and COVID-19 is a new disease, but beyond being a new biological and etiological entity, it is also the first worldwide pandemic we have faced in the era of big data. Thus, several calls have been made to the academic community to respond to the COVID-19 pandemic with data science, artificial intelligence and machine learning [2]–[5].

However, the problem of COVID-19 detection through X-Ray chest images is a new one and, to the best of our knowledge, there is no previous work on it. Although we did not find any related papers, we took inspiration from [6], where the CheXpert dataset was used to classify multi-labeled X-Ray images with the ResNet50 Convolutional Neural Network (CNN) architecture.

III. DATA AND COMPUTATIONAL RESOURCES

A. Data Retrieval

The available data about COVID-19 patients is still insufficient. However, the Italian Society of Medical and Interventional Radiology has made available a limited number of X-Ray images of patients infected with COVID-19 (https://www.sirm.org/category/senza-categoria/covid-19/). From 70 different cases, we selected 58 with a frontal perspective, as shown in Fig. 1.

Fig. 1. Dataset 1 samples: X-Ray images of a COVID-19 infected patient (left image) and a healthy patient (right image).

The second data source used in this study was the large dataset of pulmonary X-Rays named CheXpert (https://stanfordmlgroup.github.io/competitions/chexpert/), provided by Stanford University.
Details about the data are presented in Fig. 2.

Abstract—The new COVID-19 virus has proven to be a real threat to humanity. In this work we propose a machine learning approach to identify infected patients through X-Ray images of their lungs. Due to the scarcity of available data and limited computational power, we consider two approaches: i) build a custom Convolutional Neural Network (CNN) from scratch with a large dataset of historical non-COVID-19 pulmonary X-Rays, then tune the final layers with COVID-19 X-Ray images; ii) apply transfer learning through pretrained CNN models (ResNet, VGG, DenseNet), fine-tuning them with COVID-19 data. The second approach allowed us to reach around 90% accuracy on this challenging task.

Keywords—COVID-19, Transfer Learning, VGG, ResNet, DenseNet.

I. INTRODUCTION

Late in 2019, the first infection by the new coronavirus (SARS-CoV-2) was reported in the city of Wuhan (China). Since then, the virus has spread around the world, becoming the worst pandemic humanity has faced in this century. Testing and isolating carriers of the virus has proven to be crucial to stopping it. The current means of testing individuals is a Polymerase Chain Reaction (PCR) throat swab test, which has a sensitivity of 99% and a specificity of 98% if performed correctly [1]. However, the testing capacity of each country is still a problem.

Our hypothesis is that, although the effects of this virus are similar to those of pneumonia, there might be a differentiating factor in the lungs of the patient. This factor may be distinct, even if only slightly, from the effects of pneumonia. The objective of the present work is to determine whether the lung X-Ray images of COVID-19 patients share common characteristics that differentiate them from the X-Ray images of patients who do not have COVID-19. This is done through deep learning algorithms, the best of which reached around 90% accuracy.
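To make the sensitivity and specificity figures quoted for the PCR test concrete, the following minimal Python sketch computes them from outcome counts, together with accuracy, the metric reported for the classifiers in this work. The counts and the 5% prevalence are hypothetical values chosen only to reproduce the quoted 99% and 98%; they are not data from this study or from [1].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Outcome counts of a binary test, where positive means infected."""
    tp: int  # infected, test positive
    fn: int  # infected, test negative
    tn: int  # not infected, test negative
    fp: int  # not infected, test positive


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of infected individuals that the test detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-infected individuals that the test clears."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all individuals that the test classifies correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


def positive_predictive_value(sens: float, spec: float, prevalence: float) -> float:
    """Bayes' rule: probability of infection given a positive result."""
    true_pos = sens * prevalence
    false_pos = (1.0 - spec) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical counts (assumption): 100 infected and 1000 non-infected people.
counts = ConfusionCounts(tp=99, fn=1, tn=980, fp=20)

print(f"sensitivity = {sensitivity(counts):.2f}")
print(f"specificity = {specificity(counts):.2f}")
print(f"accuracy    = {accuracy(counts):.4f}")

# Hypothetical 5% prevalence (assumption), to show how the share of true
# positives among positive results depends on how common the infection is.
ppv = positive_predictive_value(sensitivity(counts), specificity(counts), 0.05)
print(f"PPV at 5% prevalence = {ppv:.3f}")
```

The same three functions apply unchanged to the predictions of an image classifier, provided its outputs are first tallied into the four counts.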
The paper is organized as follows. Section II briefly presents the related work on which we based our approach. The dataset and the computational resources are presented in Section III. In Sections IV and V, COVID-19 detection is treated as a binary and a multi-class problem, respectively. Finally, in Section V, conclusions are drawn.

© IEEE 2021. This article is free to access and download, along with rights for full text and data mining, re-use and analysis.