A Study on Co-occurrence of various Lung Diseases and COVID-19 by observing Chest X-Ray Similarity using Deep Convolutional Neural Networks

Sashank Sridhar
Department of Computer Science and Engineering
College of Engineering Guindy, Anna University
Chennai, India
sashank.ssridhar@gmail.com

Rahul Seetharaman
Department of Computer Science and Engineering
College of Engineering Guindy, Anna University
Chennai, India
rahulseetharaman@gmail.com

Siddartha Mootha
Department of Computer Science and Engineering
College of Engineering Guindy, Anna University
Chennai, India
siddartha.mootha20@gmail.com

Dr. Arockia Xavier Annie Rayan
Department of Computer Science and Engineering
College of Engineering Guindy, Anna University
Chennai, India
annie@annauniv.edu

Abstract: COVID-19, an infectious disease, is currently the leading topic of conversation throughout the world. Declared a pandemic by the WHO, the virus attacks the respiratory system and causes dry cough, fever and, in severe cases, difficulty in breathing. In this paper, we analyse the similarity in features between the novel coronavirus disease 2019 and various other lung diseases such as Pneumonia, Pneumothorax, Atelectasis and Pleural Thickening. Chest X-ray scans in the posteroanterior view are collected for the various diseases. A Convolutional Neural Network based on the Residual Network (ResNet) is built to identify the similar regions in the chest X-rays of COVID-19 and the various lung diseases. The regions of similarity are visualized using class activation maps. A total of eleven conditions affecting the lungs are studied and compared to COVID-19. The results show that, of the eleven diseases considered, Atelectasis, Consolidation, Emphysema and Pneumonia are the most similar in nature to COVID-19. Diseases which our model detects as similar to COVID-19 occur either prior to the onset of COVID-19 or as a consequence of COVID-19.

Keywords: COVID-19, ResNet, Image Classification, Convolutional Neural Networks, Class Activation Maps.

I. INTRODUCTION

The coronavirus disease 2019, or COVID-19, is an infectious disease of viral origin first seen in Wuhan, Hubei province, China [1]. The coronavirus generally spreads from one person to another in close proximity via droplets and aerosols released by sneezing, coughing or even speaking. It is also found to linger on the surface of inanimate objects and can be transmitted by touching these objects and then one's eyes, nose or mouth. The WHO declared it a pandemic in March 2020 [2]. As of June 2020, the virus has spread to 187 countries, with over 8.7 million cases and 460,000 deaths [3]. The virus primarily affects the respiratory system, with individuals showing symptoms of fever, dry cough and tiredness. A healthy individual can recover from the virus without any debilitating conditions; however, the problem arises when a patient with an underlying medical condition is affected. These conditions range from chronic respiratory disease, type 2 diabetes mellitus and cardiovascular disease to an immunocompromised state [4]. Around 1 out of every 5 people who get COVID-19 becomes seriously ill and develops difficulty breathing [5].

© IEEE 2020. This article is free to access and download, along with rights for full text and data mining, re-use and analysis.

COVID-19 is a respiratory illness which attacks the lungs. Some lung diseases are a sign of the onset of COVID-19, while others are a consequence of its progression. This paper aims to identify the lung diseases that co-occur with COVID-19 by evaluating the percentage of region similarity of X-rays as well as the cosine similarity between COVID-19 and various other lung diseases such as Pneumonia, Fibrosis, Infiltration and Pneumothorax. The similarity is found by comparing the chest X-ray scans of COVID-19 patients with those of the above-mentioned diseases. Diseases whose lung features are also common in COVID-19 have a high percentage of similarity.
The diseases with a higher percentage of similarity co-occur with COVID-19.

Convolutional Neural Networks (CNN) are the most potent form of Artificial Neural Networks (ANN) for pattern recognition in images [6]. They outperform the generic Multilayer Perceptron because they are successful in capturing temporal and spatial dependencies for image classification [7]. There are various types of Convolutional Neural Networks such as LeNet [8], AlexNet [9], VGGNet16 [10], GoogLeNet / Inception [11] and ResNets [12]. In this work we use Residual Neural Networks (ResNets) to find the similarity between two diseases. ResNet has been observed to outperform the other convolutional neural network models in pattern recognition of images [13]. ResNets make it possible to build deeper networks while avoiding the vanishing gradient problem, because their shortcut connections let gradients bypass layers, thereby boosting accuracy.

Deep Convolutional Neural Networks (DCNN) are used to identify features of lung conditions within chest X-ray images of various diseases and to map how similar these features are to those in the X-rays of COVID-19 patients. We establish a correlation between the standalone diseases and their onset as a result of COVID-19. We also rank the diseases by their similarity to COVID-19 to show that the highest-ranked diseases occur concurrently with COVID-19.
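
As an illustration only, the two similarity measures named in this section (cosine similarity and percentage of region similarity) and the ranking step can be sketched as follows. The sketch assumes that each disease is summarised by a feature vector, for example one taken from a late ResNet layer, and by a class activation map scaled to [0, 1]. The function names, the 0.5 activation threshold, the intersection-over-union definition of region overlap, the placeholder disease names and the toy values are all assumptions made for this example; they are not taken from the implementation described in this paper.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Return 0.0 when either vector is all zeros to avoid dividing by zero.
    return float(a @ b / denom) if denom > 0 else 0.0


def region_overlap_percent(cam_a, cam_b, threshold=0.5):
    """Share (in %) of highly activated pixels that two activation maps have in common.

    Assumed definition: intersection over union of the thresholded maps.
    """
    mask_a = np.asarray(cam_a) >= threshold
    mask_b = np.asarray(cam_b) >= threshold
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return float(100.0 * np.logical_and(mask_a, mask_b).sum() / union)


def rank_by_similarity(reference, candidates):
    """Sort candidate diseases by cosine similarity to the reference, highest first."""
    scores = {name: cosine_similarity(reference, vec) for name, vec in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up feature vectors; the names are placeholders, not real results.
covid_features = [1.0, 0.0, 1.0, 0.0]
other_features = {
    "disease_B": [1.0, 1.0, 0.0, 0.0],
    "disease_C": [0.0, 1.0, 0.0, 1.0],
    "disease_A": [1.0, 0.0, 1.0, 0.0],
}
ranking = rank_by_similarity(covid_features, other_features)
for name, score in ranking:
    print(f"{name}: cosine similarity = {score:.2f}")

# Toy 2x2 activation maps with values in [0, 1].
cam_a = [[0.9, 0.8], [0.1, 0.2]]
cam_b = [[0.7, 0.1], [0.6, 0.0]]
print(f"region overlap = {region_overlap_percent(cam_a, cam_b):.1f}%")
```

Thresholding the maps before comparing them keeps the overlap measure focused on the regions the network weights most heavily. A different threshold or a different overlap definition would change the percentages and could change the ranking.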