Machine Learning based COVID-19 Cough Classification Models - A Comparative Analysis
Dr. Jayavrinda Vrindavanam, Associate Professor, Dept. of ECE, Nitte Meenakshi Institute of Technology, Bengaluru, India, jayavrinda.v@nmit.ac.in
Hari Haran Shankar, Student, Dept. of ECE, Nitte Meenakshi Institute of Technology, Bengaluru, India, hariharan0120@gmail.com
Dr. Raghunandan Srinath, Principal Member of Technical Staff, Graylinx Pvt Ltd, Bengaluru, India, raghunandan.srinath@graylinx.ai
Gaurav Nagesh, Student, Dept. of ECE, Nitte Meenakshi Institute of Technology, Bengaluru, India, gaurav.v.nagesh@gmail.com
Abstract— COVID-19 continues to be a global pandemic, and many technological interventions are already in place for identifying COVID-19 patients. This paper focuses on the contactless detection of COVID-19 patients by analyzing their cough audio samples. It demonstrates three machine learning classification models and determines the best classifier among them. The models make use of 15 dominant features, selected by ranking the scores produced by several feature selection algorithms. The initial results will form part of a larger project to develop suitable interfaces, as such devices can reduce the stress on frontline workers and help manage the resources and time of healthcare professionals efficiently. The proposed method has been tested on cough audio from both COVID-19-positive and healthy individuals, and the results are promising.
Keywords— COVID-19, COVID Cough, Cough Detection, Lung Disorders, Cough Sound Analysis, Machine Learning for audio analysis, Machine Learning for COVID-19 detection, Machine Learning for cough detection.
I. INTRODUCTION
The COVID-19 pandemic has triggered considerable research interest in classifying cough audio by patient type in order to find technological solutions for early identification of the disease. Cough audio modeling can provide diagnostic leads by applying machine learning tools and algorithms together with advanced feature extraction techniques and robust classification models. Because COVID-19 is extremely contagious, early detection plays an important role: affected patients can be quarantined well in advance as a proactive measure. Apart from helping to contain the disease, this also protects the front-line workers, who deal with all types of patients, from getting infected. Given the need to identify COVID-19 patients, machine learning algorithms are being used extensively to distinguish between COVID-19 and non-COVID-19 patients through the analysis of cough patterns.
Cough is a normal protective reflex that clears the respiratory tract and prevents noxious materials from entering the respiratory system. Dry cough is one of the major symptoms of COVID-19, along with elevated body temperature, and hence can be used as a medium for detecting the virus. Cough is associated with a characteristic sound, and in this paper the characteristics and pattern of the COVID-19 cough are identified by applying feature extraction techniques. The exercise forms part of a larger student research project on developing applications and platforms that can support audio-based diagnosis of COVID-19 and similar diseases. Among the alternatives for detecting COVID-19 patients from their cough patterns, one approach is to analyze the frequency, duration, and image pattern of the cough waveform.
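As a rough illustration of what analyzing the frequency and duration of a waveform can involve, the following minimal sketch splits a signal into overlapping frames and computes a few simple descriptors. It is not the feature set or code used in this paper: the function names, the 50 ms frame length, the 25 ms hop, and the zero-crossing frequency estimate are all assumptions chosen for the example, and a synthetic 440 Hz tone stands in for a real cough recording.

```python
import math


def frame_signal(samples, frame_len, hop_len):
    """Split samples into overlapping frames of frame_len, spaced hop_len apart."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop_len)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def describe(samples, sample_rate, frame_len=400, hop_len=200):
    """Summarize a waveform: duration, frame count, energy, rough frequency.

    frame_len and hop_len are illustrative defaults (50 ms and 25 ms at 8 kHz),
    not values taken from the paper.
    """
    frames = frame_signal(samples, frame_len, hop_len)
    if not frames:
        raise ValueError("signal is shorter than one frame")
    mean_zcr = sum(zero_crossing_rate(f) for f in frames) / len(frames)
    mean_energy = sum(short_time_energy(f) for f in frames) / len(frames)
    return {
        "duration_s": len(samples) / sample_rate,
        "n_frames": len(frames),
        "mean_energy": mean_energy,
        # A pure tone crosses zero twice per cycle, so frequency ~ ZCR * fs / 2.
        "approx_freq_hz": mean_zcr * sample_rate / 2,
    }


# Hypothetical input: one second of a 440 Hz sine sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
summary = describe(tone, sample_rate)
print(summary)
```

For the synthetic tone, the estimated frequency comes out close to 440 Hz. A real cough is far less periodic, so a zero-crossing estimate would only be a coarse descriptor there, and spectral features would normally be computed per frame instead.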
Other approaches include laboratory testing of cheek swabs, nose swabs, and blood samples, which is tedious: results from the advanced Reverse Transcription Polymerase Chain Reaction (RT-PCR) test take up to 2 days, while other tests such as antigen tests can provide a result in about 30 minutes. To reduce the strain on chemical labs and avoid generating chemical and toxic waste, a classification algorithm running on capable hardware can process cough audio and display results in a matter of seconds with the help of DSP chipsets and machine learning classification algorithms.
In this paper, we present an approach that splits cough audio from patients into frames and analyzes the waveforms on different parameters to classify the audio as that of a COVID-19 patient or a healthy person. For classification, we use three suitable machine learning models, namely Logistic Regression, Support Vector Machines (SVM), and Random Forest. We provide the first fifteen dominant features to these three classifiers to obtain the results and determine the most appropriate classifier for a real-world implementation of the COVID-19 cough detection algorithm.
Proceedings of the Fifth International Conference on Computing Methodologies and Communication (ICCMC 2021), IEEE Xplore Part Number: CFP21K25-ART. 2021 5th International Conference on Computing Methodologies and Communication (ICCMC) | 978-1-6654-0360-3/20/$31.00 ©2021 IEEE | DOI: 10.1109/ICCMC51019.2021.9418358
II. A REVIEW OF THE LITERATURE
Since the key focus of the paper is the classification of cough audio, this review attempts to provide an