Jayshree Pawar et al., International Journal of Emerging Trends in Engineering Research, 9(8), August 2021, 1051-1056

ABSTRACT

Corona Virus Disease 2019 (COVID-19) has emerged as a serious health emergency worldwide. In most patients, the symptoms of COVID-19 are undetectable at an early stage. The disease spreads from person to person very rapidly and, if not treated early, causes severe sickness and loss of life in a number of cases. Data mining techniques are commonly used in the medical sector for the detection and prediction of a variety of diseases and medical conditions. A number of researchers are also working on predicting the possibility of COVID-19 infection in humans using machine learning techniques, specifically by applying data mining methods. This paper presents an extensive survey of the available literature on the prediction of COVID-19 infection and other diseases. The survey also covers data mining techniques, models and the various datasets used.

Key words: Data Mining, Machine learning, COVID-19, Prediction, Diagnosis, Feature Selection, Misclassification.

1. INTRODUCTION

The coronavirus epidemic has gripped the whole world. Countries on all continents are fighting to save their citizens from this deadly disease. On February 11, 2020, the World Health Organization announced the official name of the disease caused by this virus as "COVID-19" or "Corona Virus Disease 2019" [1]. COVID-19 is an infection caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The virus spreads rapidly among people in several ways, which is a major concern around the world [2], although it spreads primarily through the air [3]. According to a report, the World Health Organization first became aware of the outbreak in Wuhan city, China, on 31 December 2019.
Many of the first cases of COVID-19 were linked to the Huanan seafood wholesale market, suggesting that SARS-CoV-2 spread from animals to humans [4]. The pandemic is a major public health issue affecting people all over the world. The disease is contagious and, if not treated in its early stages, causes severe sickness and loss of life in a number of cases. According to a study based on parametric analysis, COVID-19 has a growth rate roughly twice that of SARS and MERS [5]. Health care workers have been working overtime for more than a year to fight this deadly virus.

Every day, the health-care industry generates a massive amount of data about COVID patients and the disease. Researchers and physicians across the globe are working together to detect the infection at an early stage and to find a treatment for the disease. Some researchers have analysed the severity of COVID infection in relation to specific existing conditions such as cancer, pneumonia, pregnancy and hypertension. Different researchers have also used different types of datasets, including chest X-ray images, CT scans and pathological reports of patients. Some researchers are focusing on test methods and attempting to minimise the testing workload [6].

Many researchers are using machine learning methods, especially data mining techniques, in the healthcare domain. These techniques are used to discover useful information in large amounts of data and to present it in a form that humans can easily understand. Classification and clustering are among the most common data mining techniques, and disease prediction is a very significant application of them. Machine learning algorithms are important for the diagnosis of diseases and have a significant impact in the medical field. Medical data mining is a term used to describe a variety of strategies for discovering valuable patterns that assist in medical diagnosis.
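As an illustration of the two technique families named above, classification and clustering, the following minimal Python sketch applies a k-nearest-neighbour classifier and k-means clustering to synthetic records. The two features (body temperature and oxygen saturation), their distributions, and all function names are assumptions made for illustration only; they are not taken from any dataset or method in the surveyed literature, and the values have no clinical meaning.

```python
import math
import random


def make_synthetic_records(n_per_group=30, seed=0):
    """Return invented (temperature in deg C, oxygen saturation in %) pairs with 0/1 labels."""
    rng = random.Random(seed)
    healthy = [(rng.gauss(36.8, 0.3), rng.gauss(98.0, 0.8)) for _ in range(n_per_group)]
    infected = [(rng.gauss(38.6, 0.4), rng.gauss(92.0, 1.0)) for _ in range(n_per_group)]
    return healthy + infected, [0] * n_per_group + [1] * n_per_group


def knn_predict(train_x, train_y, query, k=3):
    """Classification: majority label among the k nearest labelled records."""
    nearest = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))[:k]
    positives = sum(train_y[i] for i in nearest)
    return 1 if 2 * positives > k else 0


def kmeans(points, k=2, iterations=20):
    """Clustering: assign unlabelled records to k groups (requires k >= 2)."""
    # Deterministic start: k records spread evenly along the oxygen-saturation axis.
    ordered = sorted(points, key=lambda p: p[1])
    centroids = [ordered[round(i * (len(ordered) - 1) / (k - 1))] for i in range(k)]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each record joins its nearest centroid.
        assignment = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return assignment


records, labels = make_synthetic_records()

# Hold out every second record for testing the classifier.
train_x, train_y = records[0::2], labels[0::2]
test_x, test_y = records[1::2], labels[1::2]
predictions = [knn_predict(train_x, train_y, q) for q in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)

# Clustering sees no labels; its groups are compared with the labels only afterwards.
clusters = kmeans(records, k=2)
matches = sum(c == t for c, t in zip(clusters, labels))
agreement = max(matches, len(labels) - matches) / len(labels)

print(f"k-NN held-out accuracy: {accuracy:.2f}")
print(f"k-means agreement with true groups: {agreement:.2f}")
```

The classifier needs labelled records to learn from, whereas the clustering step receives only the feature values. This is the practical difference between the two families when labelled patient data are scarce.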
Medical data mining aims at improved disease prediction and early diagnosis, which in turn supports faster and better medical treatment and patient care [16]. Although several studies have been conducted on the prediction of COVID and other diseases using various machine learning techniques and a variety of datasets, very little literature surveying these studies is available. This work therefore presents a review of numerous studies on the prediction of COVID and other diseases.

A Survey on Data Mining Techniques for COVID Prediction
Jayshree Pawar 1, Urjita Thakar 2
1 Research Scholar, Department of Computer Engineering, Shri Govindram Seksaria Institute of Technology and Science, Indore, India, jayhreepawar.1010@gmail.com
2 Professor, Department of Computer Engineering, Shri Govindram Seksaria Institute of Technology and Science, Indore, India, urjita@rediffmail.com
ISSN 2347 - 3983
Volume 9. No. 8, August 2021
International Journal of Emerging Trends in Engineering Research
Available Online at http://www.warse.org/IJETER/static/pdf/file/ijeter02982021.pdf
https://doi.org/10.30534/ijeter/2021/02982021