Analysis and Prediction of COVID-19 Disease using Machine Learning
Shabnam Parmar¹, Rinkle Rani² and Nidhi Kalra³
¹⁻³Department of Computer Science and Engineering, Thapar Institute of Engineering and Technology, Patiala, India
Email: sparmar_me20@thapar.edu, raggarwal@thapar.edu, nidhi.kalra@thapar.edu
Abstract—In this research, a patient's symptoms and other factors are used to train
machine learning algorithms to predict whether the patient will die from or recover from
COVID-19. The highly infectious coronavirus disease COVID-19 is caused by the coronavirus
SARS-CoV-2. The virus spreads from an infected person's mouth and nose in small liquid
particles when they cough, sneeze, speak, or breathe. These particles vary in size, from
larger respiratory droplets to smaller aerosols. COVID-19 is transmitted by inhalation or by
touching the eyes, nose, or mouth after contact with a contaminated surface. When a large number of people
are present, the COVID-19 virus can spread rapidly. We examine a COVID-19
dataset to determine which models most accurately predict mortality among the virus's
most vulnerable patients. Machine learning is used to build and evaluate
a variety of prediction models. We used K-nearest neighbor (KNN), Support Vector
Machine (SVM), Gaussian naive Bayes (GNB), Decision tree (DT), and Logistic
regression (LR) classifiers to predict the death or recovery of symptomatic patients. A
variety of feature selection and extraction strategies were applied. Prediction
accuracy reached up to 96 percent for the KNN model with feature selection methods and for
the GNB model with feature extraction methods. KNN achieved an
accuracy of 96% with both feature selection and feature extraction techniques.
Index Terms— COVID-19, Decision Trees, Gaussian Naive Bayes, KNN, Logistic Regression,
Machine Learning, SVM, PCA, FastICA, K-Fold, Feature Selection and Extraction.
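As an illustration of the kind of evaluation summarized in the abstract, the following is a minimal sketch of a K-nearest neighbor classifier scored with K-fold cross-validation. It is not the authors' implementation: the data are synthetic stand-ins for patient records, the two features and the label encoding (0 = recovered, 1 = died) are assumptions made for the example, and the function names `knn_predict`, `k_fold_accuracy`, and `make_synthetic_patients` are hypothetical. Only the Python standard library is used.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def k_fold_accuracy(X, y, k_neighbors=3, n_folds=5, seed=0):
    """Mean KNN accuracy over n_folds shuffled, non-overlapping folds."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # Every n_folds-th shuffled index goes to the same fold.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    scores = []
    for held_out in folds:
        held = set(held_out)
        train_idx = [i for i in indices if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            knn_predict(train_X, train_y, X[i], k_neighbors) == y[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def make_synthetic_patients(n_per_class=50, seed=0):
    """Hypothetical stand-in data: two numeric features per patient.

    Label 0 = recovered, 1 = died (an assumed encoding, not from the paper).
    """
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((0, (0.0, 0.0)), (1, (6.0, 6.0))):
        for _ in range(n_per_class):
            X.append((rng.gauss(centre[0], 1.0), rng.gauss(centre[1], 1.0)))
            y.append(label)
    return X, y


X, y = make_synthetic_patients()
accuracy = k_fold_accuracy(X, y)
print(f"mean 5-fold accuracy: {accuracy:.2f}")
```

The same fold loop could score any of the other classifiers named above by swapping out `knn_predict`. A feature selection or extraction step would be fitted on the training folds only, so that no information from the held-out fold leaks into the model.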
I. INTRODUCTION
Recently, the world has seen rapid technological advancement, which underscores the critical role of progressive
countries. Today, every aspect of society, including education, employment, trade, military, and media, as well
as manufacturing and healthcare, is shaped by continuing and emerging technological advancements. The
healthcare centre may be a critical location for the rapid adoption of new technologies, ranging from diagnosis to accurate
identification and automated analysis of patients. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause
of coronavirus disease 2019 (COVID-19), was first identified in humans following the initial cases in the Chinese city
of Wuhan in December 2019 [1].
Machine learning is a promising classification technology. In general, it provides a useful framework
for inferring an unknown function, relationship, or structure between input and output variables. Such
relationships are extremely difficult to capture with explicit models, which is why machine learning
methods are widely employed to forecast the possible number of confirmed cases and hence the number
Grenze ID: 01.GIJET.9.1.577
© Grenze Scientific Society, 2023
Grenze International Journal of Engineering and Technology, Jan Issue