information
Article
Early Stage Identification of COVID-19 Patients in Mexico
Using Machine Learning: A Case Study for the Tijuana
General Hospital
Cristián Castillo-Olea 1,*, Roberto Conte-Galván 1, Clemente Zuñiga 2, Alexandra Siono 3, Angelica Huerta 4, Ornela Bardhi 5 and Eric Ortiz 6
Citation: Castillo-Olea, C.; Conte-Galván, R.; Zuñiga, C.; Siono, A.; Huerta, A.; Bardhi, O.; Ortiz, E. Early Stage Identification of COVID-19 Patients in Mexico Using Machine Learning: A Case Study for the Tijuana General Hospital. Information 2021, 12, 490. https://doi.org/10.3390/info12120490
Academic Editors: Sidong Liu,
Cristián Castillo Olea and
Shlomo Berkovsky
Received: 13 October 2021
Accepted: 22 November 2021
Published: 24 November 2021
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Copyright: © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
1 Ensenada Center for Scientific Research and Higher Education, Ensenada 22860, Mexico; conte@cicese.mx
2 Tijuana General Hospital, Tijuana 22000, Mexico; drclementezuniga@gmail.com
3 Faculty of Engineering, CETYS University, Mexicali 21259, Mexico; alexandra.siono@cetys.edu.mx
4 Faculty of Medicine and Psychology, Autonomous University of Baja California, Mexicali 21100, Mexico; angelica.huerta.d@gmail.com
5 Independent Researcher, 1001 Tirana, Albania; alenroidhrab@gmail.com
6 comeMed Teleconsulting, Colonia Roma, Mexico City 6700, Mexico; dr.ericortiz.oncomed@gmail.com
* Correspondence: cristian.castillo2@gmail.com; Tel.: +52-5574302237
Abstract: Background: The current pandemic caused by SARS-CoV-2 is an acute illness of global
concern. COVID-19 is an infectious disease caused by this recently discovered coronavirus. Most
people who get sick from COVID-19 experience mild, moderate, or severe symptoms. To support
quick decisions regarding treatment and isolation needs, it is useful to determine which
variables significantly indicate infection in the population served by the Tijuana General
Hospital (Hospital General de Tijuana). A Machine Learning model was developed to identify
significant early-stage variables in COVID-19 patients.
Methods: The individual characteristics of the study subjects included age, gender, age group,
symptoms, comorbidities, diagnosis, and outcomes. A mathematical model based on supervised
learning algorithms was developed to identify the significant variables that predict the
diagnosis of COVID-19 with high precision. Results: Automatic algorithms were used to
analyze the data. For Systolic Arterial Hypertension (SAH), the Logistic Regression algorithm
achieved 91.0% area under the ROC curve (AUC), 80% classification accuracy (CA), 80% F1,
80.1% precision, and 80% recall for the selected variables. For Diabetes Mellitus (DM), the
Logistic Regression algorithm obtained 91.2% AUC, 89.2% accuracy, 88.8% F1, 89.7% precision,
and 89.2% recall for the selected variables. The neural network algorithm showed better results
for patients with Obesity, obtaining 83.4% AUC, 91.4% accuracy, 89.9% F1, 90.6% precision, and
91.4% recall. Conclusions: Statistical analyses revealed that the most significant predictive
symptoms in patients with SAH, DM, and Obesity were fatigue and myalgias/arthralgias. The third
dominant symptom in patients with SAH and DM was odynophagia.
Keywords: machine learning; COVID-19; identification
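As an illustration of the evaluation scores quoted in the abstract (accuracy, precision, recall, and F1), the following minimal Python sketch computes them for a binary classifier. It is not the authors' code: the function name `weighted_metrics` and the toy labels are hypothetical, and the use of support-weighted class averages is an assumption that this section does not state. Under that assumption, weighted recall is always equal to accuracy, which would be consistent with the identical recall and accuracy values reported for each comorbidity group. AUC is not covered here because it requires predicted probabilities rather than hard labels.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1.

    Hypothetical helper for illustration; weighting by class support is an
    assumption, not something stated in the paper.
    """
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n

    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)
        prec_c = tp / predicted if predicted else 0.0
        rec_c = tp / count
        denom = prec_c + rec_c
        f1_c = 2 * prec_c * rec_c / denom if denom else 0.0
        weight = count / n  # class share of the data set
        precision += weight * prec_c
        recall += weight * rec_c
        f1 += weight * f1_c

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels (1 = COVID-19 positive, 0 = negative); invented for this example.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
scores = weighted_metrics(y_true, y_pred)
print(scores)
```

On these toy labels, accuracy and weighted recall coincide while weighted precision differs slightly, the same pattern seen in the reported figures.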
1. Introduction
A novel coronavirus, known as Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2),
was identified in December 2019 as the cause of a respiratory illness called Coronavirus
Disease 2019, or COVID-19 [1]. The origin of this virus is not yet confirmed, but an analysis
of its genetic sequence suggests it is phylogenetically related to bat viruses similar to SARS
(severe acute respiratory syndrome), making bats a possible key reservoir [2]. Symptoms
of COVID-19 infection appear after an incubation period of approximately 5.2 days [3].
The period from the onset of COVID-19 symptoms to death ranges from 6 to 41 days with a
median of 14 days [4]. This period depends largely on the age and the state of the patient’s
immune system [4].