Exploring relationship between COVID-19 cases
and eating habits using data of London boroughs
Abdulhadi Algbear
College of Computer Science and
Engineering
University of Jeddah
Jeddah, Saudi Arabia
Abdulhadi.IT@hotmail.com
Mohammed Ali Alqarni
College of Computer Science and
Engineering
University of Jeddah
Jeddah, Saudi Arabia
alqarni@uj.edu.sa
Muhammad Murtaza Khan
College of Computer Science and
Engineering
University of Jeddah
Jeddah, Saudi Arabia
mkhan@uj.edu.sa
Muhammad Usman Ilyas
College of Computer Science and
Engineering
University of Jeddah
Jeddah, Saudi Arabia
milyas@uj.edu.sa
Abstract—COVID-19 has affected everyone in the world in
one way or another. At the time of this writing, there are
approximately 110.9 million reported cases and approximately
2.4 million deaths across the world, which makes the ratio of
deaths to total infections a little over 2%. To better understand
the reasons for COVID-19 related infections and deaths, efforts
are underway to uncover relationships between them and
existing health conditions. Some studies have focused on causes
of infection and use of preventive equipment for protection,
while others have focused on identifying relationships between
deaths and existing diabetes, heart conditions or hypertension.
Research has established that pre-existing health conditions
can be associated with the eating habits of people. Therefore, we have
tried to determine if there is any relationship between eating
habits of people and COVID-19 infections. This has been done
by making use of data on purchases made by residents of
London boroughs at Tesco supermarkets. The data
related to pre-existing health conditions, for the same regions, was
obtained from the London Datastore. Our study indicates that
for the London Boroughs’ data, food products containing
alcohol, carbohydrates and fats are weakly correlated with the
number of COVID-19 cases. We believe that these results
warrant a more detailed investigation of causality.
Keywords—COVID-19, correlation, mutual information,
regression, food groups
I. INTRODUCTION
Modern data aggregation methods have made large,
diverse data sets available that can be used to determine and
establish relationships between different facets of life. Thus,
data about jobs, the economy, housing, health, the environment and
purchases is available and can be used to determine direct
relationships at scales at which this was not previously
possible. The availability of large amounts of data has ushered in
a new era in data analytics. When Coronavirus disease
2019 (COVID-19), caused by severe acute respiratory
syndrome coronavirus 2 (SARS-CoV-2), began spreading,
the collection, availability and analysis of data
became important; although less urgent than finding a treatment or
developing a vaccine, they were still important for tracking the
spread of the disease and identifying super spreaders [1][2].
Given that approximately 110.9 million people
have been infected with the virus [3], monitoring the spread
of COVID-19 is still an important area of research.
A secondary area of focus for researchers has been to
understand if there is any relationship between pre-existing
health conditions and COVID-19 infections or deaths. The
myths and conspiracies surrounding COVID-19 infection
due to stress were addressed by Georgiou et al. in [5]. The
authors clarified that people with stress are not
more likely to be affected by COVID-19 than others.
In [6], Jordan et al. observed that different studies based on
data collected from Wuhan, Italy and the UK cite an increased
risk of COVID-19 related deaths for people suffering from
pre-existing health conditions. However, they highlighted
that these studies comprised small populations, ranging
from 100 to 40,000 participants, with data that is not readily
available and, in some cases, incomplete. Therefore, there is
a need for improved data acquisition for analysis and, hence,
for reaching better conclusions. It was highlighted in [7], based
on a study by the Chinese Center for Disease Control (CDC) of
approximately 44,000 laboratory-confirmed positive cases, that
advanced age, heart conditions, cancer, hypertension,
chronic respiratory diseases and diabetes increase the risk of
fatality in case of a Coronavirus infection. Data collected
from patients in China suggested that smoking and obesity
were linked with higher risk of severe infection and death
[8]. In another study, Stefan et al. [9] identified that patients
with obesity are at increased risk for severe COVID-19
symptoms.
In this work we try to identify if eating habits have a
direct relationship with the number of COVID-19 cases in a
particular region. This is based on the assumption that eating
habits generally affect the health of an individual, and on the
observation that pre-existing conditions seem to have a
relationship with COVID-19. Therefore, it will be interesting
to see if any food group
has any relationship with the number of COVID-19 cases in
a geographic region. To conduct this analysis, data for
COVID-19 cases, along with the data of pre-existing health
conditions and data related to eating habits of people is
required for a particular region. Not all of this data was
available at the same spatial resolution and for the same
temporal window. However, we were able to compile data
from different sources to obtain data at the resolution of
boroughs for the London region.
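As an illustration of the borough-level alignment described above, the following minimal Python sketch joins two tables on the borough name and computes a Pearson correlation coefficient between them. It is only a sketch under stated assumptions: the borough names, all numeric values and the helper names join_on_borough and pearson are hypothetical placeholders, and the code is not the implementation or the data used in this study.

```python
from math import sqrt

# Hypothetical borough-level records. All values are made-up placeholders,
# not figures from the paper or from its data sources.
food = {  # borough -> fraction of purchased calories from carbohydrates
    "Borough A": 0.40,
    "Borough B": 0.45,
    "Borough C": 0.50,
    "Borough D": 0.55,
}
cases = {  # borough -> COVID-19 cases per 100,000 residents
    "Borough A": 1000.0,
    "Borough B": 1200.0,
    "Borough C": 1100.0,
    "Borough D": 1300.0,
    "Borough E": 900.0,  # has no food record, so the join drops it
}


def join_on_borough(left, right):
    """Inner join: keep only the boroughs present in both tables."""
    keys = sorted(set(left) & set(right))
    return keys, [left[k] for k in keys], [right[k] for k in keys]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


boroughs, x, y = join_on_borough(food, cases)
r = pearson(x, y)
print(boroughs, round(r, 3))
```

The inner join reflects the constraint that only regions covered by every source can enter the analysis. A coefficient computed this way indicates association only and does not by itself establish causality.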
The rest of the paper is organized as follows. Section II
introduces the sources and type of data used in this study.
Section III presents a correlation-based analysis between
2021 National Computing Colleges Conference (NCCC) | 978-1-7281-6719-0/20/$31.00 ©2021 IEEE | DOI: 10.1109/NCCC49330.2021.9428879