Air Quality Prediction based on Decision Tree
using Machine Learning
Abstract-Air pollution has become a severe problem due
to urbanization, industrialization, and the burning of fossil
fuels, among other factors. This paper focuses on the use
of data mining techniques for predicting air quality using
machine learning. The paper highlights the impact of
pollutants such as PM2.5 (particulate matter 2.5), PM10
(particulate matter 10), CO (carbon monoxide), NOx
(oxides of nitrogen), SO2 (sulphur dioxide), and O3
(ozone) on human health, including respiratory and
cardiovascular diseases, asthma attacks, strokes, and even
death. We propose using data mining and artificial
intelligence techniques to predict air quality. Decision
trees are used for classification and regression tasks and
work by building a tree-like structure of decisions and
their possible outcomes. The tree is constructed by
recursively splitting the dataset based on the feature that
provides the highest information gain or reduction in
impurity until a stopping criterion is met.
Decision trees are easy to understand and can handle both
continuous and categorical features, making them a
popular algorithm in machine learning. The paper also
discusses the importance of data mining in machine
learning and its ability to identify patterns and
relationships that would have otherwise gone unnoticed.
This paper offers a practical approach to predicting the air
quality of Bengaluru for the coming month by analyzing
data from the previous year. It also provides insights into
the use of decision trees and data mining for solving
complex problems.
Keywords- Data mining, Artificial intelligence (AI), Air
Quality Index (AQI), Decision tree
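The splitting rule described in the abstract can be illustrated with a minimal sketch. It assumes entropy as the impurity measure (the abstract names information gain or impurity reduction but does not fix a criterion) and uses a single numeric feature. The PM2.5 readings, the "good"/"poor" labels, and the function names are hypothetical and chosen only for illustration; they are not taken from the paper's dataset or implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on `value <= threshold`."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that leaves one side empty gains nothing
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def best_threshold(values, labels):
    """Test midpoints between sorted distinct values; return (threshold, gain)."""
    distinct = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    return max(((t, information_gain(values, labels, t)) for t in candidates),
               key=lambda pair: pair[1], default=(None, 0.0))


# Hypothetical PM2.5 readings with made-up quality labels (illustration only).
pm25 = [12.0, 20.0, 35.0, 60.0, 85.0, 110.0]
quality = ["good", "good", "good", "poor", "poor", "poor"]

threshold, gain = best_threshold(pm25, quality)
print(f"best split: PM2.5 <= {threshold}, information gain = {gain:.3f}")
```

A full decision tree would apply the same search to every feature, split on the best one, and recurse on each resulting subset until a stopping criterion (for example, a maximum depth or a pure node) is reached.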
I. INTRODUCTION
Many sources contribute air pollutants to the
atmosphere, changing its chemical composition and
impacting the biotic environment. Maintaining good
air quality is crucial for the survival and health of
living things. However, the number of lung and
respiratory disorders has increased in recent years
because of the rise in pollutants, as
depicted in Fig. 1.
Fig. 1. Scenario diagram of air pollution caused by the release of
harmful gases into the air.
As shown in Fig. 1, the air is contaminated by
urbanization, industrialization, the burning of fossil
fuels, harmful gases (such as NOx, CO, SO2, and
O3) released in the exhaust of old vehicles, and
emissions from agriculture and thermal power
plants.
The combustion of gasoline, oil, diesel, or other
fuels releases particles classified as
PM2.5 [1], while PM10 includes particles
released at construction sites, landfills, agricultural
operations, wildfires, and waste burning. The
combustion of wood, charcoal, or other fuels
releases CO, while the combustion of fuel in motor
vehicles and other fuel burning processes releases
NOx. Sulphur dioxide (SO2) [2-3] is emitted when
fossil fuels are used in power plants and other
industrial facilities. Ozone, a naturally occurring
form of oxygen found in the earth's stratosphere,
helps to absorb ultraviolet light from the sun, but at
ground level it acts as a pollutant. All these
pollutants contribute to respiratory illnesses,
asthma attacks, heart attacks, miscarriages, a decline
in lung function, and other conditions. Data mining
and machine learning [4-8] can help address these
problems by identifying patterns and trends in
air-quality data that may not be apparent to humans.
Soumyalatha Naveen, School of Computer Science, REVA University, Bangalore, India
M.S. Upamanyu, School of Electronics and Communication, REVA University, Bangalore, India, upamanyums@gmail.com
Karun Chakki, School of Electronics and Communication, REVA University, Bangalore, India
Chandan M, School of Electronics and Communication, REVA University, Bangalore, India
P Hariprasad, School of Electronics and Communication, REVA University, Bangalore, India
2023 International Conference on Smart Systems for applications in Electrical Sciences (ICSSES) | 979-8-3503-4729-6/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICSSES58299.2023.10200535