Air Quality Prediction based on Decision Tree using Machine Learning

Abstract-Air pollution has become a severe problem due to urbanization, industrialization, and the burning of fossil fuels, among other factors. This paper focuses on the use of data mining techniques for predicting air quality using machine learning. It highlights the impact on human health of pollutants such as PM2.5 (particulate matter 2.5), PM10 (particulate matter 10), CO (carbon monoxide), NOx (oxides of nitrogen), SO2 (sulphur dioxide), and O3 (ozone), whose effects include respiratory and cardiovascular diseases, asthma attacks, strokes, and even death. We propose using data mining and artificial intelligence techniques to address the problem. Decision trees are used for classification and regression tasks and work by building a tree-like structure of decisions and their possible outcomes. The tree is constructed by recursively splitting the dataset on the feature that provides the highest information gain or the largest reduction in impurity, until a stopping criterion is met. Decision trees are easy to understand and can handle both continuous and categorical features, which makes them a popular algorithm in machine learning. The paper also discusses the importance of data mining in machine learning and its ability to identify patterns and relationships that would otherwise go unnoticed. This paper offers a practical approach to predicting the air quality of Bengaluru for the coming month by analyzing data from the previous year, and it provides insights into the use of decision trees and data mining for solving complex problems.

Keywords- Data mining, Artificial intelligence (AI), Air Quality Index (AQI), Decision tree

I. INTRODUCTION

Many sources contribute air pollutants to the atmosphere, changing its chemical composition and impacting the biotic environment. Maintaining proper air quality is crucial for the survival and health of living things.
However, the number of lung and respiratory disorders has been increasing in recent years because of the rise in pollutants, as depicted in Fig. 1.

Fig. 1. Scenario diagram for air pollution caused by the release of harmful gases into the air.

As shown in Fig. 1, the air is contaminated by urbanization, industrialization, the burning of fossil fuels, the release of harmful gases (such as NOx, CO, SO2, and O3) from old vehicle exhaust, and emissions from agriculture and thermal power plants. The combustion of gasoline, oil, diesel, or other fuels releases particles classified as PM2.5 [1], while PM10 includes the particles released at construction sites, landfills, agricultural operations, wildfires, and waste burning. The combustion of wood, charcoal, or other fuels releases CO, while the combustion of fuel in motor vehicles and other fuel-burning processes releases NOx. Sulphur dioxide (SO2) [2-3] is emitted when fossil fuels are used in power plants and other industrial facilities. Ozone, a naturally occurring form of oxygen found in the earth's stratosphere, helps to absorb ultraviolet light from the sun; at ground level, however, it is a pollutant that irritates the airways. All these pollutants contribute to respiratory illnesses, asthma attacks, heart attacks, miscarriages, a decline in lung function, and other conditions. These problems can be addressed by using data mining and machine learning [4-8] to identify patterns and trends that may not be apparent to humans.

Soumyalatha Naveen, School of Computer Science, REVA University, Bangalore, India
M.S. Upamanyu, School of Electronics and Communication, REVA University, Bangalore, India, upamanyums@gmail.com
Karun Chakki, School of Electronics and Communication, REVA University, Bangalore, India
Chandan M, School of Electronics and Communication, REVA University, Bangalore, India
P Hariprasad, School of Electronics and Communication, REVA University, Bangalore, India

2023 International Conference on Smart Systems for applications in Electrical Sciences (ICSSES) | 979-8-3503-4729-6/23/$31.00 ©2023 IEEE | DOI: 10.1109/ICSSES58299.2023.10200535
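
The abstract describes choosing, at each node, the split with the highest information gain. A minimal Python sketch of that criterion for a single continuous feature is given here for illustration. It is not the authors' implementation: the function names (entropy, information_gain, best_threshold) are hypothetical, and the PM2.5 readings and quality labels are made-up values, not data from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(labels, left, right):
    """Parent entropy minus the size-weighted entropy of the two children."""
    total = len(labels)
    weighted = (len(left) / total) * entropy(left) \
        + (len(right) / total) * entropy(right)
    return entropy(labels) - weighted


def best_threshold(values, labels):
    """Evaluate midpoints between sorted distinct values.

    Returns (threshold, gain) for the split with the highest gain,
    or (None, 0.0) if no split improves on the parent node.
    """
    distinct = sorted(set(values))
    best = (None, 0.0)
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        gain = information_gain(labels, left, right)
        if gain > best[1]:
            best = (threshold, gain)
    return best


# Hypothetical PM2.5 readings (µg/m³) with made-up air quality labels.
pm25 = [12.0, 25.0, 40.0, 95.0, 130.0, 180.0]
quality = ["good", "good", "good", "poor", "poor", "poor"]

threshold, gain = best_threshold(pm25, quality)
print(f"best split: PM2.5 <= {threshold}, information gain = {gain:.3f}")
```

A full decision tree would apply this search to every feature, split on the best one, and recurse on each child until a stopping criterion (for example, a maximum depth or a pure node) is reached.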