Environmental Science and Pollution Research
https://doi.org/10.1007/s11356-022-21723-8

GIS APPLIED TO SOIL-AGRICULTURAL HEALTH FOR ENVIRONMENTAL SUSTAINABILITY

Machine learning-based time series models for effective CO₂ emission prediction in India

Surbhi Kumari¹ · Sunil Kumar Singh¹

Received: 20 January 2022 / Accepted: 25 June 2022
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022

Abstract
China, India, and the USA are the countries with the highest energy consumption and CO₂ emissions globally. According to datacommons.org, India's CO₂ emissions are 1.80 metric tons per capita, which is harmful to living beings. This paper therefore examines the detrimental effect of India's CO₂ emissions and predicts emissions for the next 10 years from univariate time-series data covering 1980 to 2019. We use three statistical models, namely the autoregressive integrated moving average (ARIMA) model, the seasonal autoregressive integrated moving average with exogenous factors (SARIMAX) model, and the Holt-Winters model; two machine learning models, namely linear regression and random forest; and a deep learning-based long short-term memory (LSTM) model. Bringing this variety of models together allows their predictions to be compared on the same data. The performance analysis, based on nine performance metrics, shows that LSTM, SARIMAX, and Holt-Winters are the three most accurate of the six models. The results indicate that LSTM is the best model for CO₂ emission prediction, with a MAPE of 3.101%, an RMSE of 60.635, and a MedAE of 28.898, as well as on the other performance metrics. A comparative study reaches the same conclusion. Therefore, the deep learning-based LSTM model is suggested as one of the most appropriate models for CO₂ emission prediction.
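The three error metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions of MAPE, RMSE, and MedAE, since the abstract itself gives no formulas. The function names and sample values are hypothetical and are not the authors' code or data.

```python
import math
import statistics


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def rmse(actual, predicted):
    """Root mean squared error, in the units of the series."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def medae(actual, predicted):
    """Median absolute error, in the units of the series."""
    return statistics.median(abs(a - p) for a, p in zip(actual, predicted))


# Hypothetical values for illustration only; not data from the paper.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(f"MAPE  = {mape(actual, predicted):.3f} %")
print(f"RMSE  = {rmse(actual, predicted):.3f}")
print(f"MedAE = {medae(actual, predicted):.3f}")
```

MAPE is scale-free, whereas RMSE and MedAE are expressed in the units of the series, and MedAE is less sensitive to a few large errors than RMSE.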
Keywords Time series forecasting · Linear regression · Random forest regressor · Air pollution · CO₂ emissions · Holt-Winters · LSTM

Responsible Editor: V.V.S.S. Sarma
* Sunil Kumar Singh: sksingh@mgcub.ac.in; sunilsingh.jnu@gmail.com
Surbhi Kumari: surbhigupta387@gmail.com
¹ Dept. of Computer Science and Information Technology, Mahatma Gandhi Central University, Motihari, Bihar, India

Introduction
According to the Ministry of Statistics and Programme Implementation, UN (United Nations, Department of Economic and Social Affairs, Population Division 2019), the population of India was 1,400,517,328 as of January 2022, based on interpolation of the latest United Nations data. India is just behind China and ranks second in the world. If it continues to grow at the same rate, it will soon catch up with China and even surpass it. Many environmental consequences may arise from this, but CO₂ emission remains the foremost concern because of the problems that follow from its rising rate (Bonga and Chirowa 2014). According to UN data, India's CO₂ emissions rose faster than the world average of 0.7%. Increased CO₂ will aggravate the world's food and water crisis and increase the incidence of natural disasters. Increasing CO₂ emissions can affect human health in two ways: directly and indirectly. CO₂ affects health directly when inhaled in high doses and can cause serious conditions such as breathlessness, blindness, dizziness, and even delirium (Ağbulut 2022). The indirect effects of high CO₂ emissions appear as global problems such as climate change, acid rain, and global warming (Ağbulut 2022; Bakay and Ağbulut 2021; Liu et al. 2020). All these effects of emissions are highly hazardous to human beings and the environment. Increased flooding, landslides, cloudbursts, and similar events are already evident and will increase further if we continue on the same path. According to a report in The Lancet (The Lancet 2016), approximately 6.5 million people worldwide die annually from severe diseases caused by air pollution. And this number is greater than