Abstract: Ground-level ozone at high concentrations is a harmful air pollutant that affects humans, animals, and plants. Breathing ground-level ozone can trigger a variety of health problems, especially for the elderly, children, and people who have asthma. Ground-level ozone can also have damaging effects on vegetation and crops. The purpose of this research is to build a support vector regression model for predicting the hourly ground-level ozone concentration. In building the model, Pearson correlation is used to find the relationship between ozone, which is the dependent variable, and several independent variables such as temperature, relative humidity, nitrogen dioxide, and carbon monoxide. The air pollutant and meteorological data from 2012 to 2015 were collected at the northern air quality station, located in an urban area with a warm climate, from the Pollution Control Department, Chiang Mai, Thailand. The results of the correlation analysis show that temperature has the strongest positive relationship with ozone, whereas relative humidity has the strongest negative relationship with ozone. We use k-means clustering to categorize ozone into three groups and then assign a weight to each group. After that, we apply normalization to bring the ozone, temperature, and relative humidity values onto the same scale. In the training and testing processes, we use the normalized data and cluster weights as inputs to the model. In the evaluation phase, we compare the predictive performance of the support vector regression and multiple linear regression models based on three metrics: root mean squared error, index of agreement, and mean absolute percentage error.

Index Terms: urban air pollutant, ozone prediction, support vector regression, k-means clustering.

I. INTRODUCTION

Ground-level ozone (or tropospheric O3) is a pollutant that is dangerous to humans and vegetation [1, 2].
People with asthma or lung disease, the elderly, children, and people who are active outdoors may be particularly sensitive to ground-level ozone. High ground-level ozone episodes usually occur in the summer, when ozone formation is driven by reactions involving pollutants such as nitrogen dioxide (NO2), carbon monoxide (CO), sulfur dioxide (SO2), and PM10 particles. However, ozone concentrations are also sensitive to climate factors, including temperature and relative humidity.

K. Chaiyakhan is a lecturer with the Computer Engineering Department, Rajamangala University of Technology Isan, Muang, Nakhon Ratchasima, Thailand (e-mail: kedkarnc@hotmail.com).
P. Chujai is a lecturer with the Electrical Technology Education Department, Faculty of Industrial Education and Technology, King Mongkut's University of Technology Thonburi, Bangkok, Thailand (e-mail: pasapitchchujai@gmail.com).
N. Kerdprasop is an associate professor with the School of Computer Engineering, Suranaree University of Technology, Nakhon Ratchasima, Thailand (e-mail: nittaya.k@gmail.com).
K. Kerdprasop is an associate professor and chair of the School of Computer Engineering, Suranaree University of Technology, Nakhon Ratchasima, Thailand (e-mail: kittisakThailand@gmail.com).

In recent years, several methods have been proposed for predicting ozone concentration. The prediction of daily ozone concentration maxima in the urban atmosphere was proposed in [3]. The authors evaluated predictors prior to selecting variables for the model by computing the correlations between O3 and other pollutants (e.g., CO, NO, NO2, SO2, and suspended particles) as well as meteorological variables (e.g., wind speed, temperature, relative humidity, and cloud cover).
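The correlation-based screening described for [3] can be illustrated with a minimal sketch. It is not the procedure or the data from [3]: the function names `pearson_r` and `rank_predictors` and all sample values are hypothetical, chosen only to show how candidate predictors might be ranked by the absolute value of their Pearson correlation with O3.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def rank_predictors(target, candidates):
    """Return (name, r) pairs sorted by |r|, strongest relationship first."""
    scores = [(name, pearson_r(values, target))
              for name, values in candidates.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Toy values invented for illustration only; not measurements from any study.
ozone = [20.0, 35.0, 50.0, 65.0, 80.0]
candidates = {
    "temperature": [22.0, 26.0, 30.0, 33.0, 37.0],
    "relative_humidity": [85.0, 70.0, 62.0, 50.0, 41.0],
    "wind_speed": [2.0, 3.5, 1.0, 3.0, 2.5],
}

ranking = rank_predictors(ozone, candidates)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Ranking by the absolute value keeps strongly negative predictors alongside strongly positive ones, since both carry information for a regression model.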
A multiple linear regression model was constructed with the forward stepwise method and calibrated using data collected over a period of two years. Its predictive performance was evaluated by computing the daily O3 concentration maxima over the subsequent two years and comparing the predictions with measured values. Fuzzy time series have also been used for predicting daily ozone concentration maxima [4]. That work proposed two new fuzzy time series methods based on a two-stage linguistic partition to predict the daily maximum O3 concentration. In stage 1, the universe of discourse was partitioned into seven intervals using fuzzy time series based on the cumulative probability distribution approach (CPDA). In stage 2, each interval was repartitioned into three subintervals using the CPDA and the uniform discretion method (UDM). Both proposed methods show considerably improved performance in predicting the daily maximum ozone concentration. The artificial neural network is another effective machine learning algorithm used to forecast ozone concentration. An ozone concentration forecasting method based on a back propagation neural network optimized with a genetic algorithm, combined with support vector machine data classification, was proposed in [5]. The back propagation neural network (BPNN) was optimized using a genetic algorithm (GA) to achieve higher forecast performance. The support vector machine (SVM) and the GA-optimized BPNN were combined to forecast ozone concentration in Beijing. The dataset, covering March 2009 to July 2009, consists of temperature, humidity, wind velocity, and UV radiation. The models were tested using the records from August 2009. The prediction model shows strong forecasting performance and could be applied to real-life ozone forecasting in Beijing. The support vector machine has become popular for ground-level ozone prediction [6]. SVM can be operated in either regression or classification mode for prediction.
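One common way to see the difference between the two modes is through the loss that each one penalizes: support vector regression typically uses an epsilon-insensitive loss, while support vector classification typically uses the hinge loss. The sketch below shows these textbook losses only. It is not taken from [6], and the function names and the default `epsilon=0.1` are illustrative assumptions.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Regression loss: errors no larger than epsilon cost nothing."""
    if epsilon < 0:
        raise ValueError("epsilon must be non-negative")
    return max(0.0, abs(y_true - y_pred) - epsilon)


def hinge_loss(label, score):
    """Classification loss for a label in {-1, +1} and a real-valued score."""
    if label not in (-1, 1):
        raise ValueError("label must be -1 or +1")
    return max(0.0, 1.0 - label * score)


# Regression: a prediction inside the epsilon tube is not penalized,
# while one outside it is penalized by the excess over epsilon.
inside_tube = epsilon_insensitive_loss(50.0, 50.05)
outside_tube = epsilon_insensitive_loss(50.0, 51.0)

# Classification: a confident correct score is not penalized,
# while a score on the wrong side of the margin is.
confident_correct = hinge_loss(+1, 2.0)
wrong_side = hinge_loss(-1, 0.5)

print(inside_tube, outside_tube, confident_correct, wrong_side)
```

In the regression case the target is a continuous value such as an ozone concentration, whereas in the classification case the target is a class label such as whether a threshold is exceeded.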
For the standard support vector machine, the authors found that SVM is sensitive to class imbalance. Therefore, a cost-sensitive classification scheme was proposed for the standard support vector classification model (S-SVC) in order to investigate whether class imbalance degrades S-SVC. The S-SVC with this scheme is named CS-SVC.

Hourly Ground-level Ozone Concentration Prediction using Support Vector Regression
Kedkarn Chaiyakhan, Pasapitch Chujai, Nittaya Kerdprasop, and Kittisak Kerdprasop

Manuscript received December 10, 2016; revised January 16, 2017.

Proceedings of the International MultiConference of Engineers and Computer Scientists 2017 Vol I, IMECS 2017, March 15 - 17, 2017, Hong Kong
ISBN: 978-988-14047-3-2
ISSN: 2078-0958 (Print); ISSN: 2078-0966 (Online)