Q-LEARNING FOR DEVELOPMENT OF ADAPTIVE SIGNAL CONTROL ON URBAN INTERSECTION

Daniela Koltovska Nechoska 1

Abstract: Intelligent agent technology represents a new approach to the development and design of adaptive control strategies. These strategies incorporate a higher level of intelligence and are capable of self-learning and experience-based decision making. In this paper, an adaptive signal control strategy for an urban intersection has been developed and evaluated. Reinforcement learning techniques, and the Q-learning algorithm in particular, have been applied. The developed adaptive strategy has been tested under micro-simulation conditions using the VISSIM simulator. In order to assess the feasibility of the designed strategy, the intelligent agent results have been compared to those obtained in simulations of fixed-time and actuated control. The testing of the strategy has been performed on a real urban intersection. Both strategies reduced the average delay compared to the existing fixed-time signal control.

Key words: traffic signals, adaptive control, urban intersection, artificial intelligence, Q-learning

1. INTRODUCTION

For a long time, it was believed that systems responding to real-time traffic would bring significant benefits. However, numerous limitations have appeared, such as the reliance on models with a high level of detail, the uncertainty in predicting future traffic flows, the difficulty of estimating arrival times, and the lack of a self-adjusting mechanism [1]. With the emergence of the intelligent agent concept, a significant advance in information science has been made. Nowadays this concept is applied in traffic when developing adaptive control strategies. The underlying idea is that autonomous entities, known as agents, learn to behave in an optimal way through direct interaction with the system.
By applying machine learning (ML) algorithms that assign rewards or penalties according to the outcomes of the actions selected by the agent, an optimal policy for optimizing the traffic flow can be calculated [3].

The control strategy presented in this paper is performed by an agent. In order to embed the learning feature in the agent, the reinforcement learning (RL) method is applied, together with the Q-learning algorithm. The developed adaptive strategy has been tested under micro-simulation conditions using the VISSIM simulator. In order to assess the feasibility of the designed strategy, the intelligent agent results have been compared to those obtained in simulations of fixed-time and actuated control. The testing of the strategy has been performed on a real urban intersection. Both strategies reduced the average delay compared to the existing fixed-time signal control.

This paper is organized as follows. The second section describes the reinforcement learning (RL) technique and the Q-learning algorithm. The third section presents Q-learning control at an individual intersection. The simulation results are presented in the fourth section. Conclusions are presented at the end of the paper.

1 Daniela Koltovska Nechoska, PhD, Faculty of Technical Sciences - Bitola, email: daniela.koltovska@tfb.uklo.edu.mk

2. REINFORCEMENT LEARNING AND Q-LEARNING ALGORITHM

Reinforcement learning (RL) is a technique well known in the AI and machine learning (ML) communities. Reinforcement learning is a suitable technique for attempting to solve the traffic signal