Hybrid Self-Attention NEAT: A novel evolutionary approach to improve the NEAT algorithm

Saman Khamesian s.khamesian@gmail.com
Hamed Malek h_malek@sbu.ac.ir

ABSTRACT

This article presents a "Hybrid Self-Attention NEAT" method that improves the original NeuroEvolution of Augmenting Topologies (NEAT) algorithm on high-dimensional inputs. Although the NEAT algorithm has shown significant results on a variety of challenging tasks, it cannot create a well-tuned network when the input representation is high-dimensional. Our study addresses this limitation by using self-attention as an indirect encoding method to select the most important parts of the input. In addition, we improve overall performance with the help of a hybrid method for evolving the final network weights. The main conclusion is that Hybrid Self-Attention NEAT can eliminate this restriction of the original NEAT. The results indicate that, in comparison with other evolutionary algorithms, our model achieves comparable scores on Atari games from raw-pixel input with a much lower number of parameters.

I. INTRODUCTION

Data plays an important role nowadays and is essential for producing and evaluating software and models. At the same time, data dimensionality has gradually increased with the spread of information, posing challenging problems for scientists. In machine learning, a field that researchers have widely embraced today, some earlier methods are not compatible with these kinds of changes. Deep neural networks, which can learn high-dimensional representations, have performed far better than other algorithms in many areas such as computer vision [1, 2, 3, 4], speech processing [5, 6], and reinforcement learning [7, 8]. However, these models are very large and complex, with hundreds of millions of parameters, and require plenty of computational resources. Moreover, their performance strongly depends on the network architecture and parameter configuration [9].
In recent years, many studies in deep learning have focused on discovering specialized network architectures for specific problems. Although there is a great deal of variation between deep neural network architectures, no definitive rule has been established for choosing between them [10]. Consequently, finding the right design and hyper-parameters is essentially reduced to a black-box optimization process. These configuration settings are usually adopted from previous studies, because manual testing and evaluation is a wearisome, time-consuming process that requires experience and expertise [9, 10]. An alternative approach is to use neuroevolution algorithms to help build deep neural network architectures and learn their hyper-parameters [11].

One of the most popular algorithms in the field of neuroevolution is the NEAT algorithm, introduced in 2002 by Stanley and Miikkulainen [12]. The method solves the problems encountered in earlier Topology and Weight Evolving Artificial Neural Network (TWEANN) algorithms [13] by enabling evolutionary operations between individuals of different lengths through historical markings, evolving networks incrementally by adding neurons or connections in each iteration, and protecting structural innovations by organizing them into species [13]. Because the NEAT algorithm uses direct encoding, problems with a high-dimensional input space (such as image-processing tasks) make it prone to generating vast and complex networks. As a result, finding a proper architecture either takes a long time or may yield a suboptimal network.
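To make the NEAT mechanics above concrete, the following is a minimal, hypothetical sketch of NEAT-style structural mutation: splitting an existing connection with a new node, and assigning historical markings (innovation numbers) so that the same structural change always receives the same number. All names here are illustrative simplifications, not the authors' implementation; real NEAT additionally involves crossover, weight mutation, and speciation.

```python
import random

# Global innovation record: maps a structural change (an edge) to the
# innovation number it was first assigned. These historical markings
# let NEAT align genes when crossing over genomes of different lengths.
INNOVATION = {"counter": 0, "history": {}}

def innovation_number(edge):
    """Return the (stable) innovation number for a connection gene."""
    if edge not in INNOVATION["history"]:
        INNOVATION["history"][edge] = INNOVATION["counter"]
        INNOVATION["counter"] += 1
    return INNOVATION["history"][edge]

class Genome:
    """Direct encoding: an explicit list of connection genes."""
    def __init__(self):
        # (in_node, out_node) -> {"weight", "enabled", "innovation"}
        self.connections = {}
        self.next_node = 0

    def add_node(self, n):
        self.next_node = max(self.next_node, n + 1)

def mutate_add_connection(genome, src, dst, weight=None):
    """Structural mutation: add a new connection gene."""
    if (src, dst) in genome.connections:
        return
    genome.connections[(src, dst)] = {
        "weight": weight if weight is not None else random.uniform(-1, 1),
        "enabled": True,
        "innovation": innovation_number((src, dst)),
    }

def mutate_add_node(genome, src, dst):
    """Structural mutation: split connection src->dst with a new node.

    The old connection is disabled; the new in-connection gets weight 1
    and the new out-connection inherits the old weight, so behaviour is
    initially preserved while the topology grows incrementally.
    """
    conn = genome.connections[(src, dst)]
    conn["enabled"] = False
    new = genome.next_node
    genome.add_node(new)
    mutate_add_connection(genome, src, new, weight=1.0)
    mutate_add_connection(genome, new, dst, weight=conn["weight"])
    return new
```

Note how direct encoding stores every node and connection explicitly: with raw-pixel inputs each input pixel becomes a node, which is exactly why, as discussed above, NEAT genomes grow vast on high-dimensional inputs.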