978-1-7281-4341-5/20/$31.00 ©2020 IEEE Cython for Speeding-up Genetic Algorithm Ahmed Fawzy Gad Information Technology Faculty of Computers and Information Menoufia, Egypt ahmed.fawzy@ci.menofia.edu.eg Fatima Ezzahra Jarmouni École Nationale Supérieure d'Informatique et d'Analyse des Systèmes (ENSIAS) Rabat, Morocco fatimaezzahra-jarmouni@um5s.net.ma Abstract— This paper proposes a library for implementing the genetic algorithm using Python mainly in NumPy and speeding-up its execution using Cython. The preliminary Python implementation is inspected for possible optimizations. The 4 main changes include statically defining data types for the NumPy arrays, specifying the data type of the array elements in addition to the number of dimensions, using indexing for looping through the arrays, and finally disabling some unnecessary features in Cython. Using Cython, the NumPy array processing is 1250 times faster than CPython. The Cythonized version of the genetic algorithm is 18 times faster than the Python version. Keywords— Genetic Algorithm, Cython, Python, NumPy, Optimization I. INTRODUCTION Machine learning (ML) [1] is the field of making the machine learns by experience which is a time-consuming process. Some models such as artificial neural network (ANN) [2] may take days and weeks to be optimize it for finding the best parameters. Even a simple neural network with 3 fully- connected layers with 500, 200, and 10 neuros respectively has over 1 million parameters, which is a huge number of parameters to be optimized. Optimization techniques such as the genetic algorithm (GA) [3] are used for training such ML models to reach the highest accuracy or the least error. The steps followed by the genetic algorithm are given in Fig. 1. Based on calculating the fitness value for each individual in the population, the best individuals are selected as parents. Using the 2 variants (crossover and mutation), new offspring are created. The process continues for several generations (i.e. iterations). Figure 1. Genetic algorithm steps [3]. To find the good values for a model with a huge number of parameters, it may require hundreds or thousands of generations. The time per iteration increases according to the number of parameters to be optimized. This ends-up with overall training time that might last for some hours, days, or weeks. To reduce the amount of time, this paper proposes using Cython [4] for speeding-up the execution of an existing GitHub project [5] that builds a Python implementation for the genetic algorithm. Cython is a superset of Python that has a single power feature of being a statically typed language rather than dynamic in Python. In Python, the source code is compiled into bytecode based on the type of compiler (CPython for C compiler). It is compiled only once but could not be executed at its current form. An interpreter takes place between the compiler and the machine which runs in a virtual environment. This has the benefit of making Python cross-platform but unfortunately makes it slower than the pure compiled languages such as C. Python also has some features that makes it slow such as being a dynamic language in which the data type is assigned dynamically to the variables. This adds extra time to deduce the right data type. This paper builds a Python library for implementing the genetic algorithm. Being under active development, the library currently supports optimizing a function that accepts multiple inputs. The library allows the user to prepare an initial population, evolve the solutions in several generations using crossover and mutation, calculate fitness values for solutions per population to select the best ones as parents for mating, and repeating the process for several generations. Based on the Python implementation, Cython is used for speeding-up the execution of the algorithm. The code is compiled directly into machine code and thus no in-between interpreter exists in addition to supporting static data types that saves the time for dynamically assigning a data type to variables. This paper is organized as follows. Section I gives an introduction, section II discusses the Python implementation of the genetic algorithm, section III illustrates how Cython is used for optimizing the Python implementation, section IV discusses the experimental results, and finally section V draws some conclusions. II. PYTHON IMPLEMENTATION OF THE GENETIC ALGORITHM This section discusses the implementation of the genetic algorithm in Python mainly using NumPy. The library is currently designed for optimizing a function that has several parameters and the target is to find their best values to maximize the output. The user can freely specify another function that could do another task. The function is of the following form where the parameters are to , the equation inputs are to , and the output is . ൌ ൅ ൅ ൅⋯൅ Given a set of inputs, the genetic algorithm is to optimize the function to find the best values for the parameters that maximize the output.