978-1-7281-4341-5/20/$31.00 ©2020 IEEE
Cython for Speeding-up Genetic Algorithm
Ahmed Fawzy Gad
Information Technology
Faculty of Computers and Information
Menoufia, Egypt
ahmed.fawzy@ci.menofia.edu.eg
Fatima Ezzahra Jarmouni
École Nationale Supérieure d'Informatique et d'Analyse des
Systèmes (ENSIAS)
Rabat, Morocco
fatimaezzahra-jarmouni@um5s.net.ma
Abstract— This paper proposes a library for implementing
the genetic algorithm using Python mainly in NumPy and
speeding-up its execution using Cython. The preliminary
Python implementation is inspected for possible optimizations.
The 4 main changes include statically defining data types for the
NumPy arrays, specifying the data type of the array elements in
addition to the number of dimensions, using indexing for
looping through the arrays, and finally disabling some
unnecessary features in Cython. Using Cython, the NumPy
array processing is 1250 times faster than CPython. The
Cythonized version of the genetic algorithm is 18 times faster
than the Python version.
Keywords— Genetic Algorithm, Cython, Python, NumPy,
Optimization
I. INTRODUCTION
Machine learning (ML) [1] is the field of making the
machine learns by experience which is a time-consuming
process. Some models such as artificial neural network (ANN)
[2] may take days and weeks to be optimize it for finding the
best parameters. Even a simple neural network with 3 fully-
connected layers with 500, 200, and 10 neuros respectively
has over 1 million parameters, which is a huge number of
parameters to be optimized.
Optimization techniques such as the genetic algorithm
(GA) [3] are used for training such ML models to reach the
highest accuracy or the least error. The steps followed by the
genetic algorithm are given in Fig. 1. Based on calculating the
fitness value for each individual in the population, the best
individuals are selected as parents. Using the 2 variants
(crossover and mutation), new offspring are created. The
process continues for several generations (i.e. iterations).
Figure 1. Genetic algorithm steps [3].
To find the good values for a model with a huge number
of parameters, it may require hundreds or thousands of
generations. The time per iteration increases according to the
number of parameters to be optimized. This ends-up with
overall training time that might last for some hours, days, or
weeks.
To reduce the amount of time, this paper proposes using
Cython [4] for speeding-up the execution of an existing
GitHub project [5] that builds a Python implementation for the
genetic algorithm. Cython is a superset of Python that has a
single power feature of being a statically typed language rather
than dynamic in Python.
In Python, the source code is compiled into bytecode based
on the type of compiler (CPython for C compiler). It is
compiled only once but could not be executed at its current
form. An interpreter takes place between the compiler and the
machine which runs in a virtual environment. This has the
benefit of making Python cross-platform but unfortunately
makes it slower than the pure compiled languages such as C.
Python also has some features that makes it slow such as being
a dynamic language in which the data type is assigned
dynamically to the variables. This adds extra time to deduce
the right data type.
This paper builds a Python library for implementing the
genetic algorithm. Being under active development, the
library currently supports optimizing a function that accepts
multiple inputs. The library allows the user to prepare an
initial population, evolve the solutions in several generations
using crossover and mutation, calculate fitness values for
solutions per population to select the best ones as parents for
mating, and repeating the process for several generations.
Based on the Python implementation, Cython is used for
speeding-up the execution of the algorithm. The code is
compiled directly into machine code and thus no in-between
interpreter exists in addition to supporting static data types that
saves the time for dynamically assigning a data type to
variables.
This paper is organized as follows. Section I gives an
introduction, section II discusses the Python implementation
of the genetic algorithm, section III illustrates how Cython is
used for optimizing the Python implementation, section IV
discusses the experimental results, and finally section V draws
some conclusions.
II. PYTHON IMPLEMENTATION OF THE GENETIC ALGORITHM
This section discusses the implementation of the genetic
algorithm in Python mainly using NumPy. The library is
currently designed for optimizing a function that has several
parameters and the target is to find their best values to
maximize the output. The user can freely specify another
function that could do another task. The function is of the
following form where the parameters are
ଵ
to
, the
equation inputs are
ଵ
to
, and the output is .
ൌ
ଵ
ଵ
ଶ
ଶ
ଷ
ଷ
⋯
Given a set of inputs, the genetic algorithm is to optimize
the function to find the best values for the parameters that
maximize the output.