Genetic Programming with External Memory in Sequence
Recall Tasks
Mihyar Al Masalma
Dalhousie University
Halifax, NS, Canada
m.almasalma@dal.ca
Malcolm I. Heywood
Dalhousie University
Halifax, NS, Canada
mheywood@cs.dal.ca
ABSTRACT
Partially observable tasks imply that a learning agent has to recall
previous state in order to make a decision in the present. Recent
research with neural networks has investigated both internal and
external memory mechanisms for this purpose, as well as proposing
benchmarks to measure their effectiveness. These developments
motivate our investigation using genetic programming and an ex-
ternal linked list memory model. A thorough empirical evaluation
using a scalable sequence recall benchmark establishes the under-
lying strength of the approach. In addition, we assess the impact of
decisions made regarding the instruction set and characterize the
sensitivity to noise / obfuscation in the definition of the benchmarks.
Compared to neural solutions to these benchmarks, GP extends the
state-of-the-art to greater task depths than previously possible.
CCS CONCEPTS
· Computing methodologies → Genetic programming; Se-
quential decision making.
KEYWORDS
modularity, external memory, partially observable
ACM Reference Format:
Mihyar Al Masalma and Malcolm I. Heywood. 2022. Genetic Programming
with External Memory in Sequence Recall Tasks. In Genetic and Evolutionary
Computation Conference Companion (GECCO ’22 Companion), July 9–13,
2022, Boston, MA, USA. ACM, New York, NY, USA, 4 pages. https://doi.org/
10.1145/3520304.3528883
1 INTRODUCTION
Most learning agents are purely reactive, which is to say that their
output is a function of the current input alone. This is sufficient
for supervised learning tasks such as regression and classification
or unsupervised learning tasks such as clustering. However, more
general cognitive tasks, as encountered under partially observable
state,¹ imply that an agent has to interact with the environment and
recall events from the past in order to make decisions in the present.
With this in mind, there has been something of a resurgence of
interest in agents that support memory, particularly with respect
¹For example, as often experienced in robotics, planning and process control.
Permission to make digital or hard copies of part or all of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for third-party components of this work must be honored.
For all other uses, contact the owner/author(s).
GECCO ’22 Companion, July 9–13, 2022, Boston, MA, USA
© 2022 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-9268-6/22/07.
https://doi.org/10.1145/3520304.3528883
to neural networks (e.g. [5–7, 10]). One motivation for this is that
although recurrent neural networks are Turing Complete [11], this
does not mean that finding the recurrent connectivity appropriate
for solving a partially observable task is straightforward.
Similar observations have motivated the use of memory with
genetic programming (GP). Thus, adding indexed memory to tree
structured GP also supports Turing Completeness [13], but does not
necessarily result in the efficient development of internal state
representations [1, 2]. Indeed, Langdon [8] in particular demonstrated
that a prior decomposition of the memory interface (relative to an
external data structure) can be beneficial when evolving solutions
to partially observable tasks, i.e. signals controlling memory are
associated with different programs.
In this work, we are interested in revisiting the use of coevolved
modular GP controllers for external memory in partially observ-
able tasks. Particular attention is given to the formulation of a list
data structure and a coevolutionary modular approach for combining
the different GP memory controllers into a cohesive solution.
A benchmarking study is then performed over a set of scalable
sequence recall tasks as recently employed for the purpose of assessing
the efficiency of neural memory models [7, 10]. We are able
to demonstrate general solutions to the sequence recall benchmarks
and also illustrate the role that the instruction set plays in biasing
the quality of solutions provided.
The balance of the paper begins by introducing the external
memory model and formulation adopted for GP (§2). Specifically, we
assume canonical Tree structured GP as implemented in DEAP [3]
and emphasize how GP interfaces to a list data structure. Section 3
characterizes the scalable sequence recall benchmark as previously
proposed to assess the effectiveness of memory mechanisms in
neural networks [7, 10]. Section 4 presents the benchmarking study
while conclusions are drawn in Section 5.
2 EXTERNAL MEMORY MODEL AND GP
FORMULATION
We assume that GP will have the following general form:
• Canonical (tree structured) GP with an instruction set
composed from arithmetic and / or logical operations as im-
plemented in open source code distributions (DEAP assumed
in this work [3]). This also implies that selection, variation
and replacement operations are also generic. Naturally, this
implies that GP is purely reactive, i.e. has no capacity for
recurrent behaviours itself.
• External memory provides the mechanism for recalling
previous state(s) and will be modelled as a list data structure.
GP will therefore have to learn how to apply the list to solve
memory tasks by choosing between one of A commands at
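As a rough illustration of how a purely reactive GP program might drive such a list memory, consider the following sketch (in Python, matching the DEAP setting). The specific commands and their names here (PUSH, POP, READ, NOP) are our own illustrative assumptions, not the command set defined by the authors:

```python
class ListMemory:
    """External list memory driven by a small discrete command set.

    The command set below is an illustrative assumption; in this
    sketch the agent chooses between one of A = 4 commands per step.
    """
    PUSH, POP, READ, NOP = range(4)

    def __init__(self):
        self.cells = []       # the list data structure itself
        self.last_read = 0.0  # value fed back to the reactive GP program

    def execute(self, command, value=0.0):
        """Apply one command; return the value visible to the agent."""
        if command == self.PUSH:
            self.cells.append(value)           # store current observation
        elif command == self.POP and self.cells:
            self.last_read = self.cells.pop()  # recall and remove last entry
        elif command == self.READ and self.cells:
            self.last_read = self.cells[-1]    # peek without removing
        # NOP: leave memory and last_read untouched
        return self.last_read


# Example interaction: the GP program would select these commands
# step by step from the current observation plus last_read.
mem = ListMemory()
mem.execute(ListMemory.PUSH, 3.0)
mem.execute(ListMemory.PUSH, 7.0)
assert mem.execute(ListMemory.POP) == 7.0  # most recent entry recalled first
assert mem.execute(ListMemory.POP) == 3.0
```

In this arrangement the GP trees themselves remain stateless: all recall is mediated by the command choices issued to the external structure.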