A Comparative Study of Genetic Programming and Grammatical Evolution for Evolving Data Structures

Kevin Igwe
School of Mathematics, Statistics and Computer Science
University of KwaZulu-Natal
Pietermaritzburg, South Africa
igwekevin@gmail.com

Nelishia Pillay
School of Mathematics, Statistics and Computer Science
University of KwaZulu-Natal
Pietermaritzburg, South Africa
pillayn32@ukzn.ac.za

Abstract—The research presented in this paper forms part of a larger initiative aimed at automatic algorithm induction using machine learning. This paper compares the performance of two machine learning techniques, namely genetic programming and a variation of genetic programming, grammatical evolution, for automatic algorithm induction. The application domain used to evaluate both approaches is the induction of data structure algorithms. Genetic programming is an evolutionary algorithm that searches a program space for an algorithm or program which, when executed, will provide a solution to the problem at hand. Grammatical evolution is a variation of genetic programming which provides a more flexible encoding, thereby eliminating the sufficiency and closure requirements imposed by genetic programming. The paper first extends previous work on genetic programming for evolving data structures, providing an alternative genetic programming solution to the problem. A grammatical evolution solution to the problem is then presented. This is the first application of grammatical evolution to this domain and to the simultaneous induction of algorithms. The performance of these approaches in inducing algorithms for the stack and queue data structures is compared.

Keywords—algorithm induction; genetic programming; grammatical evolution; automatic programming

I. INTRODUCTION

The paper reports on a study that forms part of a project investigating automatic algorithm induction and design using machine learning.
One of the areas researched as part of this project is automatic algorithm induction as a means of automatic programming. Genetic programming, a machine learning technique for solving optimization problems, appears to be well suited to this purpose. Genetic programming searches a program space for a program which, when executed, will produce a solution to the problem at hand [1]. Each program is generally represented as a parse tree. Genetic programming has been successfully applied to various domains including data mining, natural language processing, image processing and electronic circuit design [2].

There have been various attempts at using genetic programming for automatic programming. In [3] genetic programming is used to evolve algorithms according to the imperative programming paradigm, using memory, iteration and modularization. Algorithms are evolved in an internal representation language to facilitate language independence and can be converted into any procedural programming language. As the field of genetic programming advanced, researchers started looking to good programming practices to improve the scalability and problem-solving ability of genetic programming. One such practice is object-oriented programming, which led to the extension of genetic programming to object-oriented genetic programming (OOGP) [4-7]. OOGP evolves object-oriented programs. This work has essentially focused on the induction of algorithms for method implementation rather than the evolution of classes and interfaces. Methods for a class are generally evolved simultaneously. OOGP has also been used for purposes of automatic programming [8-10]. Bruce [8] compares the sequential and simultaneous induction of methods to evolve object-oriented programs. Each method is an automatically defined function [11] and all methods are stored in indexed memory. The proposed approach for OOGP is evaluated in the domain of data structure algorithm induction. A similar approach is taken by Langdon [9].
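As a brief aside on the parse-tree representation mentioned earlier in this section, the following minimal Python sketch shows how a program stored as a tree can be executed by recursive evaluation. The function set (`add`, `sub`, `mul`), the nested-tuple encoding and the helper names `evaluate` and `size` are illustrative assumptions and are not taken from the systems discussed in this paper.

```python
import operator

# Hypothetical function set for illustration: name -> (callable, arity).
FUNCTIONS = {
    "add": (operator.add, 2),
    "sub": (operator.sub, 2),
    "mul": (operator.mul, 2),
}


def evaluate(node, env):
    """Execute a parse tree recursively.

    A tuple is a function node (name followed by child subtrees),
    a string is a variable looked up in env, and anything else is
    treated as a constant terminal.
    """
    if isinstance(node, tuple):
        name, *children = node
        func, arity = FUNCTIONS[name]
        if len(children) != arity:
            raise ValueError(f"{name} expects {arity} arguments")
        return func(*(evaluate(child, env) for child in children))
    if isinstance(node, str):
        return env[node]
    return node


def size(node):
    """Count the nodes in a parse tree (function nodes plus terminals)."""
    if isinstance(node, tuple):
        return 1 + sum(size(child) for child in node[1:])
    return 1


# The tree for the expression (x * x) + (y - 1).
program = ("add", ("mul", "x", "x"), ("sub", "y", 1))
result = evaluate(program, {"x": 3, "y": 5})  # 3*3 + (5-1) = 13
```

In this encoding, genetic operators such as subtree crossover or mutation would amount to replacing one nested tuple with another. The sketch covers representation and execution only and omits the evolutionary loop.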
Langdon's study researches the induction of both methods for classes and programs using instances of the classes. The approach is also tested for the evolution of data structure algorithms as well as solution algorithms for problems requiring the use of the evolved data structures. In [10] a rule-based expert system is used to induce an object-oriented design (OOD) from a program specification. The OOD forms input to a genetic programming component which evolves the methods for the program sequentially, allowing function calls between methods. More recent studies in the area of OOGP include initial investigations into grammar-based genetic programming for the evolution of object-oriented programs [12] and a combination of OOGP and linear genetic programming [13].

Grammatical evolution is a variation of genetic programming which aims at providing a more flexible encoding of programs, thereby allowing for programs to be generated in any language [14]. Grammatical evolution (GE) essentially evolves a population of binary strings which represent programs. The execution of a program involves converting the binary string into an integer, which is then mapped onto a grammar, resulting in a production rule of the grammar being executed [14]. We hypothesize that grammatical evolution has the potential to contribute to the domain of automatic object-oriented programming. To the authors' knowledge there has been no previous work on grammatical evolution for object-