Graph Design by Graph Grammar Evolution

Martin H. Luerssen, Member, IEEE, and David M. W. Powers, Senior Member, IEEE

Abstract—Determining the optimal topology of a graph is pertinent to many domains, as graphs can be used to model a variety of systems. Evolutionary algorithms constitute a popular optimization method, but scalability is a concern with larger graph designs. Generative representation schemes, often inspired by biological development, seek to address this by facilitating the discovery and reuse of design dependencies and allowing for adaptable exploration strategies. We present a novel developmental method for optimizing graphs that is based on the notion of directly evolving a hypergraph grammar from which a population of graphs can be derived. A multi-objective design system is established and evaluated on problems from three domains: symbolic regression, circuit design, and neural control. The observed performance compares favorably with existing methods, and extensive reuse of subgraphs contributes to the efficient representation of solutions. Constraints can also be placed on the type of explored graph spaces, ranging from tree to pseudograph. We show that more compact solutions are attainable in less constrained spaces, although convergence typically improves with more constrained designs.

I. INTRODUCTION

Natural and artificial instances of systems that can be represented as graphs are ubiquitous, and many problems of practical interest may be formulated as questions about graphs. While a variety of graphs are the product of self-organization, other graphs, such as the circuit of a microprocessor, must be designed. With competent human designers an ever-scarce resource, automatic design of graphs is therefore eminently useful.

Evolutionary algorithms (EAs) are a class of heuristic optimization algorithms that have been applied to various problems, including design.
However, they often scale poorly with the combinatorial explosion of configurations that exist for large graphs. Yet a large graph is not necessarily complex, and this is where self-organization can benefit even the designer. A few simple rules can describe a huge graph if it exhibits some form of regularity. A precedent exists in biological development, where genes (the rules) are expressed into a complex organism (the graph).

This paper begins with a general review of the evolutionary optimization of graph designs and, in particular, the application of developmental methods in this context. We then introduce a simple approach adapted from the formal technique of hyperedge replacement and combine it with a novel algorithm for grammar evolution to produce a design system titled G/GRADE (Graph GRAmmar Design by Evolution), which can capture patterns in evolved graph designs and facilitate their reuse in new solution candidates. As this constitutes a notable step beyond the existing emphasis in EAs on string and tree data structures, we explore the benefits and drawbacks of generalizing to graphs and compare results to other established techniques.

This work was in part supported by the 2006/2007 Flinders University Faculty of Science and Engineering Program Grant. The authors are with the Artificial Intelligence Laboratory, School of Informatics and Engineering, Flinders University of South Australia, Bedford Park SA 5042, Australia (email: martin.luerssen@flinders.edu.au; powers@ieee.org).

II. BACKGROUND

A directed graph is a quadruple (V, E, s, t) where V is a finite set of vertices, E is a finite set of edges, and s, t : E → V assign a source s(e) and a target t(e) to each e ∈ E. Pseudographs are graphs that exhibit loops joining a vertex to itself or multiple edges connecting the same pair of vertices. We will refer to solution candidates as networks, independent of whether they describe simple graphs or pseudographs, except when the distinction is relevant.
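As a minimal sketch of the definition above (it is not part of G/GRADE), the quadruple (V, E, s, t) can be written down directly in Python. The class, method, and function names are illustrative assumptions, and "multiple edges" is assumed here to mean edges that share both source and target. The helper adjacency_string shows a row-concatenated adjacency-matrix encoding for a simple graph; its length is the square of the vertex count, however regular the network may be.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class DirectedGraph:
    """Directed graph as the quadruple (V, E, s, t). Names are illustrative."""
    vertices: set   # V: finite set of vertices
    edges: set      # E: finite set of edge identifiers
    source: dict    # s : E -> V
    target: dict    # t : E -> V

    def has_loop(self):
        # A loop is an edge whose source and target coincide.
        return any(self.source[e] == self.target[e] for e in self.edges)

    def has_multi_edge(self):
        # Assumption: parallel edges share the same ordered (source, target) pair.
        pairs = Counter((self.source[e], self.target[e]) for e in self.edges)
        return any(count > 1 for count in pairs.values())

    def is_pseudograph(self):
        return self.has_loop() or self.has_multi_edge()


def adjacency_string(graph, order):
    """Concatenate the adjacency-matrix rows into one bit string.

    Entries are 0/1, so parallel edges are not distinguished; the string
    always has len(order) ** 2 characters.
    """
    linked = {(graph.source[e], graph.target[e]) for e in graph.edges}
    return "".join("1" if (u, v) in linked else "0"
                   for u in order for v in order)


# A simple graph a -> b -> c.
simple = DirectedGraph(
    vertices={"a", "b", "c"},
    edges={"e1", "e2"},
    source={"e1": "a", "e2": "b"},
    target={"e1": "b", "e2": "c"},
)

# The same graph with an added loop c -> c, which makes it a pseudograph.
looped = DirectedGraph(
    vertices={"a", "b", "c"},
    edges={"e1", "e2", "e3"},
    source={"e1": "a", "e2": "b", "e3": "c"},
    target={"e1": "b", "e2": "c", "e3": "c"},
)

print(simple.is_pseudograph())                     # False
print(looped.is_pseudograph())                     # True
print(adjacency_string(simple, ["a", "b", "c"]))   # 010001000
```

Keeping edges as identifiers with separate source and target maps, rather than as vertex pairs, is what allows parallel edges to be represented at all; a set of pairs would silently merge them.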
The most straightforward representation of a network is to directly encode it as an adjacency matrix, the rows of which can be concatenated into a string for optimization by a genetic algorithm. As the string scales with the size of the network rather than its complexity, however, large networks become difficult to optimize even if they exhibit symmetry – a property common to many useful designs.

A. Biological Embryogeny

Biological designs exploit symmetries by employing a generative, highly indirect mapping between the evolved (genotype) and evaluated (phenotype) representations. The developmental process that mediates this, commonly also referred to as an embryogeny [1], is characterized by polygeny (multiple genes define a single phenotypic variable) and pleiotropy (changes to a single gene affect multiple phenotypic variables), which respectively facilitate the neutrality and modularity of design. Neutrality is defined by genotypic variations that fail to affect the phenotype, which has implications for the evolution of evolvability, an effect known as canalization [2]. Canalization is a form of genetic buffering which affects the exploration strategy of evolution by reducing the impact of new mutations and thus allowing a build-up of hidden genetic variation. A change in the selection objective or further variation may break down the canalizing system and lead to more rapid directional change than would otherwise be expected to occur. Neutral variations therefore allow distinct exploration strategies to be encoded in – and ultimately evolved with – the genotype [3]. In contrast, modularity concerns the effective partition of sets into distinct subsets that can be optimized independently [4]. Network designs may be encoded efficiently in terms of modules, thus reducing the dimensionality of the configuration space that must be searched.
In conjunction, neutrality and modularity contribute to an adaptive evolutionary process that we know to scale favorably with a variety of challenges.

Copyright © 2007 IEEE. Reprinted from the Proceedings of the IEEE Congress on Evolutionary Computation, Singapore, September 25–28, 2007.