Developing Context-Free Grammars for Equation Discovery: An Application in Earthquake Engineering

Štefan Markič and Vlado Stankovski⋆

Abstract. In the machine-learning area of equation discovery (ED), context-free grammars (CFGs) can be used to generate equation structures that best describe the dependencies in a given data set. Our goal is to investigate possible strategies for incorporating domain knowledge into a CFG and to evaluate their effect on the results of the ED process. As a case study, the Lagramge ED system is used to discover equations that predict the peak ground acceleration (PGA) in an earthquake event. Existing equations for PGA represent rich domain knowledge and are used to form three different CFGs. The results demonstrate that including domain knowledge in a CFG that is neither too general nor too specific may lead to new, high-precision equation models for PGA.

Keywords: equation discovery, Lagramge, context-free grammar, domain knowledge, earthquake engineering, peak ground acceleration.

1 Introduction

Equation discovery (ED) is a subarea of machine learning that aims at the automatic induction of mathematical models expressed as equations. The goal is to find an equation structure, built from a given set of operators, functions and variables, that represents an appropriate model for the provided data set. ED systems like Lagramge¹ use the context-free grammar (CFG) formalism to restrict the hypothesis space of

Štefan Markič · Vlado Stankovski
Faculty of Civil and Geodetic Engineering, University of Ljubljana, Slovenia, Jamova cesta 2, SI-1000 Ljubljana
e-mail: vlado.stankovski@fgg.uni-lj.si

⋆ Corresponding author.

¹ Lagramge release 2.2, used in this study, is available as open-source software at http://www-ai.ijs.si/~ljupco/ed/lagrange.html (accessed 6 February 2012).

M. Ali et al. (Eds.): Contemporary Challenges & Solutions in Applied AI, SCI 489, pp. 197–203.
DOI: 10.1007/978-3-319-00651-2_27 © Springer International Publishing Switzerland 2013