Simple and Efficient Implementation of Pattern Matching in MOLA Tool

Audris Kalnins, Edgars Celms, Agris Sostaks
University of Latvia, IMCS
29 Raina boulevard, Riga, Latvia
{Audris.Kalnins, Edgars.Celms}@mii.lu.lv, agree@os.lv

Abstract - One of the crucial problems for model transformation implementations is an efficient implementation of pattern matching. The paper addresses this problem for MOLA Tool, which implements the model transformation language MOLA. A second goal has been to keep the implementation as simple as possible. The paper presents one possible solution to the combined problem, in which an SQL database with a fixed schema is used as the MOLA runtime repository. A natural coding is selected, in which a MOLA pattern match can be mapped to a single non-standard self-join SQL query. The paper shows that sufficient matching efficiency can be obtained this way. The generated queries are analyzed from the point of view of table join order, and it is shown that the default query optimization of the MySQL database can find an order close to optimal. This analysis and the experiments performed lead to the conclusion that MySQL is currently the most suitable of the "free" relational databases for the MOLA implementation. In addition, benchmark tests based on a simple, natural model transformation problem are used to estimate the efficiency of the selected implementation architecture and to compare MOLA Tool to the popular graph transformation tool AGG. The benchmark tests confirm the efficiency of the current MOLA Tool implementation and the applicability of the MOLA language to MDD-specific tasks.

I. INTRODUCTION

Nearly all model transformation languages use pattern matching as the main functional element for defining how the source model components must be transformed to the target model. So does the transformation language MOLA analyzed in this paper.
When a transformation language is implemented, pattern matching is typically the most demanding component to implement and also the key factor determining the efficiency of the implementation. This issue has been analyzed theoretically in various contexts. For MOLA, the authors of this paper have already shown that a very efficient pattern matching implementation is possible in principle [1]; however, that implementation would require significant effort to build and is therefore appropriate only for an industrial tool. For other transformation languages, the most thorough analysis has been performed for the GReAT language [2].

In this paper, the problem appears in another setting. An academic model transformation tool supporting MOLA has been built using limited resources, and for this tool an implementation that is both simple and sufficiently efficient was required.

Another related problem is the choice of runtime repository, since pattern matching is closely tied to repository access mechanisms. A standard choice, used in most academic model transformation tools [3,4,5] and in some industrial ones [6,7], is a metamodel-based repository such as Eclipse EMF [8], MDR [9] or similar ones. These repositories typically have a low-level universal API for retrieving class instances. Using such a repository would make the implementation of pattern matching and other language features significantly more complicated.

Several possible solutions to these two related problems have been analyzed in the context of MOLA Tool. The final decision, which is described in this paper, turned out to be rather atypical for model transformation tools: the best kind of repository is a relational database with a fixed schema, whose tables code the metamodel and the model in the most natural way. The central idea of this implementation is that a MOLA pattern match operation can be implemented by a single SQL query, and that this query is easy to generate from the pattern definition.
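The idea of matching a pattern with one self-join query can be illustrated by a minimal sketch. It is not the MOLA Tool implementation: the two-table schema (`instance`, `link`), the sample model, the pattern (a Class owning an Attribute whose type is a Class) and all function names are assumptions made for this example, and SQLite is used only because it ships with Python, whereas the paper's experiments concern MySQL.

```python
import sqlite3

def build_repo():
    """Create a tiny in-memory repository with a fixed, model-independent schema.

    Hypothetical schema: one table for all instances, one for all links.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE instance (id INTEGER PRIMARY KEY, class_name TEXT NOT NULL);
        CREATE TABLE link (src_id INTEGER NOT NULL,
                           dst_id INTEGER NOT NULL,
                           assoc_name TEXT NOT NULL);
        CREATE INDEX link_src ON link (assoc_name, src_id);
    """)
    # Sample model: Class 1 owns Attribute 3 typed by Class 2;
    # Class 2 owns Attribute 4, which has no type link.
    conn.executemany("INSERT INTO instance VALUES (?, ?)",
                     [(1, "Class"), (2, "Class"),
                      (3, "Attribute"), (4, "Attribute")])
    conn.executemany("INSERT INTO link VALUES (?, ?, ?)",
                     [(1, 3, "ownedAttribute"),
                      (3, 2, "type"),
                      (2, 4, "ownedAttribute")])
    return conn

# One alias of `instance` per pattern element and one alias of `link`
# per pattern link: the whole pattern becomes a single self-join query.
PATTERN_QUERY = """
SELECT c.id, a.id, t.id
FROM instance AS c
JOIN link     AS l1 ON l1.src_id = c.id AND l1.assoc_name = 'ownedAttribute'
JOIN instance AS a  ON a.id = l1.dst_id AND a.class_name = 'Attribute'
JOIN link     AS l2 ON l2.src_id = a.id AND l2.assoc_name = 'type'
JOIN instance AS t  ON t.id = l2.dst_id AND t.class_name = 'Class'
WHERE c.class_name = 'Class'
ORDER BY c.id, a.id, t.id
"""

def match_pattern(conn):
    """Return every (class, attribute, type) triple that matches the pattern."""
    return conn.execute(PATTERN_QUERY).fetchall()

matches = match_pattern(build_repo())
print(matches)
```

Because the schema does not depend on the metamodel, a query of this shape could in principle be generated mechanically from a pattern, with the same two tables joined to themselves once per pattern element and link.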
The only remaining problem is whether such a rather non-standard query (one using multiple self-joins) can be processed efficiently by database engines. The analysis in the paper shows that not all engines perform efficiently enough, but that there are freely available ones which do; currently the best of them is MySQL. These results are consistent with other papers analyzing the usability of SQL for pattern matching [10,11], although a completely different database structure is used there and the experimental setting is also different.

The paper describes the solution used in MOLA Tool. After a brief reminder of the MOLA language and an overview of the MOLA Tool architecture, the core of the tool, the MOLA virtual machine (VM), is defined (Section 5). The most appropriate database structure and the mapping of a pattern to an SQL query are described in detail in Section 6. Section 7 analyzes the performance issues of the generated queries, especially the table join order. Section 8 contains a benchmark test that compares implementations of a transformation from simplified UML class diagrams to simplified OWL diagrams in MOLA Tool and in the popular graph transformation tool AGG [12] on various model sizes. The results confirm the efficiency of the MOLA implementation and its practical usability.