Mining Design Patterns from C++ Source Code

Zsolt Balanyi and Rudolf Ferenc
Research Group on Artificial Intelligence, University of Szeged, Hungary
zsoca@rgai.inf.u-szeged.hu, ferenc@cc.u-szeged.hu

Abstract

Design patterns are micro architectures that have proved to be reliable, easy to implement and robust. There is a need in science and industry for recognizing these patterns. We present a new method for discovering design patterns in source code. This method provides a precise specification of how the patterns work by describing basic structural information like inheritance, composition, aggregation and association, and, as an indispensable part, by defining call delegation, object creation and operation overriding. We introduce a new XML-based language, the Design Pattern Markup Language (DPML), which provides an easy way for users to modify pattern descriptions to suit their needs, or even to define their own patterns or just classes in certain relations they wish to find. We tested our method on four open-source systems and found it effective in discovering design pattern instances.

Keywords

Design Patterns, DPML, C++, UML, ASG, Schema, Columbus

1 Introduction

Design patterns [9] are micro architectures that have proved to be reliable, easy to implement and robust. Hence they can serve as a measure of the quality of an object-oriented software system: a system can be characterized, among other things, by the number of design patterns used. Of course, one must fully understand the design patterns one would like to use, because when used improperly they can result in unnecessarily large class structures that can, in the worst case, even decrease the quality of the code.

There are three kinds of design patterns:

Creational – these patterns are concerned with the creation of objects. They can decide the type of the object and its multiplicity.
Here it is not enough to match the pattern structure, because the real functionality is hidden in the function implementations. Though these patterns are difficult to recognize, it is fortunately not impossible, because object creations can be identified in the source code.

Structural – these patterns deal with the composition of classes or objects. They define class hierarchies and different relations. In these patterns most features are described by the declarations of the operations and attributes, so they are easier to recognize than the creational ones.

Behavioral – these patterns describe how classes interact and distribute responsibility. Since the behavior is defined in the bodies of the operations, knowledge of the declarations alone is insufficient. This makes these patterns the most difficult to recognize.

The recognition of design patterns is a crucial question in reverse engineering, since they represent a high level of abstraction in OO design. As mentioned above, one possible usage is in measuring the quality of a software system. This can help project managers decide whether code is good enough to be used in a project. Good enough means that the code should be readily understandable, and that it should be easy to modify parts of the code without needing to modify the whole. So if the design patterns are well documented, it should be much easier to understand the source and to make appropriate modifications to a well-defined part of it.

Another possible usage is in helping to document source code that lacks proper comments on its patterns, in order to gain the advantages described above. Well-commented program code is much easier to maintain than code without comments or with poor comments. We can find pattern instances and in this way help insert comments where necessary.

Yet another possible usage is in forward engineering, when the