Building a Family of Compilers

Wonseok Chae and Matthias Blume
Toyota Technological Institute at Chicago
{wchae, blume}@tti-c.org

Abstract

We have developed and maintained a set of closely related compilers. Although much of their code is duplicated, they have been maintained separately because they are treated as different compilers. Even if they were merged, the combined code would become too complicated to serve as the base for another extension. We describe our experience in addressing this problem by adopting the product line engineering paradigm to build a family of compilers. This paradigm encourages developers to focus on developing a set of compilers rather than one particular compiler. We show the engineering activities for a family of compilers, from product line analysis through product line architecture design to product line component design. We then present how to build particular compilers from the core assets resulting from these activities and how to take advantage of modern programming language technology to organize this task. Our experience demonstrates that product line engineering as a development paradigm can ease the construction of a family of compilers.

1. Introduction

When a new extension is added to a base compiler, we often copy the source code of the base compiler and edit it. This easy approach results in two different compilers that must be maintained separately even though much of their code is duplicated and should be shared. Even if we are lucky and manage to merge new extensions back into the base compiler so that there is only one code base, they quickly increase the overall complexity and make it harder for the compiler to serve as a base for further extensions. This is what we have experienced in the MLPolyR compiler project [1]. From the beginning, the MLPolyR compiler was expected to serve as a foundation for teaching and for programming language research by providing the basic infrastructure.
Indeed, it has served as the baseline for interesting new developments [2][3]. As the compiler grew larger, however, the accumulation of these experiments has undermined its original purpose as a starting point for new research.

One possible way of addressing this problem is to adopt the product line engineering paradigm. Product line engineering is a paradigm for developing a family of products [4][5][6]. In this emerging paradigm, a software product line is a set of software systems that share a common set of features with variations. Therefore, they are expected to be developed from a common set of software components (called core assets) on the same software architecture. This paradigm encourages developers to focus on developing a set of products rather than one particular product. Products are then built from core assets rather than from scratch.

Among the various product line approaches, we adopt feature-oriented product line engineering for the following reasons. First, feature-oriented product line engineering provides adequate means to reason about commonality and variability in terms of features [5]. Features express the commonality and variability among products well enough to define the differences (i.e., extensions) between compilers. Moreover, aspects of a compiler can be investigated more deeply by comparing it with similar compilers than by studying it in isolation. Second, it promotes architecture-based development [4], which fits compilers well because almost all compilers follow a pipe-and-filter architectural style; that is, a compiler is organized as several batch sequential phases, from parsing to code generation [19]. Each phase can be implemented individually as a component. Therefore, a set of compilers can be obtained by adding new components to, or replacing existing components in, the reference architecture. This increases both understandability and reusability.
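To make the pipe-and-filter argument concrete, the following is a minimal sketch of how a feature selection could determine which phases are assembled into a family member. It is an illustration only, not the MLPolyR implementation: the toy postfix language, the phase functions (parse, fold_constants, gen_code, evaluate), the feature names ("optimize", "interpret"), and build_compiler are all hypothetical, and Python is used purely for brevity.

```python
import operator
from typing import Callable, List, Set

# Hypothetical toy language: integer expressions in postfix form, e.g. "1 2 + 3 *".
OPS = {"+": operator.add, "*": operator.mul}
MNEMONIC = {"+": "ADD", "*": "MUL"}


def is_int(token: str) -> bool:
    """True for integer literals such as "7" or "-3"."""
    return token.isdigit() or (token[:1] == "-" and token[1:].isdigit())


def parse(src: str) -> List[str]:
    """Front end (common to every family member): split source into tokens."""
    return src.split()


def fold_constants(tokens: List[str]) -> List[str]:
    """Optional phase: fold an operator whose two operands are both literals."""
    out: List[str] = []
    for t in tokens:
        if t in OPS and len(out) >= 2 and is_int(out[-1]) and is_int(out[-2]):
            b = int(out.pop())
            a = int(out.pop())
            out.append(str(OPS[t](a, b)))
        else:
            out.append(t)
    return out


def gen_code(tokens: List[str]) -> List[str]:
    """Back end variant 1: emit instructions for an imaginary stack machine."""
    return [MNEMONIC[t] if t in OPS else f"PUSH {t}" for t in tokens]


def evaluate(tokens: List[str]) -> int:
    """Back end variant 2: interpret the program directly (literals only)."""
    stack: List[int] = []
    for t in tokens:
        if t in OPS:
            b = stack.pop()
            a = stack.pop()
            stack.append(OPS[t](a, b))
        else:
            stack.append(int(t))
    return stack.pop()


def build_compiler(features: Set[str]) -> Callable[[str], object]:
    """Assemble one family member from the shared phases (the core assets)."""
    phases: List[Callable] = [parse]                 # commonality
    if "optimize" in features:                       # optional component
        phases.append(fold_constants)
    # Alternative components: exactly one back end is selected.
    phases.append(evaluate if "interpret" in features else gen_code)

    def run(src: str) -> object:
        data: object = src
        for phase in phases:                         # batch sequential pipeline
            data = phase(data)
        return data

    return run
```

For example, build_compiler(set()) yields a plain compiler, build_compiler({"optimize"}) inserts the folding phase, and build_compiler({"interpret"}) swaps the back end, all without editing any existing phase. The point of the sketch is only that variation is expressed by which components are composed, not by copying and editing a code base.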
In this paper, we describe our experience in building a family of compilers based on product line engineering. After a brief introduction to the MLPolyR

12th International Software Product Line Conference 978-0-7695-3303-2/08 $25.00 © 2008 IEEE DOI 10.1109/SPLC.2008.28 307