Causes of Architecture Changes: An Empirical Study through the Communication in OSS Mailing Lists

Wei Ding 1,4, Peng Liang 1*, Antony Tang 2, Hans van Vliet 3

1 State Key Lab of Software Engineering, School of Computer, Wuhan University, China
2 Faculty of Science, Engineering and Technology, Swinburne University of Technology, Australia
3 Department of Computer Science, VU University Amsterdam, The Netherlands
4 Key Laboratory of Earthquake Geodesy, Institute of Seismology, China Earthquake Administration, China

tingwhere@whu.edu.cn, liangp@whu.edu.cn, atang@swin.edu.au, hans@cs.vu.nl

Abstract: Understanding the causes of architecture changes allows us to devise means to prevent architecture knowledge vaporization and architecture degeneration. But the causes are not always known, especially in open source software (OSS) development. This makes it very hard to understand the underlying reasons for the architecture changes and to design appropriate modifications. Architecture information is communicated in the development mailing lists of OSS projects. To explore the possibility of identifying and understanding the causes of architecture changes, we conducted an empirical study in which we analyzed architecture information (i.e., architectural threads) communicated in the development mailing lists of two popular OSS projects, Hibernate and ArgoUML, verified architecture changes against the source code, and identified the causes of architecture changes from the communicated architecture information. The main findings of this study are: (1) architecture information communicated in OSS mailing lists does lead to architecture changes in code; (2) the major cause of architecture changes in both Hibernate and ArgoUML is preventative changes; and (3) more than 45% of architecture changes in both projects happened before the first stable version was released, which indicates that the architectures of the investigated OSS projects are relatively stable after the first stable release.
Keywords: architecture change; cause of change; open source software; mailing list; communication

I. INTRODUCTION

Software architecture (SA) represents “the fundamental concepts or properties of a system in its environment embodied in its elements, relationships, and in the principles of its design and evolution” [1]. Systems continuously evolve and change to adapt to new uses, just as buildings change over time [2], and this evolution consequently leads to architecture changes. Understanding the causes of architecture changes is important because it helps practitioners understand the design decisions that lead to the architecture changes [3], and it allows researchers to devise means to prevent architecture knowledge vaporization and architecture degeneration [4]. The causes of architecture changes are regarded as an essential element of an architectural design decision, which is a first-class entity for representing architecture [5], and they are used to develop methods for dealing with specific architecture changes. For example, architects analyze what causes a property of an architecture to be inhibited in order to transform the architecture to satisfy non-functional requirements [6]; architectural styles are used as analysis tools to analyze the causes of architecture changes and, in turn, to predict the effects of those changes [7].

The vaporization of architectural knowledge (e.g., design decisions and causes of architecture changes) leads to increased maintenance costs [5]. To prevent this problem, developers (especially architects) need a way to record and communicate the causes of changes in architecture. With an explicit description of an architecture as well as its changes [8], software maintainers can better understand the ramifications of architecture changes and thereby more accurately analyze the impact and estimate the costs of modifications [9].
In reality, however, the rationale of architectural design decisions (e.g., their causes) is often not available in SA documentation [10], especially in OSS development, where SA is rarely documented (only 5.4% of 2000 investigated OSS projects have some SA documentation) [11]. We conjecture that the causes of architecture changes are communicated between developers through various media, especially in a distributed development context where face-to-face communication is difficult. Mailing lists are an important social medium for knowledge sharing between knowledge providers and knowledge seekers in OSS projects [12]. Our recent study has shown that communication on architecture does exist in the mailing lists of two popular OSS projects (Hibernate and ArgoUML) [13]; OSS development mailing lists may therefore act as a potential source from which to extract and identify the causes of architecture changes in a project.

* Corresponding author. This work is sponsored by the NSFC under Grant Nos. 61170025 and 61472286. (DOI reference number: 10.18293/SEKE2015-193)

One of the characteristics of many successful OSS projects is the existence of an SA [14]. Architecture change is also a widespread phenomenon in OSS development. For example, an investigation of the changes in the Linux kernel’s evolution indicates that the most remarkable growth for a “stable” version has been in the addition of new features and support for new architectures rather than in fixing defects [15]. To understand the causes of architecture changes [16][17], we conducted an empirical study to extract, identify, and analyze the architecture change information communicated in the mailing lists of two popular OSS projects, Hibernate and ArgoUML, based on the data we collected in [13] (i.e., architectural threads, which are sets of communication posts on the same topic that contain architecture information in mailing lists). The identified architecture changes in