Automating Software Evolution

David Hearnden, Paul Bailes
School of ITEE
University of Queensland
Australia
hearnden@itee.uq.edu.au

Michael Lawley, Kerry Raymond
Distributed Systems Technology Centre (DSTC)
Australia

Abstract

Software maintenance and evolution are the most expensive activities in the software process, consuming 60% to 80% of the total time spent on a software system. However, our understanding of maintenance activities has barely developed beyond arbitrary change to arbitrary things. The standard categories of maintenance are based on subjective characteristics (purpose), rather than objective attributes. Only by understanding the relationships and dependencies between entities in the software process (such as specification, design and implementation) can we begin to objectively categorise and potentially automate aspects of software evolution.

1. Introduction

1.1. Evolution

Many studies have shown that the maintenance phase constitutes 60% to 80% of the total effort spent on a software product [1, 5, 12]. Contemporary iterative methodologies place even more importance on maintenance and evolution, to the point where software engineering is a process of continual evolution [13, 6].

1.2. State of the problem

Given the crucial role of evolution and maintenance in contemporary software engineering, we need mechanisms to increase the speed and the reliability (which will ultimately decrease the cost) of maintenance. To do this, we must strive to automate as many maintenance tasks as possible.

Ground-breaking work has been conducted on studying the effects of evolution (such as Lehman’s Laws of Software Evolution [10]); however, there is still little coherency in evolution practice. It is hard to objectively classify software evolution as anything more detailed than arbitrary changes to arbitrary things.
The difficulty with automating maintenance tasks is that we must first understand the mechanics of how maintenance affects the various artefacts of a software system. However, the standard classifications of maintenance types are based on human-centric semantics, such as purpose. The ISO/IEC Standard for Software Maintenance [8] and the IEEE Standard for Software Maintenance [7] define five types of maintenance activities between them: adaptive, corrective, perfective, preventive and emergency. The problem with this classification is that it is not based on objective characteristics of the changes that are made, but rather on the purpose of the change. This deficiency has been noted in [9].

One way we can add more useful information to our description of maintenance tasks is to realise that some changes are caused by other changes and, furthermore, that some changes are determined by other changes. Our intuitive approach to automation is to specify causal or root information and to expect machinery to compute consequent information. If we are able to analyse a set of changes and identify which of them are causal, we can begin to see that set of changes in a more structured way, allowing us to better understand those maintenance tasks. For example, we may classify implementation-level changes as being caused by design-level changes. Furthermore, if we are able to identify dependencies between the software artefacts that we wish to maintain, then not only can we identify causal relationships between changes to those artefacts, but we may also determine one change from another. Thus we are interested in dependency relationships between artefacts in order to identify where a set of root changes may induce other changes. We are also interested in those dependencies which are functional, meaning that the induced changes are not only caused by but also determined by root changes.

What we see when we start to explore the world of au-