On Feature Traceability in Object Oriented Programs

Giuliano Antoniol, Ettore Merlo
Computer Engineering Department, École Polytechnique, Montreal, Canada
antoniol@ieee.org, ettore.merlo@polymtl.ca

Yann-Gaël Guéhéneuc, Houari Sahraoui
GEODES – Group of Open and Distributed Systems, Experimental Software Engineering
DIRO, University of Montreal, Canada
guehene@iro.umontreal.ca, sahraouh@iro.umontreal.ca

ABSTRACT
Open-source and industrial software systems often lack up-to-date documents on the implementation of user-observable functionalities. This lack of documentation is a particular hindrance for large systems. Moreover, as with any other software artifact, user-observable functionalities evolve through software evolution activities. Evolution activities sometimes have undesired and unexpected side-effects on other functionalities, causing these to fail or to malfunction. In this position paper, we promote the idea that a traceability link between user-observable functionalities and constituents of a software architecture (classes, methods, etc., implementing the functionalities) is essential to reduce the software evolution effort. We outline an approach to recover and to study the evolution of features (subsets of the constituents of a software architecture) responsible for a functionality.

Keywords
Feature traceability during evolution.

1. INTRODUCTION
Evolution of implementation and evolution of functionalities characterise the life of any software system. Successful systems operate for decades and often outlive the hardware and operational environments for which they were originally conceived, designed, and developed. Source code of industrial systems often evolves without the documentation being updated because maintaining consistency and traceability between high-level abstractions, functionalities, and software constituents is costly and time-consuming.
Documentation updates are also frequently neglected due to time and evolution pressure. High-level documentation, such as requirement or design documents, is often absent in open-source systems and no effort is made to provide traceability information. Yet, open-source systems are now common, e.g., most ADSL routers, modems, and firewalls run customised versions of the open-source Linux operating system.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
TEFSE’05, November 8th, 2005, Long Beach, CA, USA.
Copyright 2005 ACM X-XXXXX-XX-X/XX/XX ...$5.00.

Meanwhile, software evolution involves costly and tedious program comprehension activities to identify and to understand data structures, functions, methods, objects, classes, and, more generally, any high-level abstractions required by maintainers to perform the evolution. As of the year 2005, source code browsing is required during software evolution (and maintenance) because missing or obsolete documentation leads maintainers to rely on source code only. Source code browsing becomes increasingly resource-consuming as the size and the complexity of systems increase.

An alternative to source code browsing is the automated recovery of higher-level abstractions beyond those obtained by examining the system itself [7], such as program features. We define a program feature as a micro-architecture: a subset of a program architecture that groups the data structures, fields, classes, functions, and methods participating in the realisation of a user-observable functionality in a given scenario.
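As an illustration only, the definition above can be sketched as a small data structure. The representation is an assumption made for this sketch and is not taken from the approach itself: the names `Kind`, `Constituent`, `Feature`, and `is_within`, as well as the router-style constituents, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """Kinds of constituents named in the definition of a program feature."""
    DATA_STRUCTURE = "data structure"
    FIELD = "field"
    CLASS = "class"
    FUNCTION = "function"
    METHOD = "method"


@dataclass(frozen=True)
class Constituent:
    """One element of a program architecture (hypothetical representation)."""
    kind: Kind
    name: str


@dataclass(frozen=True)
class Feature:
    """A micro-architecture realising one functionality in one scenario."""
    functionality: str
    scenario: str
    constituents: frozenset

    def is_within(self, architecture: frozenset) -> bool:
        # By definition, a feature is a subset of the program architecture.
        return self.constituents <= architecture


# Invented constituents of a router's configuration software (assumption).
architecture = frozenset({
    Constituent(Kind.CLASS, "UserAccount"),
    Constituent(Kind.METHOD, "UserAccount.set_credentials"),
    Constituent(Kind.CLASS, "FirewallTable"),
    Constituent(Kind.METHOD, "FirewallTable.add_rule"),
    Constituent(Kind.FUNCTION, "write_config"),
})

set_credentials = Feature(
    functionality="set the user's name and password",
    scenario="administrator submits the account form with valid values",
    constituents=frozenset({
        Constituent(Kind.CLASS, "UserAccount"),
        Constituent(Kind.METHOD, "UserAccount.set_credentials"),
        Constituent(Kind.FUNCTION, "write_config"),
    }),
)

print(set_credentials.is_within(architecture))
```

Representing constituents as an immutable set keeps the "subset of the architecture" relation a plain set comparison; a real recovery tool would presumably need richer identifiers (files, signatures, releases) than the bare names used here.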
The scenario details the conditions and steps of realisation of the functionality. For example, in an ADSL router, setting the user’s name and password corresponds to one feature, as does adding a new firewall rule.

In this position paper, we support the idea of recovering program features automatically, i.e., of building traceability links between source code and user-observable functionalities, and of maintaining traceability links among subsequent releases of the same feature as well as among different features of a given release. We define a traceability link as an association between a micro-architecture and a user-observable functionality. A traceability link can be used to highlight differences among features in a release or among releases for a given feature. Indeed, information on a feature's evolution, along with the rationale for that evolution (bug lists, user requests), is essential to identify fault-prone features. Information on feature interactions is essential to avoid undesired side-effects on unmodified features during evolution.

Recovering features, i.e., identifying micro-architectures responsible for a functionality, and maintaining traceability links among releases and among features require the development of several technologies. We need to resort to:
• Static and dynamic analyses.
• 2D and 3D visualisation.
• Information retrieval and computational linguistics.

Recently, several authors (see, for example, [3, 8]) addressed the problem of identifying features in object-oriented systems. We concur with previous contributions that these