7TH WORKSHOP ON QUANTITATIVE APPROACHES IN OBJECT-ORIENTED SOFTWARE ENGINEERING (QAOOSE'2003)

Analogy-based software quality prediction

David Grosser, Houari A. Sahraoui and Petko Valtchev

Abstract — Predicting the stability of object-oriented systems is an important and challenging task. Classical approaches to quality prediction perform some form of inductive inference, starting from datasets of software items with known quality factor values and looking for typical features that discriminate among the items with respect to the quality factor. However, most of the effective methods for predictive model construction rest on the implicit hypothesis that the available samples are representative, which is a rather strong assumption. The approach we propose implements a similarity-based comparison principle: the quality factor (stability) of a given software item is estimated from the recorded stability of a set of other items that have been recognized as the most similar to that item among a larger set of items stored in a database. This approach is evaluated using the successive versions of the JDK API.

Index Terms — OO quality prediction, Case-based reasoning, OO class stability.

1 POSITION

Predicting the stability (i.e., the capability of a software system or item to evolve while preserving its design) of object-oriented (OO) systems is an important and challenging task. Classical approaches to quality prediction perform some form of inductive inference, starting from datasets of software items with known quality factor values and looking for typical features that discriminate among the items with respect to the quality factor. However, most of the effective methods for predictive model construction rest on the implicit hypothesis that the available samples are representative, which is a rather strong assumption.
In fact, unlike other experimental fields within disciplines such as medicine, sociology, or statistics, where free access to large repositories of consensual data is granted, software engineering has no such sources of data. To make matters worse, the nature of software and of the related software process makes the constitution of such consensual testbeds unrealistic.

Our position is that an alternative approach to building predictive models, one better suited to the particular context of software, is necessary. The approach we propose implements a similarity-based comparison principle: the quality factor (stability) of a given software item is estimated from the recorded stability of a set of other items that have been recognized as the most similar to that item among a larger set of items stored in a database.

In the context of our investigation, we restrict the definition of stability to the preservation of class interfaces, i.e., the set of attributes and methods. This choice was motivated by the observation that any change in a class interface can trigger a large number of side effects due to the dependencies (coupling) between classes.

The paper is organized as follows. First, the notion of stability in the context of the OO paradigm is discussed in section 2. In section 3, we briefly present a framework for stability measurement and prediction. Then, our approach is described in section 4. Section 5 follows with the experimental evaluation of the similarity-based approach.

2 STABILITY ISSUES

Throughout its life cycle, but especially during operation, a software system undergoes various changes, most often triggered by error detection or environment changes, but also by evolving requirements. As a result, the behavior of the software gradually deteriorates as modifications accumulate, and this decline in quality may go as far as the entire software becoming unpredictable [10].
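As an illustration of the similarity-based principle and of the interface-based view of stability stated in section 1, the following minimal Python sketch may help. It is not the authors' implementation: the representation of a class as a vector of metric values, the Euclidean distance, the averaging over the k nearest cases, and the fraction-based stability score are all assumptions made for the example, and the function names are hypothetical.

```python
from math import sqrt


def distance(a, b):
    """Euclidean distance between two metric vectors (assumed similarity measure)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_stability(item, cases, k=3):
    """Estimate the stability of `item` from its k most similar stored cases.

    `cases` is a list of (metric_vector, recorded_stability) pairs.
    The estimate is the mean recorded stability of the k nearest cases
    (an assumed aggregation rule, chosen only for illustration).
    """
    if not cases:
        raise ValueError("the case base is empty")
    nearest = sorted(cases, key=lambda case: distance(item, case[0]))[:k]
    return sum(stability for _, stability in nearest) / len(nearest)


def interface_stability(old_members, new_members):
    """Fraction of the old interface (attributes and methods) kept in the new version.

    A hypothetical score: 1.0 means every old member is preserved.
    """
    old = set(old_members)
    if not old:
        return 1.0  # nothing to preserve
    return len(old & set(new_members)) / len(old)


# Hypothetical case base: (metric vector, recorded stability) pairs.
cases = [
    ((1.0, 2.0), 1.0),
    ((1.5, 2.5), 0.5),
    ((10.0, 10.0), 0.0),
    ((0.9, 2.1), 1.0),
]
estimate = predict_stability((1.0, 2.0), cases, k=2)
```

In this sketch the recorded stability of each stored case could itself be obtained with `interface_stability`, by comparing the interface of a class in two successive versions. Other distance functions or aggregation rules could be substituted without changing the overall principle.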
Consequently, we claim that software intended to last must be designed in a way that helps it withstand such negative impact, i.e., remain stable in spite of requirements evolution. Unfortunately, as reported in [6], the relative awareness of this important topic within the community has not yet led to the broad adoption of stability-oriented design methods.

Much work has been dedicated to clarifying the stability concept. For instance, in [9], design rules are proposed that ensure the stability of large software systems through rigorous dependency management and abstraction. Similarly, [6] describes a design model that distinguishes between a kernel layer (Enduring Business Themes and Business Objects) and a peripheral layer (Industrial Objects) in the system, whereby the former attracts the major part of the stability enforcement effort. Indeed, it is suggested to keep the kernel stable while keeping the peripheral parts open to arbitrary changes. However, neither of the above studies suggests an effective method for stability evaluation, so their respective impact on the stability of the target systems is hard to estimate.

Other researchers have put the emphasis on stability evaluation and proposed effective methods for the problem [2], [5], although the scope of these methods remains limited to software frameworks.

————————————————
• D. Grosser is with IREMIA, University of Réunion Island, Réunion Island, France. E-mail: David.Grosser@univ-reunion.fr. This work was done when this author was in the Department of Computer Science and Operations Research, University of Montreal.
• H.A. Sahraoui is with the Department of Computer Science and Operations Research, University of Montreal, Montreal, PQ, Canada. E-mail: sahraouh@iro.umontreal.ca.
• P. Valtchev is with the Department of Computer Science and Operations Research, University of Montreal, Montreal, PQ, Canada. E-mail: valtchev@iro.umontreal.ca.

Another way to achieve the stability of software systems