More Success and Failure Factors in Software Reuse

Tim Menzies and Justin S. Di Stefano

Abstract—Numerous discrepancies exist between expert opinion and the empirical data reported in Morisio et al.'s recent TSE article. The differences relate to which factors encouraged successful reuse in software organizations. This note describes how those differences were detected and comments on their methodological implications.

Index Terms—Reuse, machine learning.

1 INTRODUCTION

IN the April 2002 IEEE Transactions on Software Engineering article "Success and Failure Factors in Software Reuse" [1], Morisio et al. sought key factors that predicted successful software reuse. Their data came from a set of structured interviews conducted with the project managers of 24 European projects from 19 companies over the period 1994 to 1997. Those projects were trying to achieve company-wide reuse of between one and a hundred assets. Nine of those 24 projects were judged failures by their respective managers. Morisio et al. employed a well-designed interview process to collect a wide set of project attributes (for a complete listing of those attributes, see the Appendix).

There is much that is exemplary in the approach taken by Morisio et al. For example, their data collection method is well-documented. Also, an extensive manual analysis of their data is presented in the paper, including a full discussion of all nine failing reuse projects. Section 6 of that paper, A Reuse Introduction Decision Sequence, offers a detailed set of recommendations for organizations seeking to create reusable assets. Their related work section takes care to contrast their results with those of other researchers. An appendix to the paper shows a clustering analysis of the projects and the decision tree of Fig. 1, which Morisio et al. argue represents the two major predictors for reuse. Another exemplary feature of the Morisio et al. study was that they presented their entire data set in their article.
The inclusion of this data set allows other researchers to check their conclusions. When we checked those conclusions using several data miners, we found patterns that disagree with the decision sequence described in Section 6 of Morisio et al.'s paper. These differences are summarized in Fig. 2.

Before focusing on those disagreements, it is important to stress that, in many respects, we agree with Morisio et al. For example, Fig. 3 shows many attributes for which neither Morisio et al. nor we could find evidence that they predicted successful reuse (see the "no evidence" entries in Fig. 3). For example, one of these "no evidence" attributes was use of Development Approach = OO. We completely endorse Morisio et al.'s point that, e.g., switching to C++ is insufficient to guarantee a successful reuse project. As to the other "no evidence" attributes, our studies do not say they don't matter: only that they did not appear to matter in the projects sampled by Morisio et al. Fig. 3 also shows other attributes that both our studies report predict successful reuse. However, we could find only barely supportive or very weak support for some of those attributes (barely supportive and very weak support are defined below).

2 DATA MINERS

Having described where our studies agreed, we now describe where the application of three data miners caused us to disagree with the conclusions of Morisio et al. To do that, we must first describe our data miners.

The goal of data mining is to find important patterns in data sets. Analyzing these data sets by hand is problematic at best and can take substantial time and effort. It is both quicker and easier if a computer can be "taught" to search for these patterns. In the 21st century, data mining is a very mature field. Many powerful mining tools are freely available via the World Wide Web. This study applied three such mining tools to the Morisio et al.
data: the APRIORI association rule learner [3], the J4.8 decision tree learner [4], and the TAR2 treatment learner [5]. Our implementations of APRIORI and J4.8 come from the WEKA toolkit [4],¹ while TAR2 came from the treatment learning download page.² The essential details of these tools are summarized below.

Decision tree learners find mappings between class and nonclass attributes. The class attributes in the Morisio et al. data set were successful reuse and failed reuse, while the nonclass attributes are shown in the Appendix. Fig. 1 shows one example of such a mapping between class and nonclass attributes. Note that, of the nearly two dozen nonclass attributes collected by Morisio et al., only two appear in the decision tree. Decision tree learners seek the most informative attribute range, i.e., the one that splits the training data into subsets with the most similar classes. The process repeats recursively for each subset and returns one subtree for each recursive call. Different decision tree learners use different criteria for splitting the training sets. The CART algorithm [2] used by Morisio et al. splits on the Gini index. In our study, we used J4.8 [4], the Java variant of C4.5 [6] that comes with WEKA. C4.5 uses a splitting criterion based on information theory.

Decision tree learning can also be used to determine which attributes are most important via an attribute removal experiment. A decision tree has a root node which mentions the attribute range most useful in splitting the training data. If that attribute is removed from the training set and the learner is run again, then the root node of the new tree contains the next most important attribute. The results of attribute removal experiments on the Morisio et al. data are shown in Fig. 4:

- We say that an attribute is barely supportive if it is removed in the next attribute removal experiment but the classification accuracy does not change. Fig.
4 shows that Reuse Processes Introduced and Top Management Commitment are barely supportive attributes.

- We say that an attribute is very weak if it first appears as a nonroot node of a decision tree that is learnt very late in an attribute removal experiment, i.e., only after many more supportive attributes have been removed. Key Reuse Roles Introduced and Non-Reuse Processes Modified are very weak attributes since they only appeared in J4.8's decision trees after Experiment 4 of Fig. 4.

Association rule learners find attributes that commonly occur together in a training set. In an association LHS ⇒ RHS, no attribute can appear on both sides of the association, i.e., LHS ∩ RHS = ∅. The rule LHS ⇒ RHS holds in the example set with confidence c if c% of the examples that contain LHS also contain RHS.

The authors are with the Lane Department of Computer Science, West Virginia University, PO Box 6109, Morgantown, WV 26506-6109. E-mail: tim@menzies.com, justin@lostportal.net.

Manuscript received 11 Sept. 2002; accepted 10 Dec. 2002. Recommended for acceptance by S. Pfleeger. For information on obtaining reprints of this article, please send e-mail to: tse@computer.org, and reference IEEECS Log Number 117409.

1. http://www.cs.waikato.ac.nz/ml/weka/
2. http://www.ece.ubc.ca/twiki/bin/view/Softeng/TreatmentLearner

IEEE Transactions on Software Engineering, vol. 29, no. 5, May 2003, p. 474. © 2003 IEEE. Published by the IEEE Computer Society.
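To make the difference between the two splitting criteria concrete, the following Python sketch scores one candidate split both ways: the Gini index used by CART and the entropy-based information gain used by C4.5/J4.8. This is our illustration on an invented class distribution, not code from either paper and not the Morisio et al. data.

```python
from collections import Counter
from math import log2

def gini(labels):
    """Gini index: 1 - sum(p_i^2); 0 means a pure subset."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    """Shannon entropy in bits; 0 means a pure subset."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def score_split(subsets, impurity):
    """Size-weighted impurity of the subsets produced by a split."""
    n = sum(len(s) for s in subsets)
    return sum(len(s) / n * impurity(s) for s in subsets)

# Hypothetical class labels for 24 projects (15 successes, 9 failures).
parent = ["success"] * 15 + ["failure"] * 9
# Suppose splitting on some attribute range yields these two subsets:
left = ["success"] * 12 + ["failure"] * 2
right = ["success"] * 3 + ["failure"] * 7

# CART prefers the split minimizing weighted Gini; C4.5 prefers the
# split maximizing information gain (the drop in entropy).
gini_after = score_split([left, right], gini)
info_gain = entropy(parent) - score_split([left, right], entropy)
print(f"weighted Gini after split: {gini_after:.3f}")
print(f"information gain (bits):  {info_gain:.3f}")
```

On most data sets the two criteria rank candidate splits similarly, which is why CART and C4.5 often, but not always, grow comparable trees.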
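The attribute removal experiment can be mimicked with a root-only approximation: score every attribute by information gain, report the winner as the "root," remove it, and repeat. The sketch below is our own simplified illustration on an invented six-project table (the attribute names loosely echo the paper's, but the values are made up); J4.8 additionally grows full subtrees and tracks classification accuracy.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Entropy drop obtained by splitting the rows on attr."""
    n = len(labels)
    gain = entropy(labels)
    for value in {r[attr] for r in rows}:
        sub = [lab for r, lab in zip(rows, labels) if r[attr] == value]
        gain -= len(sub) / n * entropy(sub)
    return gain

# Invented project table: three nonclass attributes per project.
rows = [
    {"commitment": "yes", "processes": "yes", "approach": "OO"},
    {"commitment": "yes", "processes": "yes", "approach": "OO"},
    {"commitment": "yes", "processes": "no",  "approach": "proc"},
    {"commitment": "no",  "processes": "no",  "approach": "OO"},
    {"commitment": "no",  "processes": "no",  "approach": "proc"},
    {"commitment": "no",  "processes": "no",  "approach": "proc"},
]
labels = ["success", "success", "success", "failure", "failure", "failure"]

# Attribute removal: report the best "root," delete it, relearn, repeat.
remaining = ["commitment", "processes", "approach"]
order = []
while remaining:
    root = max(remaining, key=lambda a: info_gain(rows, labels, a))
    order.append(root)
    remaining.remove(root)
print("importance order:", order)
```

Each pass of the loop plays the role of one experiment in Fig. 4: the attribute that surfaces as the root is ranked next-most important and then withheld from the following pass.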
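The confidence measure for association rules can be stated in a few lines of Python. The records below are hypothetical, not rows from the Morisio et al. data set, and APRIORI itself layers candidate generation and a minimum-support threshold on top of this measure.

```python
def confidence(examples, lhs, rhs):
    """Confidence of LHS => RHS: the fraction of examples containing
    every LHS attribute that also contain every RHS attribute."""
    assert not (set(lhs) & set(rhs)), "LHS and RHS must be disjoint"
    matching_lhs = [e for e in examples if set(lhs) <= e]
    if not matching_lhs:
        return 0.0
    return sum(1 for e in matching_lhs if set(rhs) <= e) / len(matching_lhs)

# Hypothetical project records, each a set of attribute=value facts.
examples = [
    {"top_mgmt_commitment=yes", "reuse_processes=yes", "reuse=success"},
    {"top_mgmt_commitment=yes", "reuse_processes=yes", "reuse=success"},
    {"top_mgmt_commitment=yes", "reuse_processes=no", "reuse=failure"},
    {"top_mgmt_commitment=no", "reuse_processes=no", "reuse=failure"},
]

# Three examples contain the LHS; two of them also contain the RHS.
c = confidence(examples, ["top_mgmt_commitment=yes"], ["reuse=success"])
print(f"confidence = {c:.2f}")
```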