Mining Repositories to Assist in Project Planning and Resource Allocation

Tim Menzies, Department of Computer Science, Portland State University, Portland, Oregon; tim@menzies.us
Justin S. Di Stefano, Chris Cunanan, Robert (Mike) Chapman, Integrated Software Metrics Inc., Fairmont, West Virginia; justin@lostportal.net, ccunanan@ismwv.com, Robert.M.Chapman@ivv.nasa.gov

Abstract

Software repositories plus defect logs are useful for learning defect detectors. Such defect detectors could be a useful resource allocation tool for software managers. One way to view our detectors is that they are a V&V tool for V&V; i.e. they can be used to assess if "too much" of the testing budget is going to "too little" of the system. Finding such detectors could become the business case that constructing a local repository is useful. Three counter-arguments to such a proposal are: (1) despite years of effort, no general conclusions have been reported from any such repository; (2) if such general conclusions existed, then there would be no need to build a local repository; (3) according to many researchers, no such general conclusions will ever exist. This article is a reply to these three arguments.

To appear in the International Workshop on Mining Software Repositories (co-located with ICSE 2004), May 2004; http://msr.uwaterloo.ca.

1 Introduction

To make the most of finite resources, test engineers typically use their own expertise to separate critical from non-critical software components. The critical components are then allocated more of the testing budget than the rest of the system. A concern with this approach is that the wrong parts of the system might get the lion's share of the testing resources.

Defect detectors based on static code measures of components in repositories are a fast way of surveying the supposedly non-mission-critical sections. Such detectors can be a V&V tool for V&V; i.e.
they can be used to assess if "too much" of the testing budget is going to "too little" of the system. As shown below, satisfactory detectors can be learnt from simple static code measures based on the Halstead [2] and McCabe [3] features¹. Such measures are rapid and simple to collect from source code. Further, the detectors learnt from these measures are easy to use.

¹ Elsewhere, we summarize those metrics [4]. Here we just say that Halstead measures reflect the density of the vocabulary of a function, while McCabe measures reflect the density of pathways between terms in the vocabulary.

Our experience with defect detectors has been very positive. Hence, we argue that organizations should routinely build and maintain repositories of code and defect logs. When we do so, we often hear certain objections to creating such repositories. This paper is our reply to three commonly heard objections. For space reasons, the discussion here is brief; for full details, see [5, 6].

The first objection concerns a lack of external validity. Despite years of research in this area, no standard static code defect detectors with demonstrable external validity (i.e. applicable in more than just the domain used to develop them) have yet emerged. Worse still, many publications argue that building detectors from static code measures is a very foolish endeavor [1, 7].

To counter the first argument, there must be some demonstration, from somewhere, that at least once another organization benefited from collecting such data. Paradoxically, making such a demonstration raises a second objection against local repository construction: if detectors are externally valid, then organizations don't need new data. Rather, they can just import data from elsewhere. To refute this "buy, not build" objection, it must be shown that detectors built from local data are better than detectors built from imported data.
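As a concrete, deliberately simplified illustration of how a detector can be learnt from such measures, the sketch below fits a single threshold on McCabe's cyclomatic complexity v(g). The metric values, defect labels, and the `learn_threshold_detector` helper are all hypothetical, chosen for illustration only; they are not the learners or the data used in this study.

```python
# Hedged sketch: learning a one-feature defect detector from static code
# measures. Real studies would use many module-level Halstead/McCabe
# features from a repository such as NASA's MDP; here we use a single
# made-up feature, McCabe's cyclomatic complexity v(g).

def learn_threshold_detector(modules):
    """Pick the threshold t such that predicting 'defective' when
    v(g) >= t maximizes accuracy on the training modules.

    modules: list of (v_g, defective) pairs.
    Returns (best_threshold, training_accuracy)."""
    candidates = sorted({v for v, _ in modules})
    best_t, best_acc = None, -1.0
    for t in candidates:
        correct = sum((v >= t) == defective for v, defective in modules)
        acc = correct / len(modules)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Hypothetical training data: (McCabe v(g), was the module defective?)
train = [(1, False), (2, False), (3, False), (4, False), (5, False),
         (7, True), (9, True), (10, True), (12, True), (15, True)]

t, acc = learn_threshold_detector(train)
print(t, acc)  # -> 7 1.0 on this toy data
```

On this toy data every module with v(g) >= 7 is defective, so the learnt rule separates the classes perfectly; on real repository data no single threshold would, which is why practical detectors combine many Halstead and McCabe features.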
Finally, if the proposal to build a repository survives objections one and two, then a third objection remains: why do we make such an argument now, when so many others have argued the opposite for so long? That is, the source of the opposition to static defect detectors must be explained.

The rest of this paper addresses these objections using the NASA case study described in the next section. Using that data, we show that externally valid detectors can be generated. Next, we show that these detectors can be greatly improved by tuning them to a local project. Finally, we identify potential sources of systematic error that may have led to prior negative reports about the merits of static code defect detectors.

2 Case Study Material

Our case study material comes from data freely available to other researchers via the web interface to NASA's Metrics Data Program (MDP) (see Figure 1). MDP contains around two dozen