Recommending Source Code for Use in Rapid Software Prototypes

Collin McMillan, College of William and Mary, Williamsburg, VA 23185, cmc@cs.wm.edu
Negar Hariri, DePaul University, Chicago, IL 60604, nhariri@cs.depaul.edu
Denys Poshyvanyk, College of William and Mary, Williamsburg, VA 23185, denys@cs.wm.edu
Jane Cleland-Huang and Bamshad Mobasher, DePaul University, Chicago, IL 60604, {jhuang,mobasher}@cs.depaul.edu

Abstract—Rapid prototypes are often developed early in the software development process in order to help project stakeholders explore ideas for possible features, and to discover, analyze, and specify requirements for the project. As prototypes are typically thrown away following the initial analysis phase, it is imperative for them to be created quickly with little cost and effort. Tool support for finding and reusing components from open-source repositories offers a major opportunity to reduce this manual effort. In this paper, we present a system for rapid prototyping that facilitates software reuse by mining feature descriptions and source code from open-source repositories. Our system identifies and recommends features and associated source code modules that are relevant to the software product under development. The modules are selected such that they implement as many of the desired features as possible while exhibiting the lowest possible levels of external coupling. We conducted a user study to evaluate our approach, and the results indicated that our proposed system returned packages that implemented more features and were considered more relevant than the state-of-the-art approach.

Keywords-software prototyping; domain analysis; recommender systems

I. INTRODUCTION

Rapid prototyping is a software development activity in which programmers build a prototype of a software product by iteratively proposing, reviewing, and demonstrating the features of that product [22].
It is designed to help project stakeholders explore the features they would like to include in a product, and to interact with the prototype in order to discover and specify requirements. As prototypes are generally thrown away, they must be built quickly and inexpensively, and must provide the flexibility to easily add or remove features. Other factors, such as efficiency or portability, are less important, as the prototype may not even share the same programming language or hardware platform as the final product [22]. Therefore, it is essential to minimize the manual effort involved in building prototypes, and to maximize automation and source code reuse. As such, tool support for automatically locating and reusing features from open-source repositories offers a tremendous opportunity for reducing this manual effort [22].

Rapid prototyping is often divided into a horizontal and a vertical phase [25]. In the horizontal phase, domain analysts identify an initial set of candidate features for implementation in the product. These features, which are often cursorily defined, are presented to the stakeholders for discussion, feedback, and refinement. This activity is often supported by domain analysis tools and techniques which identify features that are common across similar or competitive software systems [12], [15], [11]. However, such approaches provide only limited information about the implementation of those features.

In contrast, during the vertical phase of rapid prototyping, developers build full functionality for a selection of features identified during the horizontal phase. This provides a much richer user experience, in which project stakeholders can run the software and interact with the features in order to decide on specific use cases and to identify potential problems.

To reduce programming effort and shorten time-to-market, programmers can find and reuse existing solutions for their prototypes.
Source code search engines have been developed to locate implementations that are highly relevant to a feature specified by a programmer (e.g., via a natural-language query) [20], [24]. However, although these engines are effective for locating single features, they are not designed for the more complex, yet common, case in which a prototype will incorporate a set of interacting features. As a result, existing search engines often return packages that match only a small subset of the desired features, and developers have to invest considerable effort to integrate features from several different packages and projects. Under these circumstances, the cost and effort required for a programmer to comprehend and integrate the returned source code can significantly reduce the benefits of reuse [16].

In this paper we present a novel recommender system for supporting rapid prototyping. Our system directly addresses several shortcomings of existing techniques and tools by integrating the horizontal and vertical phases of rapid prototyping. Our approach first recommends features, and then locates and recommends relevant source code. We utilize a hybrid set of algorithms based on PageRank [17], set coverage, and Coupling Between Objects (CBO) [9] in order to maximize the coverage of features while proposing a set of packages that minimize the integration effort involved in building a prototype.

We implemented the recommender system and have conducted a cross-validation user study with 31 participants to compare the effectiveness of our approach against that of