Combining Rule-based and Information Retrieval Techniques to assign Software Change Requests

Yguaratã Cerqueira Cavalcanti, Brazilian Federal Data Processing Service and Federal University of Pernambuco, Center for Informatics, ycc@cin.ufpe.br
Ivan do Carmo Machado, Federal University of Bahia, Computer Science Department, ivanmachado@dcc.ufba.br
Paulo A. da Mota S. Neto, Federal University of Pernambuco, Center for Informatics, pamsn@cin.ufpe.br
Eduardo Santana de Almeida, Federal University of Bahia, Computer Science Department, esa@dcc.ufba.br
Silvio Romero de Lemos Meira, Federal University of Pernambuco, Center for Informatics, srlm@cin.ufpe.br

ABSTRACT

Change Requests (CRs) are key elements in software maintenance and evolution. Finding the appropriate developer for a CR is crucial for obtaining the lowest economically feasible fixing time. Nevertheless, assigning CRs is a labor-intensive and time-consuming task. In this paper, we present a semi-automated CR assignment approach that combines rule-based and information retrieval techniques. The approach emphasizes the use of contextual information, which is essential to effective assignments, and puts the development team in control of the assignment rules, making its adoption easier. Results of an empirical evaluation showed that the approach is up to 46.5% more accurate than approaches that rely solely on machine learning techniques.

Categories and Subject Descriptors

D.2.7 [Software Engineering]: Distribution, Maintenance, and Enhancement; D.2.9 [Software Engineering]: Management - Life cycle, Productivity, Software quality assurance (SQA)

Keywords

Software Maintenance and Evolution; Change Request Management; Automatic Change Request Assignment; Bug Triage

1. INTRODUCTION

CRs are software artifacts that describe defects to be fixed or enhancements to be implemented in a software system [8]. CRs are managed with the support of CR repository software, such as Bugzilla¹.
A CR repository plays a fundamental role in the software maintenance process, being a common place for communication and coordination among different stakeholders [5].

¹ http://www.bugzilla.org

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ASE'14, September 15-19, 2014, Vasteras, Sweden.
Copyright 2014 ACM 978-1-4503-3013-8/14/09 $15.00.
http://dx.doi.org/10.1145/2642937.2642964

The task of assigning a CR, also referred to as CR triage, consists of selecting the most suitable software developer to handle a given CR. Generally, such a developer is the one who has enough expertise to handle the issues reported in the CR [2]. In addition, the assignment decision must take into account the developer's workload and availability, as well as the CR priority, in order to obtain the lowest economically feasible time to fix [9]. Thus, this task requires considerable knowledge of the project and good communication skills to negotiate with the stakeholders involved [7].

Assigning CRs to developers is both labor-intensive and time-consuming, since it is usually treated as a manual task [3]. Depending on the software project, the number of new CRs can vary from dozens to hundreds in a single day [8]. As a consequence, the greater the number of CRs that are opened, the more complex the problem becomes.

Several studies have proposed automated approaches to overcome the problem of CR assignment by using Information Retrieval (IR) techniques [8].
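As an illustration only, one simple way to apply IR to CR assignment is to rank developers by the textual similarity between a new CR and the CRs each developer has previously resolved. The sketch below uses plain term-frequency vectors and cosine similarity; it is not the method of any cited work, and the names `tokenize`, `cosine`, `suggest_developers`, the `(developer, text)` data layout, and the sample history are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_developers(new_cr, resolved_crs, top_k=3):
    """Rank developers by their most similar resolved CR.

    resolved_crs is a list of (developer, cr_text) pairs; this layout is
    an assumption made for the example, not a real repository schema.
    """
    query = Counter(tokenize(new_cr))
    best = defaultdict(float)
    for developer, text in resolved_crs:
        score = cosine(query, Counter(tokenize(text)))
        best[developer] = max(best[developer], score)
    # Highest score first; ties are broken alphabetically by developer name.
    ranked = sorted(best.items(), key=lambda item: (-item[1], item[0]))
    # Developers with no textual overlap at all are not suggested.
    return [dev for dev, score in ranked[:top_k] if score > 0]


# Hypothetical history of resolved CRs.
history = [
    ("alice", "crash when saving file to network drive"),
    ("bob", "login page rejects valid password"),
    ("alice", "file save dialog freezes on large files"),
    ("carol", "typo in installation guide"),
]

print(suggest_developers("application crash while saving a large file", history))
# prints ['alice']
```

A sketch like this considers only CR text. It has no notion of workload, availability, or priority, which the assignment decision described above also has to weigh.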
Some of these approaches are based on the hypothesis that the most suitable developer to handle a new CR is the one who has already solved similar CRs [1, 3, 9, 11, 14]. Other approaches consider that an appropriate developer can be found by looking at past CRs and data from version control systems [6, 10, 13] or source code [12]. In general, these approaches use IR techniques to automatically suggest a list of appropriate developers for a new incoming CR.

Despite the number of proposals, there is no empirical evidence about their applicability to real-world environments. Most practitioners still assign CRs manually. We believe that current approaches have not been adopted because of two main problems, discussed next.

Firstly, existing approaches are usually designed to be autonomous, in the sense that software analysts have no control over the approach: they cannot modify its behavior. Without such control, the approach cannot be properly calibrated. As a consequence, if its performance is not adequate, it is simply discarded.

Secondly, these approaches lack the contextual information necessary to properly assign CRs. Software development