IEEE TRANSACTIONS ON SOFTWARE ENGINEERING

Quantifying the Effect of Code Smells on Maintenance Effort

Dag I.K. Sjøberg, Member, IEEE, Aiko Yamashita, Student Member, IEEE, Bente Anda, Audris Mockus, Member, IEEE and Tore Dybå, Member, IEEE

Abstract—Context: Code smells are assumed to indicate bad design that leads to less maintainable code. However, this assumption has not been investigated in controlled studies with professional software developers. Aim: This paper investigates the relationship between code smells and maintenance effort. Method: Six developers were hired to perform three maintenance tasks each on four functionally equivalent Java systems originally implemented by different companies. Each developer spent three to four weeks. In total, they modified 298 Java files in the four systems. An Eclipse IDE plug-in measured the exact amount of time a developer spent maintaining each file. Regression analysis was used to explain the effort using file properties, including the number of smells. Results: None of the 12 investigated smells was significantly associated with increased effort after we adjusted for file size and the number of changes; Refused Bequest was significantly associated with decreased effort. File size and the number of changes explained almost all of the modeled variation in effort. Conclusion: The effects of the 12 smells on maintenance effort were limited. To reduce maintenance effort, a focus on reducing code size and the work practices that limit the number of changes may be more beneficial than refactoring code smells.

Index Terms—Maintainability, object-oriented design, product metrics, code churn

1 INTRODUCTION

A major challenge in the modern information society is ensuring the maintainability of increasingly large and complex software systems.
Many measures have been proposed to predict software maintainability [37], but the empirical quantification linking maintainability and measurable attributes of software, such as code smells, remains elusive. The concept of code smell was introduced as an indicator of problems within the design of software [16]. Detection of code smells has become an established method to indicate software design issues that may cause problems for further development and maintenance [16], [24], [31]. Consequently, the consensus is that code with smells should be refactored to prevent or reduce such problems [29]. However, refactoring entails both costs and risks. Thus, empirical evidence quantifying the relationship between code smells and software maintenance effort is needed to weigh the risks and benefits.

A recent systematic review [46] found only five studies that investigated the impact of code smells on maintenance. Most of the studies on code smells that were identified in the review focused on tools and methods used to detect such smells automatically. In this article, we extend that review by considering a longer time span and more sources. Overall, the results from these studies are inconclusive; little evidence exists for the extent to which and under what circumstances various code smells are harmful. Furthermore, we are unaware of any controlled in vivo studies with professional developers on the effect of code smells on maintenance effort. Therefore, we conducted a controlled study to quantify the relationship between code smells and maintenance effort in an industrial setting with professional developers.
Our particular research question focused on the extent to which the following 12 code smells affect the maintenance effort: Data Class, Data Clump, “Duplicated code in conditional branches”, Feature Envy, God Class, God Method, Interface Segregation Principle (ISP) Violation, Misplaced Class, Refused Bequest, Shotgun Surgery, “Temporary variable used for several purposes” and “Implementation used instead of interface”. These smells are described briefly in Table 9 of the Appendix. A detailed description of most of these smells can be found in [8] and [16].

This study was conducted on four different but functionally equivalent (with the same requirements specifications) web-based information systems originally implemented (primarily in Java) by different contractors [3]. A study on the maintainability of these four systems compared structural measures and expert assessments [2] before the systems became operational. The four systems were operated in parallel once they were completed. The internal and external users were automatically assigned to one of the systems. Every time a particular user logged in, he or she was given access to the same system based on the IP address of the user's

• Dag I.K. Sjøberg is with the Department of Informatics, University of Oslo, PO Box 1080 Blindern, NO-0316, Oslo, Norway. E-mail: dagsj@ifi.uio.no.
• Aiko Yamashita is with the Department of Informatics, University of Oslo, Norway. E-mail: aiko@simula.no.
• B.C.D. Anda is with the Department of Informatics, University of Oslo, Norway. E-mail: bentea@ifi.uio.no.
• A. Mockus is with Avaya Labs Research, Basking Ridge, NJ 07920. E-mail: audris@avaya.com.
• Tore Dybå is with the Department of Informatics, University of Oslo and SINTEF, Norway. E-mail: tore.dyba@sintef.no.
Digital Object Identifier 10.1109/TSE.2012.89
0098-5589/12/$31.00 © 2012 IEEE
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.