SUBMISSION TO IEEE TRANSACTIONS ON SOFTWARE ENGINEERING

A Multi-Study Investigation Into Dead Code

Simone Romano, Member, IEEE, Christopher Vendome, Member, IEEE Computer Society, Giuseppe Scanniello, Member, IEEE, and Denys Poshyvanyk, Member, IEEE Computer Society

Abstract—Dead code is a bad smell and it appears to be widespread in open-source and commercial software systems. Surprisingly, dead code has received very little empirical attention from the software engineering research community. In this paper, we present a multi-study investigation with an overarching goal to study, from the perspective of researchers and developers, when and why developers introduce dead code, how they perceive and cope with it, and whether dead code is harmful. To this end, we conducted semi-structured interviews with software professionals and four experiments at the University of Basilicata and the College of William & Mary. The results suggest that it is worth studying dead code not only in the maintenance and evolution phases, where our results suggest that dead code is harmful, but also in the design and implementation phases. Our results motivate future work to develop techniques for detecting and removing dead code and suggest that developers should avoid this smell.

Index Terms—Dead Code, Unreachable Code, Unused Code, Bad Smell, Empirical investigation, Multi-study.

1 INTRODUCTION

In software engineering, dead code is unnecessary source code, because it is unused and/or unreachable (i.e., never executed) [1], [2]. The problem with dead code is that after a while it starts to “smell bad.” The older it is, the stronger and more sour the odor becomes [3]. This is because keeping dead code around could be harmful [1], [4], [5]. For example, Mäntylä et al.
[1] stated that dead code hinders the comprehension of source code and makes its structure less obvious, while Fard and Mesbah [4] asserted that dead code affects software maintainability, because it makes source code more difficult to understand. In addition, developers could waste time maintaining dead code [5].

Dead code seems to be quite common too [5], [6], [7], [8]. For example, Brown et al. [6] reported that, during the code examination of an industrial software system, they found a large amount of source code (between 30 and 50 percent of the total) that was not understood or documented by any developer currently working on it. Later, they learned that this was dead code. Boomsma et al. [7] reported that on a subsystem of an industrial web system written in PHP, the developers removed 2,740 dead files, namely about 30% of the subsystem’s files. Eder et al. [5] studied an industrial software system written in .NET in order to investigate how much maintenance involved dead code. They found that 25% of all method genealogies¹ were dead. Romano et al. [8] focused on dead methods in desktop applications written in Java. They reported that the percentage of dead methods in these applications ranged between 5% and 10%.

• S. Romano and G. Scanniello are with University of Basilicata, Potenza (PZ), Italy. E-mail: simone.romano@unibas.it and giuseppe.scanniello@unibas.it
• C. Vendome and D. Poshyvanyk are with The College of William and Mary, Williamsburg, VA, USA. E-mail: cvendome@cs.wm.edu and denys@cs.wm.edu

¹ A method genealogy is the list of methods that represent the evolution of a single method over different versions of a software system [5].

Furthermore, Yamashita and Moonen [9] reported that dead code detection is one of the features software professionals would like to have in their supporting tools.
Although there is some consensus that dead code is a common phenomenon [5], [6], [7], [8], that it could be harmful [1], [4], [5], and that it seems to matter to software professionals [9], dead code has, surprisingly, received very little empirical attention from the software engineering research community.

In this paper, we present a multi-study investigation whose goals are to understand when and why developers introduce dead code, how they perceive and cope with it, and whether dead code is harmful. To this end, we conducted semi-structured interviews with software professionals and four experiments with students (a few of them had professional experience) from the University of Basilicata (Italy) and the College of William & Mary (USA). Our results demonstrate that it is worth studying dead code not only in the maintenance and evolution phases, where our results indicate that dead code is harmful, but also in the design and implementation stages. Our empirical results motivate future work on this topic to develop techniques for detecting and removing dead code.

Paper structure. In Section 2, background information is provided and related work is discussed. In Section 3, we motivate our multi-study investigation into dead code, which is then described in Section 4. The semi-structured interviews are presented in Section 5, while the results from these interviews and the threats that could affect their validity are presented in Section 6. Similarly, we first introduce the experiments in Section 7, and then present the obtained results and the threats to validity in Section 8. The overall discussion of the results is presented in Section 9. Final remarks conclude the paper.

2 BACKGROUND AND RELATED WORK

2.1 Background

Bad smells (“smells” for short) are symptoms of poor design and implementation choices [10]. Fowler [11] defined 22