Journal of Software Testing, Verification, and Reliability (to appear)

An Empirical Investigation of the Relationship Between Spectra Differences and Regression Faults

Mary Jean Harrold,† Gregg Rothermel,‡ Kent Sayre,‡ Rui Wu,* Liu Yi‡

† College of Computing, Georgia Institute of Technology, Atlanta, GA 30332, harrold@cc.gatech.edu
‡ Department of Computer Science, Oregon State University, Corvallis, OR 97331, {grother,ksayre,liuyi}@cs.orst.edu
* Department of Computer and Information Science, Ohio State University, Columbus, OH 43210, rwu@cis.ohio-state.edu

Abstract

Many software maintenance and testing tasks involve comparing the behaviors of program versions. Program spectra have recently been proposed as a heuristic for use in performing such comparisons. To assess the potential usefulness of spectra in this context, an experiment was conducted, examining the relationship between differences in program spectra and the exposure of regression faults (faults existing in a modified version of a program that were not present prior to modifications, or not revealed in previous testing), and empirically comparing several types of spectra. The results reveal that certain types of spectra differences correlate with high frequency (at least in one direction) with the exposure of regression faults. That is, when regression faults are revealed by particular inputs, spectra differences are likely also to be revealed by those inputs, though the reverse is not true. The results also suggest that several types of spectra that appear, analytically, to offer greater precision in predicting the presence of regression faults than other, cheaper, spectra may provide no greater precision in practice. These results have ramifications for future research on, and for the practical uses of, program spectra.

Keywords: program spectra, software testing, empirical studies

1 Introduction

Various software testing and maintenance tasks require comparisons of the behaviors of program versions.
For example, when a program is modified, regression testing is used to compare the behavior of the modified version to the behavior of its previous version, in the hope of detecting regression faults – faults existing in a modified version of a program that were not present prior to modifications, or not revealed in previous testing. Similarly, when a modified program fails, the behaviors of versions are compared in the hope of locating the cause of that failure. Tasks such as these constitute a significant percentage of the costs of software testing and maintenance, and thus, techniques that reduce the costs of these tasks are valuable.