Automatically Identifying Known Software Problems

Natwar Modani 1, Rajeev Gupta 1, Guy Lohman 2, Tanveer Syeda-Mahmood 2, Laurent Mignet 1
1 IBM India Research Lab, Block-1, IIT Delhi Campus, New Delhi, India.
2 IBM Almaden Research Center, San Jose, CA, USA
(namodani, grajeev, lamignet)@in.ibm.com; (stf, lohman)@almaden.ibm.com

Abstract

Re-occurrence of the same problem is very common in many large software products. By matching the symptoms of a new problem to those in a database of known problems, automated diagnosis and even self-healing for re-occurrences can be (partially) realized. This paper exploits function call stacks as highly structured symptoms of a certain class of problems, including crashes, hangs, and traps. We propose and evaluate algorithms for efficiently and accurately matching call stacks by a weighted metric of the similarity of their function names, after first removing redundant recursion and uninformative (poor discriminator) functions from those stacks. We also describe a new indexing scheme to speed queries to the repository of known problems, without compromising the quality of matches returned. Experiments conducted using call stacks from actual product problem reports demonstrate the improved accuracy (both precision and recall) resulting from our new stack-matching algorithms and removal of uninformative or redundant function names, as well as the performance and scalability improvements realized by indexing call stacks. We also discuss how call-stack matching can be used in both self-managing (or autonomic) systems and human “help desk” applications.

1. Introduction

One of the biggest challenges to self-healing systems is correctly diagnosing a problem based upon its externalized symptoms. However, typically half, and sometimes as much as 90 percent, of all software problems reported by users today are re-occurrences, or rediscoveries, of known problems, i.e.
those whose cause has already been ascertained or is under investigation. Such rediscoveries present a significant opportunity for automatically repairing systems by searching a database of symptoms of known problems to find the best match with the symptoms of any new problem. In general, however, it remains challenging to characterize a problem uniquely by its symptoms and to match those characterizations accurately.

Fortunately, there is a large class of software problems for which fairly structured symptoms can be used to characterize the problem, namely those that produce a function call stack among their symptoms. Systems typically generate function call stacks (we’ll use the shorter term “call stacks” henceforth) when software “crashes”, is terminated after a “hang”, or an error is “trapped” and reported by the code itself. Call stacks reconstruct the sequence of function calls leading up to the failure from the operating system’s stack of addresses, onto which an address is pushed each time a function is called and from which it is popped when the function returns. Call stacks typically contain at least the function name and the offset within the routine at which each subroutine was invoked or at which the problem occurred or was detected. This paper uses the call stack as a symptom to characterize such problems.

Clearly, if two call stacks are identical, they almost surely represent the same problem, but what if they only partially match? The function in which the problem occurred is of course the most important to match, but that function may not be at the top of the stack if the code trapped the problem and invoked some standard routines to report and/or recover from the error; such routines provide little enlightenment on the nature of the problem.
Furthermore, the path by which execution reached the offending function may alter the values of key parameters that contribute to a problem, but the farther a function is in the stack from the offending function, the less likely it is to have such an impact and the more likely it is to appear in the call stacks of other problems as well. And if the call stack contains recursive calls to the same functions, the number of recursions is rarely important for problem determination. Hence, in our matching, we want to weight matches nearer the top of the stack more heavily, after omitting redundant recursive invocations and “uninformative functions” such as common error routines and entry-level routines.

1-4244-0832-6/07/$20.00 ©2007 IEEE.
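The preprocessing and weighting ideas outlined in this introduction can be illustrated with a minimal sketch. It is not the matching algorithm evaluated in this paper; it only shows one plausible way to drop uninformative functions, collapse direct recursion, and weight matches near the top of the stack more heavily. The function names, the `UNINFORMATIVE` set, the position-by-position comparison, and the geometric `decay` weighting are all assumptions made for illustration.

```python
from typing import List, Set

# Hypothetical stop list of poor-discriminator functions
# (error reporters and entry-level routines).
UNINFORMATIVE: Set[str] = {"report_error", "main"}


def normalize(stack: List[str], uninformative: Set[str]) -> List[str]:
    """Drop uninformative frames, then collapse immediately repeated frames.

    Stacks are listed with the top (most recent) frame first. Only direct
    recursion (the same name repeated consecutively) is collapsed here;
    mutual recursion is left untouched in this simplified sketch.
    """
    result: List[str] = []
    for name in stack:
        if name in uninformative:
            continue
        if not result or result[-1] != name:
            result.append(name)
    return result


def similarity(a: List[str], b: List[str], decay: float = 0.5) -> float:
    """Position-weighted similarity in [0, 1].

    Frame i (0 = top of stack) carries weight decay**i, so agreement near
    the top counts more than agreement deeper in the stack.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    total = sum(decay ** i for i in range(longest))
    matched = sum(
        decay ** i for i in range(min(len(a), len(b))) if a[i] == b[i]
    )
    return matched / total


known = ["report_error", "parse_node", "parse_node", "parse_node",
         "run_query", "main"]
new = ["report_error", "parse_node", "run_query", "main"]

known_norm = normalize(known, UNINFORMATIVE)
new_norm = normalize(new, UNINFORMATIVE)
print(known_norm, new_norm, similarity(known_norm, new_norm))
```

In this example the two raw stacks differ in length and recursion depth, yet both normalize to the same two frames, so they compare as identical. A stack that agrees only on the top frame scores higher than one that agrees only on a deeper frame, which mirrors the intuition that functions nearer the failure are better discriminators.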