WALTy: A User Behavior Tailored Tool for Evaluating Web Application Performance

G. Ruffo, R. Schifanella, and M. Sereno
Dipartimento di Informatica, Università degli Studi di Torino
Corso Svizzera, 185 - 10149 Torino (ITALY)

R. Politi
CSP Sc.a.r.l. c/o Villa Gualino
Viale Settimio Severo, 65 - 10133 Torino (ITALY)

Abstract

In this paper we present WALTy (Web Application Load-based Testing tool), a set of tools that allows the performance analysis of web applications by means of a scalable what-if analysis on the test bed. The proposed approach is based on a workload characterization generated from information extracted from log files. The workload is generated by using Customer Behavior Model Graphs (CBMGs), which are derived from the web application log files. In this manner, the synthetic workload used to evaluate the web application under test is representative of the real traffic that the web application has to serve. One of the most common criticisms of web stressing tools is that the synthetic workload they produce is far from realistic. The use of CBMGs may help to overcome this criticism.

1 Introduction

One of the most important steps in Capacity Planning is performance prediction: the goal is to estimate performance measures of a web farm under test for a given set of parameters (e.g., response time, throughput, CPU and RAM utilization, number of disk I/O accesses, and so on). There are two approaches to predicting performance: it is possible to use a benchmarking suite to perform load and stress tests, and/or to use a performance model. A performance model predicts the performance of a web system as a function of the system description and the workload parameters. The models can be either simulative or analytical. By using these models, performance measures such as response time, throughput, disk storage and computational resource consumption can be derived.
These measures can be used to plan an adequate capacity for the web system. On the other hand, benchmarking and stressing suites are widely used in industry for testing existing architectures with expected traffic. In particular, stressing tools make what-if analysis practical in real systems, because workload intensity can be scaled to the analyst's hypotheses: a workload emulating the activity of N users with pre-defined behaviors can be replicated in order to monitor a set of performance parameters during the test. Such a workload is based on sequences of object requests and/or an analytical characterization, but it is sometimes poorly scalable by the analyst; in fact, in many stressing frameworks we can only replicate a set of manually or randomly generated sessions, losing objectivity and representativeness.

The main aim of this paper is to present a methodology based on the generation of representative traffic. To address this task, we implemented WALTy, a set of tools that uses Customer Behavior Model Graphs to characterize web user profiles and to generate virtual users' web sessions.

1.1 Related Works

The literature on workload characterization of web applications is quite vast, and therefore it is difficult to provide a comprehensive summary of previous contributions here. In this section we only summarize some of the approaches that have been successfully used so far in this field and that are most closely related to our work.

Benchmarking tools are intended to evaluate the performance of a given web server by using a well-defined and often standardized workload model, when performance objectives and workloads are measurable and repeatable on different system infrastructures. The most important examples are SpecWeb99 [3], Surge [10], TPC-W [16], WebStone [5] and WebBench [4].
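
In contrast with the fixed workloads of the benchmarking tools above, the Customer Behavior Model Graphs mentioned in the Introduction describe a user profile as a graph of states with transition probabilities, so that a virtual session can be produced as a random walk on the graph. The following is only a minimal sketch of that idea, not WALTy's implementation: the state names, the probabilities, and the functions `generate_session` and `generate_workload` are all hypothetical and are not taken from any real log file.

```python
import random

# Hypothetical CBMG: each state maps to its successors and the probability
# of moving to each of them. All names and values are illustrative only.
CBMG = {
    "entry":       {"home": 1.0},
    "home":        {"browse": 0.5, "search": 0.3, "exit": 0.2},
    "browse":      {"browse": 0.3, "search": 0.2, "add_to_cart": 0.2, "exit": 0.3},
    "search":      {"browse": 0.5, "search": 0.2, "exit": 0.3},
    "add_to_cart": {"browse": 0.3, "pay": 0.4, "exit": 0.3},
    "pay":         {"exit": 1.0},
}


def generate_session(cbmg, rng, max_steps=1000):
    """Random walk from 'entry' until 'exit'; returns the visited states."""
    session = []
    state = "entry"
    for _ in range(max_steps):
        successors = list(cbmg[state])
        weights = [cbmg[state][s] for s in successors]
        # Pick the next state according to the transition probabilities.
        state = rng.choices(successors, weights=weights, k=1)[0]
        if state == "exit":
            break
        session.append(state)
    return session


def generate_workload(cbmg, n_users, seed=0):
    """Emulate n_users virtual users, one generated session per user."""
    rng = random.Random(seed)  # fixed seed so a test run can be repeated
    return [generate_session(cbmg, rng) for _ in range(n_users)]


workload = generate_workload(CBMG, n_users=5)
for session in workload:
    print(" -> ".join(session))
```

Under these assumptions, scaling the workload for a what-if analysis amounts to changing `n_users`, while the behavior of each virtual user still follows the same graph.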
Load testing tools, instead, evaluate the performance of a specific web site on the basis of actual user behavior: they capture real user requests for later replay, where a set of test parameters can be modified. Httperf [18], developed at Hewlett-Packard, WebAppLoader [21], openSTA [2], Mercury Load Runner [1], S-Client [9] and Geist [14] are some

Proceedings of the Third IEEE International Symposium on Network Computing and Applications (NCA’04) 0-7695-2242-4/04 $20.00 IEEE