Utility-Based Evaluation of Adaptive Systems

Eelco Herder
Department of Computer Science, University of Twente
P.O. Box 217, 7500 AE, Enschede, The Netherlands
herder@cs.utwente.nl

Abstract. The variety of user-adaptive hypermedia systems available calls for methods of comparison. Layered evaluation techniques appear to be useful for this purpose. In this paper we present a utility-based evaluation approach that builds on these techniques, and we address the issues that arise when putting utility-based evaluation into practice. We also explain the need for interpretative user models and common sets of evaluation criteria for different domains.

1 Introduction

As the Internet has become a common source of information and services, the need for web sites to cater to a heterogeneous user population has increased dramatically. It has proven hard to design interfaces that match all user needs in all user contexts, which might be partially due to the lack of well-founded design guidelines [11]. Adaptive hypermedia systems try to bridge the gap between sites and individual users by building models of the goals, preferences and knowledge of each individual user, and by using these models throughout the interaction in order to adapt to the user's needs [1].

Some decades of research have provided us with a large collection of different user-adaptive systems. Unfortunately, thus far most adaptive systems have only been compared to their non-adaptive counterparts [7]. This makes it hard to compare the results reported in journals or conference proceedings, as the systems are evaluated against different criteria, for lack of well-defined or common criteria for the success of adaptive hypermedia systems [12].

Recently, a number of researchers have advocated the use of frameworks for layered evaluation of adaptive applications and services [2][7][12].
Although these frameworks are described at different levels of granularity [12], in essence they separate the evaluation process into the evaluation of the interaction assessment phase and the evaluation of the adaptation decision making phase [7]. The basic intuition behind this approach is that an unsuccessful adaptation might be due either to incorrect assessment results or to an improper adaptation decision based on a correct assessment. Layered evaluation of adaptive systems appears to be a promising approach, as shown in a case study described in [2].

In the next section we describe the limitations of user models that are constructed in the interaction phase, and how these limitations should be dealt with when evaluating adaptation decisions. In the third section we propose a utility-based approach to layered evaluation. Further, it is argued why common sets of