M. Kurosu (Ed.): Human Centered Design, HCII 2009, LNCS 5619, pp. 110–119, 2009.
© Springer-Verlag Berlin Heidelberg 2009

Defining Expected Behavior for Usability Testing

Stefan Propp and Peter Forbrig
University of Rostock, Institute of Computer Science,
Albert Einstein Str. 21, 18059 Rostock, Germany
{stefan.propp,peter.forbrig}@uni-rostock.de

Abstract. Within HCI, task models are widely used for the development and evaluation of interactive systems. Current evaluation approaches provide support for capturing performed tasks and for analyzing them in comparison to a usability expert's captured behavior. Analyzing this amount of data works fine for the evaluation of smaller systems, but becomes cumbersome and time-consuming for larger systems. Our method aims at making the implicitly existing expectations of a usability expert explicit, to pave the way for automatically identifying candidates for usability issues. We have enhanced a CTT-like task modeling notation with a language to express the expected behavior of test users. We present tool support to graphically compose expectations and to integrate them into the usability evaluation process.

Keywords: Usability Evaluation, Task Models.

1 Introduction

Task models are widely used within the domain of Human-Computer Interaction. For eliciting requirements, task models describe the progress of task execution to accomplish a certain goal. Subsequent development stages apply task models as initial artifacts for the model-based development of user interfaces [9]. Several approaches further exploit task models for usability evaluation. Examples are RemUSINE [6], ReModEl [1] and the task-based timeline visualization [4]. These evaluation techniques capture the observed user interactions on a lower level of abstraction (e.g. mouse clicks or sensor values of user movement), which can be easily captured, but the vast amount of data is difficult to interpret.
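To illustrate the gap between easily captured events and interpretable results, the following Python sketch groups raw interaction events into task occurrences. The event names, widget identifiers and the mapping are invented for illustration; the paper itself does not prescribe a concrete mapping mechanism.

```python
# Hypothetical sketch: raw interaction events are easy to capture,
# but only a mapping to tasks makes them interpretable.
RAW_EVENTS = [
    ("click", "btnFile"), ("click", "mnuPrint"),
    ("click", "btnOk"), ("keypress", "Ctrl+S"),
]

# Invented mapping from low-level widgets/keys to task names.
EVENT_TO_TASK = {
    "btnFile": "OpenFileMenu",
    "mnuPrint": "PrintDocument",
    "btnOk": "ConfirmDialog",
    "Ctrl+S": "SaveDocument",
}

def lift(events):
    """Lift a low-level event sequence to a task-based abstraction."""
    return [EVENT_TO_TASK[target] for _, target in events]

print(lift(RAW_EVENTS))
# ['OpenFileMenu', 'PrintDocument', 'ConfirmDialog', 'SaveDocument']
```

In practice such a mapping is derived from the task model itself rather than hand-written, but the sketch shows why the lifted trace is far easier to interpret than the raw event stream.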
Subsequently, the sequence of captured events is lifted from an interaction level (e.g. a button click) to a task-based abstraction (e.g. printing a document) [8], which allows interpreting the results in a more natural way. Hence a usability expert can conveniently compare the observed behavior of test users with his/her expectation of efficient task performance to reach the goal of the test case. Deviations indicate candidates for usability issues. This comparison is carried out as a manual process with tool support for visualizing a task trace, but it lacks an integration of machine-readable expectations.

The approach in [6] goes a step in this direction. It offers a comparison between two task traces: the observed trace and an "ideal path". A designer specifies this path and the degree of deviation can be visualized. However, no opportunity is discussed to generalize the expectation to cover different expected traces. For instance
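The cited approach does not prescribe a metric for the degree of deviation; as one plausible illustration, a Levenshtein (edit) distance over task traces counts how many tasks would have to be inserted, deleted or substituted to turn the observed trace into the ideal path. The task names below are invented.

```python
def deviation(observed, ideal):
    """Levenshtein distance between two task traces: the number of
    insertions, deletions and substitutions needed to transform the
    observed trace into the ideal one (0 = traces match exactly)."""
    m, n = len(observed), len(ideal)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i                      # delete all observed tasks
    for j in range(n + 1):
        d[0][j] = j                      # insert all ideal tasks
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if observed[i - 1] == ideal[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[m][n]

ideal = ["OpenDocument", "EditText", "PrintDocument"]
observed = ["OpenDocument", "OpenHelp", "EditText", "PrintDocument"]
print(deviation(observed, ideal))  # 1: one superfluous task performed
```

A single scalar like this can flag candidate usability issues automatically, but, as the text notes, it cannot by itself express that several different traces may all be acceptable; that is the gap the paper's expectation language addresses.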