The Design of Perceptual Representations for Practical Networked Multimodal Virtual Training Environments

Matthew Hutchins, Matt Adcock, Duncan Stevenson, Chris Gunn and Alexander Krumpholz
CSIRO ICT Centre
GPO Box 664 Canberra ACT 2601, Australia
Matthew.Hutchins@csiro.au

Abstract

This paper is a discussion of our experience in designing perceptual representations for virtual environments for surgical training. Networked virtual environments can be used for training and education, and indeed can have some benefits over hands-on training in the real world. In designing content for such an environment there is a tension between including realistic models of real world objects and choosing other types of representations. We present AFRADERVITE, a framework for describing representations in virtual training environments. The framework identifies which representations are needed, and classifies them according to directness. The paper presents examples of representations taken from surgical training systems for cholecystectomy and temporal bone drilling.

1 Introduction

1.1 Networked multimodal virtual environments for communication

The term “virtual reality” has many definitions, and for the purposes of this paper we will take the point of view that virtual reality (VR) is essentially a communication tool (Riva 1999; Steuer 1993). In particular, we are concerned with the design of networked virtual environments that act as enhanced communication channels between two (or more) participants. That is, the virtual environment is a conceptual place where people go to communicate. They communicate “in” and through the virtual environment. We assume that the virtual environment is populated with computer-generated objects and events that serve to enrich the communication experience of the participants.
The objects and events in the environment, and the participants themselves, are represented in such a way as to be perceived by the participants using one or more direct senses, typically vision, hearing and touch, and higher order constructions such as read language and interpreted speech. Communication and action within the environment could use any of several modes, including selection, manipulation, gesture, expression, speech, non-speech sound, written (or typed) language, and drawing.

We will not be too concerned with the boundary between real and virtual in the environment, and later discuss the inclusion of real or actual objects, events and experiences within a virtual environment. Milgram and Kishino (1994) have introduced the notion of the virtual reality continuum, ranging from real environments, through augmented reality and augmented virtuality (virtual environments augmented with some real objects), to completely virtual environments. Our discussion will assume environments somewhere along this continuum, but with at least some virtual content.

1.2 Virtual training environments

Practical virtual environments are, typically, designed spaces that have a specific purpose. One of the most compelling purposes for this technology is education and training (Bricken 1991). If we compare training in virtual reality with hands-on training in the real world, we can identify several advantages to training in VR.

• Safety. In training situations where mistakes could lead to danger to the trainee or instructor, consumers, bystanders, equipment or the environment, mistakes cannot easily be tolerated. However, making mistakes and learning from them is an important part of training. VR training provides the trainee with the freedom to fail safely.

• Economy and accessibility. VR training allows repetition without the consumption of physical resources.
Also, in situations where training requires access to rare or valuable facilities, VR training could lead to less downtime and higher accessibility for students (assuming the VR environment itself is not a rare and inaccessible resource).