NottReal: A tool for voice-based Wizard of Oz studies

Martin Porcheron
martin.porcheron@nottingham.ac.uk
Mixed Reality Lab, School of Computer Science, University of Nottingham, UK

Joel E. Fischer
joel.fischer@nottingham.ac.uk
Mixed Reality Lab, School of Computer Science, University of Nottingham, UK

Michel Valstar
michel.valstar@nottingham.ac.uk
Computer Vision/Mixed Reality Lab, School of Computer Science, University of Nottingham, UK

ABSTRACT
We present NottReal, an application designed for simulating Voice User Interfaces (VUIs) in Wizard of Oz studies. We briefly discuss the premise and advantages of the Wizard of Oz method before introducing the design of the application, which we have iteratively developed and refined through a number of studies.

CCS CONCEPTS
• Human-centered computing → Systems and tools for interaction design; Natural language interfaces.

KEYWORDS
conversational interfaces, voice interfaces, vuis, woz

ACM Reference Format:
Martin Porcheron, Joel E. Fischer, and Michel Valstar. 2020. NottReal: A tool for voice-based Wizard of Oz studies. In 2nd Conference on Conversational User Interfaces (CUI ’20), July 22–24, 2020, Bilbao, Spain. ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/3405755.3406168

1 THE WIZARD OF OZ METHOD
The recent growth in popularity of Voice User Interfaces (VUIs), from smartphone assistants (e.g. Siri) through to smart speakers (e.g. Amazon Echo), has led to a resurgence of research in the HCI (e.g. [10, 16]) and CSCW (e.g. [12, 15, 17]) communities that examines the design and use of novel technologies such as ‘natural language’ interfaces. Implementing these sorts of technologies can be a complex, lengthy, and costly endeavour, involving a host of computational techniques including lookup [13], gestural/spatial recognition [7], robot control [18], mixed reality techniques [4], machine learning [3], or natural language processing [8].
Thus, when it comes to prototyping ideas or conducting research with these interfaces, the Wizard of Oz method or ‘experiment’ (often abbreviated to WOz or WoZ) is frequently used as part of the development process [13]. The method prescribes that, rather than actually implementing all the elements of a digital system, the ‘intelligence’ of a machine can be performed by a human operator concealed from the participant, who is led to believe that the system or machine itself is ‘intelligent’ [5]. The method was originally referred to as “experimenter in the loop” [6, pp. 1–2] or given the epithet of “The Perfect System” [11, p. 843]. The theatrical name of Wizard of Oz, perhaps unsurprisingly, stems from the novel The Wonderful Wizard of Oz [2]. The story revolves around the characters’ journey to meet a supposedly wonderful wizard who is later revealed to be a sham. The Wizard is, in fact, an ‘ordinary’ human behind a curtain controlling a machine. In other words, the wonderful wizard is an orchestrated illusion, and this is what inspired the method’s name.

© 2020 Copyright held by the owner/author(s). This is the author’s version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in 2nd Conference on Conversational User Interfaces (CUI ’20), July 22–24, 2020, Bilbao, Spain, https://doi.org/10.1145/3405755.3406168.

The Wizard of Oz method offers designers and researchers the ability to develop medium-fidelity prototypes [13] that can be used for the exploration and testing of ideas as part of iterative design processes and research. Such prototypes feed into a number of different forms of analysis, from qualitative Conversation Analysis [19] through to quantitative task performance analysis [1, 9], as well as into product development.
However, these studies require a flexible tool that enables the live performance of the VUI simulation, so that the Wizard can respond effectively to participants.

2 NOTTREAL
NottReal is a cross-platform Python-based desktop application for Wizard-controlled voice interface studies, where the intent detection and slot filling of typical natural language interfaces [14] is completed by a human operator.

The primary window (see Figure 1a) consists of a number of controls for simulating a VUI. Through a number of internal research studies and our experience of the design of VUIs, we have progressively refined the application for quick operation during a study. The controls include: 1) tabbed lists of pre-scripted messages, 2) entry for custom messages, 3) currently queued messages, 4) previously sent messages, 5) previously filled slots, and 6) options to log events and send messages with a loading animation. Additional options are also shown; these mostly originate from features that are enabled through command-line arguments. We iterate through this list in more detail below, explaining how each feature works in practice.

2.1 Main features and interaction design
We now work through these features and the basis for their design.

2.1.1 Message queue and delivery. NottReal, by default, queues messages to send to the participant and blocks queue processing while a message is being ‘delivered’ to participants, e.g. through a text-to-speech (TTS) engine. This allows multiple messages to be queued up, with the intent that they are delivered sequentially. This has been useful for us in situations where a long piece of dialogue is delivered to the participant in chunks.

For delivering these messages, NottReal supports various TTS engines including CereVoice (https://www.cereproc.com/en/products/sdk; including support for CereVoice’s non-verbal features and spurts), macOS’s speech library (https://ss64.com/osx/say.html), engines