NottReal: A tool for voice-based Wizard of Oz studies
Martin Porcheron
martin.porcheron@nottingham.ac.uk
Mixed Reality Lab
School of Computer Science
University of Nottingham, UK
Joel E. Fischer
joel.fischer@nottingham.ac.uk
Mixed Reality Lab
School of Computer Science
University of Nottingham, UK
Michel Valstar
michel.valstar@nottingham.ac.uk
Computer Vision/Mixed Reality Lab
School of Computer Science
University of Nottingham, UK
ABSTRACT
We present NottReal, an application designed for simulating Voice
User Interfaces (VUIs) in Wizard of Oz studies. We briefly discuss
the premise and advantages of the Wizard of Oz method before
introducing the design of the application, which we
have iteratively developed and refined through a number of studies.
CCS CONCEPTS
· Human-centered computing → Systems and tools for interaction
design; Natural language interfaces.
KEYWORDS
conversational interfaces, voice interfaces, vuis, woz
ACM Reference Format:
Martin Porcheron, Joel E. Fischer, and Michel Valstar. 2020. NottReal: A tool
for voice-based Wizard of Oz studies. In 2nd Conference on Conversational
User Interfaces (CUI ’20), July 22–24, 2020, Bilbao, Spain. ACM, New York,
NY, USA, 3 pages. https://doi.org/10.1145/3405755.3406168
1 THE WIZARD OF OZ METHOD
The recent growth in popularity of Voice User Interfaces (VUIs),
from smartphone assistants (e.g. Siri) through to smart speakers
(e.g. Amazon Echo), has led to a resurgence of research in
the HCI (e.g. [10, 16]) and CSCW (e.g. [12, 15, 17]) communities that
examines the design and use of novel technologies such as ‘natural
language’ interfaces. Implementing these sorts of technologies can
be a complex, lengthy, and costly endeavour, involving a host of
computational techniques including lookup [13], gestural/spatial
recognition [7], robot control [18], mixed reality techniques [4],
machine learning [3], or natural language processing [8]. Thus,
when it comes to prototyping ideas or conducting research with
these interfaces, the Wizard of Oz method or ‘experiment’ (often
abbreviated to WOz or WoZ) is commonly used as part of the
development process [13]. The method prescribes that rather than
actually implementing all the elements of a digital system, the
‘intelligence’ of a machine can be performed by a human operator
concealed from the participant, who is led to believe that the system
or machine itself is ‘intelligent’ [5].
The method was originally referred to as “experimenter in the
loop” [6, pp. 1–2] or given the epithet of “The Perfect System” [11,
p. 843]. The theatrical name of Wizard of Oz, perhaps unsurprisingly,
stems from the novel The Wonderful Wizard of Oz [2]. The
story revolves around the characters’ journey to meet a supposedly
wonderful wizard who is later revealed to be a sham. The Wizard is,
in fact, an ‘ordinary’ human behind a curtain controlling a machine.
In other words, the wonderful wizard is an orchestrated illusion,
and this is what inspired the method’s name.
© 2020 Copyright held by the owner/author(s).
This is the author’s version of the work. It is posted here for your personal use. Not
for redistribution. The definitive Version of Record was published in 2nd Conference
on Conversational User Interfaces (CUI ’20), July 22–24, 2020, Bilbao, Spain,
https://doi.org/10.1145/3405755.3406168.
The Wizard of Oz method proffers designers and researchers
the ability to develop medium-fidelity prototypes [13] that can be
used for the exploration and testing of ideas as part of iterative
design processes and research. It feeds into a number of different
forms of analysis, from qualitative Conversation Analysis [19]
through to quantitative task performance analysis [1, 9], as well
as product development. However, these studies require a flexible
tool that enables a live performance of the VUI simulation that
responds effectively to participants.
2 NOTTREAL
NottReal is a cross-platform Python-based desktop application for
Wizard-controlled voice interface studies, where the intent detection
and slot filling of typical natural language interfaces [14] are
completed by a human operator.
The primary window (see Figure 1a) consists of a number of
controls for simulating a VUI. Through a number of internal research
studies and our experience of the design of VUIs, we have
progressively refined the application for quick operation during a
study. The controls include: 1) tabbed lists of pre-scripted messages,
2) entry for custom messages, 3) currently queued messages, 4)
previously sent messages, 5) previously filled slots, and 6) options
to log events and send messages with a loading animation. Additional
options are also shown; these mostly originate from features
which are enabled through command-line arguments. We iterate
through this list in more detail below, explaining how each feature
works in practice.
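To make the relationship between pre-scripted messages and filled slots concrete, the following is a minimal Python sketch of how a Wizard-supplied value could be substituted into a scripted message. It is an illustration only and not NottReal's implementation: the `[name]` slot syntax, the function names `slots_in` and `fill_slots`, and the example message are all assumptions made for this sketch.

```python
import re

# Hypothetical slot syntax for this sketch: a slot is written as [name].
SLOT_PATTERN = re.compile(r"\[(\w+)\]")


def slots_in(template):
    """Return the slot names in a scripted message, in order of appearance."""
    return SLOT_PATTERN.findall(template)


def fill_slots(template, values):
    """Replace each [slot] with the Wizard's value; fail if a value is missing."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for slot '{name}'")
        return values[name]

    return SLOT_PATTERN.sub(substitute, template)


if __name__ == "__main__":
    scripted = "The next train to [destination] leaves at [time]."
    print(slots_in(scripted))
    # Values the Wizard typed after listening to the participant.
    print(fill_slots(scripted, {"destination": "Bilbao", "time": "09:15"}))
```

In this sketch the Wizard performs the intent detection by choosing which scripted message to send, and performs the slot filling by typing the values; the code only does the final substitution.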
2.1 Main features and interaction design
We now work through these features and the basis for their design.
2.1.1 Message queue and delivery. NottReal, by default, queues
messages to send to the participant and blocks queue processing
while a message is being ‘delivered’ to participants, e.g. through a
text-to-speech (TTS) engine. This allows multiple messages to
be queued up and delivered sequentially. This has been useful for
us in situations where a long piece of dialogue is delivered to the
participant in chunks.
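The queueing behaviour described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than NottReal's actual code: the `MessageQueue` class, its method names, and the `fake_tts` stand-in for a blocking TTS call are all hypothetical.

```python
import queue
import threading
import time


class MessageQueue:
    """Deliver queued messages one at a time, in the order they were queued."""

    def __init__(self, deliver):
        # `deliver` is any blocking callable, e.g. a call into a TTS engine.
        self._deliver = deliver
        self._pending = queue.Queue()
        self.delivered = []
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def enqueue(self, text):
        """Add a message; returns immediately so the Wizard can keep working."""
        self._pending.put(text)

    def _run(self):
        while True:
            text = self._pending.get()
            try:
                # The next message is not taken from the queue until this
                # call returns, so deliveries never overlap.
                self._deliver(text)
                self.delivered.append(text)
            finally:
                self._pending.task_done()

    def wait_until_empty(self):
        """Block until every queued message has been delivered."""
        self._pending.join()


def fake_tts(text):
    """Stand-in for a TTS engine: 'speaks' for a time based on word count."""
    time.sleep(0.01 * len(text.split()))


if __name__ == "__main__":
    mq = MessageQueue(fake_tts)
    for chunk in ["Here is the first part.", "Here is the second part."]:
        mq.enqueue(chunk)
    mq.wait_until_empty()
    print(mq.delivered)
```

A single worker thread is used here so that the interface stays responsive while a message is being spoken; other designs, such as an event loop with completion callbacks, would give the same sequential behaviour.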
For delivering these messages, NottReal supports various TTS
engines including CereVoice (https://www.cereproc.com/en/products/sdk;
including support for CereVoice’s non-verbal features and spurts),
macOS’s speech library (https://ss64.com/osx/say.html), engines