Exploring a text-to-speech feature by describing learning experience,
enjoyment, learning styles, and values – A basis for future studies
Margit Kastner
Department of Marketing
Vienna University of Economics and Business
margit.kastner@wu.ac.at

Brigitte Stangl
School of Hospitality and Tourism Management
University of Surrey
b.stangl@surrey.ac.uk
Abstract
Speech is the most natural form of face-to-face
communication. Owing to more sophisticated information
systems and advanced educational requirements,
speech is also gaining importance in human-computer
interaction. The present study investigates a
text-to-speech (TTS) feature in a learning context. Based
on 252 questionnaires, the positive and negative
experiences of TTS learners are described.
Additionally, descriptive insights for enjoyment factors
are provided and differences between German and
English texts are shown. Furthermore, the preferences of
different learning styles and the values conveyed by
TTS features are explored. Findings provide a starting
point for more specific future studies through insights
into TTS evaluation in a learning context. Based on
the positive and negative experiences, 13 dimensions
relevant for a performance measurement scale are
suggested. The results show that, among other materials,
theoretical texts and exercises are appreciated as TTS
content, especially by aural learners, for instance to
enable language learning on the go.
1. Introduction
Since the advent of e-learning systems, technology
has changed dramatically. The first generation
supported educators in distributing printed materials
[60] and students in studying at their preferred time,
pace, and place [8]. The next generations eliminated
the disadvantages of limited interaction [40] by
implementing communication features such as
audio- or video-conferencing [60]. Nowadays, many
different features are available to enrich the learning
experience [32]. Systems have progressed to
personalized ones based on learner preferences,
previous knowledge, or different learning styles [6,
26]. Thanks to technological advances, aural, visual,
and kinesthetic learners can also be supported by an
e-learning system providing different features. One of
these features is text-to-speech (TTS), a technology that
converts anything written on the screen into spoken
words [1]. The importance of TTS is also increasing
because, according to the “Accessibility Market and
Stakeholder Analysis”, there are 1,910,000 visually
impaired people in Europe [7], and this number is
rising due to the aging population [46]. In order to
facilitate lifelong learning for visually impaired and
aging people as well as to support aural learners,
improvements to TTS and research exploring the potential
didactic use of this technology [20] are imperative.
The aim of this project is a user-based evaluation of a
TTS feature in the learning context of social sciences.
We intend to provide broad insights into the topic,
which are expected to serve as a starting point for
several more in-depth studies in the future. More
precisely, the following aspects are examined and
reported in a descriptive way: i) experiences with a
TTS feature, ii) differences in the perception of a
TTS feature depending on whether people listen to a
German or an English text, iii) appropriate TTS-based
learning material and learning enhancement areas, iv)
differences with respect to the preferred learning style
of a person, and v) benefits and values delivered by a
TTS feature. We want to highlight that these findings will
allow us to compile a comprehensive list of dimensions
that will be essential to the development of a TTS
performance-measurement scale in follow-up research. To
provide a valuable starting point for future studies,
descriptive results are visualized using various modes of
presentation.
2. Theoretical Background
2.1. Text-to-speech
The most important and natural mode of
communication in human-human interaction is speech
[59]. However, in human-computer interaction, the most
popular mode of communication
still relies on the keyboard and mouse as input devices;
2013 46th Hawaii International Conference on System Sciences
1530-1605/12 $26.00 © 2012 IEEE
DOI 10.1109/HICSS.2013.214