Synthetic and Natural Speech Intelligibility in Individuals with Visual Impairments:
Effects of Experience and Presentation Rate
Marialena BAROUTI a, Konstantinos PAPADOPOULOS a,1 and Georgios KOUROUPETROGLOU b
a Department of Educational and Social Policy, University of Macedonia, Thessaloniki, Greece
b Department of Informatics and Telecommunications, National and Kapodistrian University of Athens, Greece
Abstract. The present study compares the intelligibility of words produced in natural
and synthetic speech at both normal and fast speaking rates. The effect of several
individual parameters on the intelligibility of synthetic speech was also investigated. Thirty
adults with visual impairments took part in the research. The experimental design consisted
of two parts: a structured interview and a psychoacoustic test. The interviews focused
on the participants’ demographic data and their experience in using synthetic speech. The
results indicated that participants recognized words more accurately at a normal speaking
rate than at a fast rate. Moreover, the differences between synthetic and natural speech
were statistically significant.
Keywords. Visual impairments, speech intelligibility, synthetic speech.
Introduction
Synthetic speech perception is usually discussed in the literature with regard to
intelligibility and comprehension [1]. Intelligibility is the listener’s ability to recognize
phonemes and words presented in isolation [2], whereas comprehension involves the
extraction of the underlying meaning from the acoustic signals of speech [3].
A large body of research has examined the intelligibility of synthetic speech
produced by Text-to-Speech (TtS) systems as experienced by non-disabled
people. These studies showed that the intelligibility of natural speech is
significantly greater than that of synthetic speech produced by TtS systems [4]. Nevertheless, limited
research is available on the perception of synthetic speech by individuals who have
visual impairments [5].
Stevens et al. [6] found that the gender of the voice and the quality of the signal in
TtS synthesis affect intelligibility. Moreover, previous studies indicate that
1 Corresponding Author: Konstantinos Papadopoulos, University of Macedonia, Department of Educational
and Social Policy, 156 Egnatias st., GR-54006 Thessaloniki, Greece. Tel.: +30 2310891403; Fax: +30
2310891388; E-mail: kpapado@uom.gr
Assistive Technology: From Research to Practice
P. Encarnação et al. (Eds.)
IOS Press, 2013
© 2013 The authors and IOS Press. All rights reserved.
doi:10.3233/978-1-61499-304-9-695