Original Article
INTRODUCTION
The slow time-intensity modulation of the speech envelope, defined as fluctuations in
overall amplitude at rates between about 2 Hz and 50 Hz, can convey important linguistic
information, such as manner of articulation, the presence of voicing, and some prosodic
information [1]. These temporal envelope cues are important to speech perception
because they are believed to be essentially the only information available to people
with severe or profound sensorineural hearing loss and to cochlear implant (CI) users
[2-5]. If listeners with severe or profound hearing loss
can only utilize limited fine spectral and temporal information [6], temporal envelope
Purpose: The goal of the present study was to investigate the effect of temporal envelope cues on consonant confusions.
Methods: The temporal envelope was extracted from each of 16 consonant-vowel (CV) sounds using 26 psychoacoustically based critical auditory bands. Temporal smearing of these processed signals was produced by applying a low-pass filter (LPF) with one of five cutoff frequencies. Confusion matrices were measured in normal-hearing listeners as a function of signal-to-noise ratio (SNR).
Results: Temporal envelope information processed with the critical auditory bands provided much poorer consonant cues than information processed with fewer, wider auditory bands. The error rate for consonant perception decreased as the amount of temporal modulation increased across SNRs, and SNR carried more weight than the LPF cutoff. The results also revealed three groups of sounds: four CVs were the most difficult, seven CVs were the easiest, and five CVs were the most influenced by the LPF cutoff. The confusion patterns were similar between the unprocessed CVs and the temporally processed CVs. Duration contributed the most to consonant perception, and affrication contributed the least.
Conclusions: Consonant perception is strongly affected when the LPF cutoff is lower than 8 Hz. Confusion patterns are similar between the natural consonants and the temporally processed consonants, even though the overall error rate is higher with the temporal envelope cues. The results of the current study could provide control data for the many cochlear implant studies that used acoustic simulations with a vocoder.
Keywords: Temporal envelopes, Confusion patterns, Consonant perception, Acoustic simulations
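The processing summarized in the Methods (envelope extraction followed by low-pass smoothing) can be illustrated with a minimal single-band sketch. It is not the authors' implementation: the study used 26 critical bands and does not specify the filter type here, so the full-wave rectifier, the first-order filter, the test tone, and the two cutoff values below are all assumptions chosen for illustration.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, fs):
    """First-order IIR low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y = 0.0
    out = []
    for v in samples:
        y += a * (v - y)
        out.append(y)
    return out


def smoothed_envelope(signal, cutoff_hz, fs):
    """Full-wave rectify, then low-pass filter (assumed envelope extractor)."""
    rectified = [abs(v) for v in signal]
    return one_pole_lowpass(rectified, cutoff_hz, fs)


def modulation_depth(envelope, skip):
    """(max - min) / (max + min), ignoring the first `skip` samples (filter transient)."""
    tail = envelope[skip:]
    hi, lo = max(tail), min(tail)
    return (hi - lo) / (hi + lo)


fs = 8000          # sampling rate in Hz (hypothetical)
n = 2 * fs         # two seconds of signal

# Hypothetical test signal: 1 kHz carrier with 16 Hz amplitude modulation.
signal = [
    (1.0 + 0.8 * math.sin(2 * math.pi * 16 * t / fs))
    * math.sin(2 * math.pi * 1000 * t / fs)
    for t in range(n)
]

# Illustrative cutoffs only; they are not claimed to be the study's five values.
depth_wide = modulation_depth(smoothed_envelope(signal, 64.0, fs), skip=fs)
depth_narrow = modulation_depth(smoothed_envelope(signal, 2.0, fs), skip=fs)

print(f"modulation depth, 64 Hz cutoff: {depth_wide:.2f}")
print(f"modulation depth,  2 Hz cutoff: {depth_narrow:.2f}")
```

Lowering the cutoff below the modulation rate flattens the envelope, which is the temporal smearing that the LPF conditions are meant to produce. A multi-band version would apply the same two steps to the output of each analysis band.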
© 2019 The Korean Association of Speech-Language Pathologists
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Received: April 10, 2019
Revision: July 12, 2019
Accepted: July 24, 2019
Correspondence:
Yang-Soo Yoon
Department of Communication Sciences
and Disorders, Baylor University, One
Bear Place #97332, Waco, TX 76798,
USA
Tel: +1-254-710-6364
E-mail: yang-soo_yoon@baylor.edu
Clinical Archives of Communication Disorders / Vol. 4, No. 2:113-127 / August 2019
http://e-cacd.org/ eISSN: 2508-5948
Perceptual confusions for temporally smoothed envelope of consonants in normal hearing listeners
Yang-Soo Yoon¹, David M. Gooler², Jaesook Gho³
¹Department of Communication Sciences and Disorders, Baylor University, TX; ²Department of Communication Disorders, University of Massachusetts, MA; ³Department of Family and Consumer Sciences, Baylor University, TX, USA
Open Access
https://doi.org/10.21849/cacd.2019.00045