Analysis of Glottal Stop in Assam Sora Language
Sishir Kalita¹, Luke Horo², Priyankoo Sarmah², S.R.M. Prasanna¹, S. Dandapat¹
¹Department of Electronics and Electrical Engineering
²Department of Humanities and Social Sciences
Indian Institute of Technology Guwahati, Guwahati-781039, India
(sishir, luke, priyankoo, prasanna, samaren)@iitg.ernet.in
Abstract
The objective of this work is to characterize the intervocalic
glottal stops in Assam Sora, a low-resource language of the
South Munda language family. Glottal stops are produced with
gestures at the deep laryngeal level; hence, the estimated
excitation source signal is used in this study to characterize
the source dynamics during the production of Assam Sora glottal
stops. From this signal, the temporal-domain voice source
features Quasi-Open Quotient (QOQ) and Normalized Amplitude
Quotient (NAQ) are extracted, along with spectral features such
as the H1-H2 ratio and the Harmonic Richness Factor (HRF). One
excitation source feature is extracted from the zero frequency
filtered version of the speech signal to characterize the
variations within the glottal cycles in the glottal stop region.
A recently proposed wavelet-based voice source feature, the
Maxima Dispersion Quotient (MDQ), is also used to characterize
the abrupt glottal closure during glottal stop production. The
analysis shows that the features are salient enough to
distinguish glottal stops from the adjacent vowel sounds and may
also be used in continuous speech. A Mann-Whitney U test
confirmed the statistical significance of the differences between
glottal stops and their adjacent vowels.
Index Terms: Assam Sora Language, glottal stop, zero fre-
quency filter, maxima dispersion quotient.
1. Introduction
Speech sounds are produced with gestures at the sub- or
supralaryngeal level. However, some sound units are articulated
only in the larynx without any effective gestures in the vocal
tract. A glottal stop, defined as a stop made by the glottis, is
an example of such a sound, produced by firmly adducting the vocal
folds [1]. In the glottal continuum model [2], the glottal stop is
considered to be an extreme form of glottal closure and is placed
at the right edge of the continuum, while voiceless sounds are
placed at the extreme left edge. However, it is suggested that a
complete glottal stop is rare in continuous speech [3].
Apart from serving as phonological units in a language, glottal
stops can also be produced as compensatory articulations. For
example, Cleft Lip and Palate (CLP) patients produce glottal
stops as compensatory articulations [4], while in English, the
glottal stop occurs as an allophone of the stop consonant /t/.
At the same time, many Austro-Asiatic languages
of the Mon-Khmer subfamily such as Khmer, Chong, Kammu,
Car, Khasi, Pnar, Katu, Dannu, Mon, Bunong, Sedang and Kui
as well as of the Munda subfamily such as Santali, Mundari,
Keraʔ, Ho, Korku, Juang, Kharia, Sora, Gorum, Remo, Gutob
and Gtaʔ [5] [6] include glottal stops in their phoneme
inventories. However, analysis and characterization of glottal
stops with the help of a spectrogram is difficult [7] [8]:
because no supralaryngeal articulator moves, the spectrogram
carries little information about articulation in the larynx.
Hence, a detailed voice (excitation) source analysis is needed
to characterize this sound unit.
Production of glottal stops has a variety of realizations
ranging from a complete stop to a laryngealized realization.
Similarly, the acoustic characteristics of a glottal stop differ
significantly depending on the context in which it occurs [9].
For instance, while it is suggested that, intervocalically, dips
in the pitch and amplitude contours are reliable cues for
perceiving a glottal stop [10], it is argued that irregularity
and aperiodicity of the estimated source signal may also serve
as dependable cues for identifying a glottal stop in the same
region [7] [8] [11]. Moreover, since the effective gesture in
the production of a glottal stop is primarily laryngeal, and
vocal fold vibration deviates significantly from that of the
adjacent voiced regions, glottal stops may also be analyzed
using aerodynamic parameters, EGG signals, or the voice source
estimated from speech signals. Of these, analysis directly from
the speech signal is preferable, and the estimated voice source
may provide a better way of characterizing the acoustic
qualities of a glottal stop.
A few attempts have been made to automatically character-
ize a glottal stop using speech signal processing. These studies
have mostly used excitation source information to characterize
a glottal stop. In one such study, the irregularity in the
glottal stop region is quantified in the linear prediction
residual of the speech signal using a normalized
cross-correlation between two adjacent glottal cycles [8]. Also,
in order to detect glottal stops in continuous Amharic speech,
normalized jitter and the logarithm of peak normalized
excitation strength (LPNES) are computed at each glottal closure
instant (GCI) [7]. Additionally, the pitch-synchronous
integrated linear prediction residual is used as a voice source
representation to characterize the glottal stop in intervocalic
context [11]. There, the variation in the abruptness of glottal
pulses is captured using the ratio between the strength of
excitation (SoE) at two consecutive epoch locations, and the
temporal energy distribution is captured using the waveform peak
factor (WPF). The asymmetric behavior of each glottal cycle is
extracted using higher order statistical (HOS) measures.
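As a rough illustration of the cycle-level irregularity measures mentioned above, the following Python sketch computes a normalized cross-correlation between adjacent glottal cycles and a jitter value from GCI locations. It is not the implementation used in [7] or [8]. The function names, the zero-lag correlation over the common cycle length, and the "local jitter" definition (mean absolute difference of consecutive periods divided by the mean period) are assumptions made here for illustration. The GCI locations are assumed to be given as sample indices, and a sinusoid stands in for an estimated source signal.

```python
import math

def split_cycles(signal, gcis):
    """Cut the signal into cycles between consecutive GCI sample indices."""
    return [signal[a:b] for a, b in zip(gcis[:-1], gcis[1:])]

def normalized_xcorr(x, y):
    """Zero-lag normalized cross-correlation over the common length
    of two cycles (assumed definition; values near 1 mean similar shape)."""
    n = min(len(x), len(y))
    x, y = x[:n], y[:n]
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0 else 0.0

def local_jitter(gcis):
    """Mean absolute difference of consecutive cycle periods divided by
    the mean period (assumed definition of a normalized jitter)."""
    periods = [b - a for a, b in zip(gcis[:-1], gcis[1:])]
    if len(periods) < 2:
        return 0.0
    diffs = [abs(q - p) for p, q in zip(periods[:-1], periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))

# Hypothetical example: a sinusoid with an 80-sample period stands in
# for a periodic source signal, first with regular, then irregular GCIs.
signal = [math.sin(2 * math.pi * n / 80) for n in range(400)]
regular_gcis = [0, 80, 160, 240, 320]
irregular_gcis = [0, 80, 150, 250, 320]

regular_cycles = split_cycles(signal, regular_gcis)
similarity = [normalized_xcorr(a, b)
              for a, b in zip(regular_cycles[:-1], regular_cycles[1:])]
print(similarity, local_jitter(regular_gcis), local_jitter(irregular_gcis))
```

In a periodic vowel region, adjacent cycles would be expected to give correlation values close to 1 and a jitter close to 0. Irregular cycles, as reported for glottal stop regions, would lower the correlation and raise the jitter.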
The current study proposes a characterization method for
intervocalic glottal stops in Assam Sora, a South Munda language
spoken by approximately 5000 people in Assam, North East India.
Assam Sora emerged from the migration of Sora speakers from
Orissa to Assam in the 19th century. The presence of a glottal
stop has been reported in Sora [12] and has also been observed
in Assam Sora [13].
Copyright © 2016 ISCA
INTERSPEECH 2016
September 8–12, 2016, San Francisco, USA
http://dx.doi.org/10.21437/Interspeech.2016-877