Introduction
Two different speech coding systems exist in the
cochlea: one represents spectral information encoded
tonotopically according to the space theory [1],
while the other codes for temporal information, as
represented by the firing pattern of the primary
auditory neurons. Spectral information is known to
play an important role in speech recognition; for
example, formant frequencies are used in vowel
recognition [2].
In contrast, the role of temporal information in speech
recognition has not been fully elucidated, although
many previous animal experiments [3–7] have
confirmed the production of temporal information in
the cochlea. One reason for this gap is the lack of
appropriate methods for investigating the temporal
coding system in the human auditory cortex.
In the present study, trains of clicks of ultra-short
duration were constructed from original speech
sounds. Such click trains would be expected to contain
only temporal information, with a stable spectral
structure. We used these stimulation sounds to examine
the role of temporal information in speech recognition
subjectively, by a recognition test, and objectively, by
positron emission tomography (PET), in normal
subjects.
Materials and Methods
Method for constructing stimulation sounds:
Original speech sound waves were recorded with a
digital audio-tape (DAT) recorder and transferred to
a computer (Macintosh Quadra 950, Apple Inc.). We
identified the zero-crossing points at which the
original speech sound wave crossed the baseline from
negative to positive, and then measured the interval
between each pair of successive zero-crossing points.
The stimulation sound was constructed by arranging
biphasic clicks separated by silent intervals
corresponding to the measured zero-crossing intervals.
The duration of each click was 41.6 µs. The computer
program for this processing was written using
Symantec C++ 7.0. Figure 1 shows the original waveform
of the speech sound [e] and the stimulation sound made
from this waveform. Thus, the stimulation sounds
were composed of click sequences.
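The construction described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' Symantec C++ program. The function names, the 48 kHz sampling rate, the two-sample click shape and the choice to place each click at the crossing position itself (so that click onsets are spaced by the measured intervals) are all assumptions made for the example.

```python
SAMPLE_RATE_HZ = 48_000   # assumption: a common DAT sampling rate
CLICK = (1.0, -1.0)       # assumption: biphasic click, one sample per phase


def upward_zero_crossings(samples):
    """Indices where the wave crosses the baseline from negative to positive."""
    return [i for i in range(1, len(samples))
            if samples[i - 1] < 0 <= samples[i]]


def crossing_intervals(samples):
    """Distances, in samples, between successive upward zero crossings."""
    idx = upward_zero_crossings(samples)
    return [b - a for a, b in zip(idx, idx[1:])]


def click_train(samples, click=CLICK):
    """Silence everywhere except one biphasic click at each upward crossing."""
    out = [0.0] * len(samples)
    for i in upward_zero_crossings(samples):
        for k, value in enumerate(click):
            if i + k < len(out):      # do not write past the end of the signal
                out[i + k] = value
    return out


def click_duration_us(click=CLICK, rate_hz=SAMPLE_RATE_HZ):
    """Click duration in microseconds for the assumed sampling rate."""
    return len(click) / rate_hz * 1e6
```

Because every click has the same shape and amplitude, the output keeps only the timing of the upward zero crossings and discards the amplitude envelope and fine waveform shape of the original. Under the assumed 48 kHz rate, a two-sample click lasts about 41.7 µs.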
Recognition tests: To examine whether the stimulation
speech sounds could be easily recognized, 10
healthy volunteers (all male, mean age 32 years, range
26–42) underwent recognition tests for these
stimulation sounds.
In test 1, the stimulation sounds made from five
vowels were presented randomly twice per vowel,
Auditory and Vestibular Systems, Lateral Line
© Rapid Science Publishers Vol 8 No 9–10 7 July 1997 2395
To elucidate the temporal coding system for speech
recognition, we synthesized stimulation sounds which
do not contain formant information but do contain
temporal information, by transforming the original
sound waves into click sequences. Using these
stimulation sounds, we performed a recognition test and
used PET to examine cortical activity in normal subjects
listening to the sounds. The recognition test showed
good perception of the sounds made from sequential
speech. The PET study demonstrated significant
activation of the superior temporal gyri while subjects
listened to the stimulation speech sounds. Our results
imply that these stimulation sounds were processed
semantically in the auditory cortices. The temporal
processing system is thought to make an important
contribution to speech recognition.
Key words: Auditory cortex; Click sequences; PET; Speech
recognition; Temporal coding
The role of the temporal coding system in the auditory cortex on speech recognition
Hisayoshi Kojima (CA), Shigeru Hirano,
Kazuhiko Shoji, Yasushi Naito,
Iwao Honjo, Yoko Kamoto¹,
Hidehiko Okazawa¹, Koichi Ishizu¹,
Yoshiharu Yonekura², Yasuhiro Nagahama³,
Hidenao Fukuyama³, Junji Konishi¹
Department of Hearing and Speech Science,
¹Nuclear Medicine, and ³Neurology, Graduate
School of Medicine, Kyoto University, Sakyo-ku,
Kyoto 606; ²Biomedical Imaging Research
Center, Fukui Medical School, Shimoaizuki,
Matuoka, Fukui 910-11, Japan
(CA) Corresponding Author
NeuroReport 8, 2395–2398 (1997)