CORRECTING ASR OUTPUTS: SPECIFIC SOLUTIONS TO SPECIFIC ERRORS IN FRENCH
Richard Dufour, Yannick Estève
LIUM, Université du Maine
Le Mans, France
firstname.lastname@lium.univ-lemans.fr
ABSTRACT
Automatic speech recognition (ASR) systems are used
in a large number of applications, in spite of the inevitable
recognition errors. In this study we propose a pragmatic ap-
proach to automatically repair ASR outputs by taking into
account linguistic and acoustic information, using formal
rules or stochastic methods. The proposed strategy consists
in developing a specific correction solution for each specific
kind of error. In this paper, we apply this strategy to two
case studies specific to the French language. We show that it
is possible, on automatic transcriptions of French broadcast
news, to decrease the error rate of a specific error by 11.4%
in one of the two case studies, and by 86.4% in the other.
These results are encouraging and show the value of developing
more specific solutions to cover a wider set of errors in
future work.
Index Terms— Automatic speech recognition, error cor-
rection, homophones, language modeling
1. INTRODUCTION
Automatic speech recognition (ASR) systems are increasingly
effective. Their current performance is sufficient for them
to be used in a large number of applications (human-machine
dialogue, indexing, information retrieval, etc.).
But ASR errors are inevitable. Errors that change the meaning
of a sentence are a serious problem for most applications using
ASR, because they prevent these applications from responding
correctly. Other kinds of errors, which do not prevent
understanding, are often neglected because they are not critical
for the correct operation of such applications: for example, in
English, errors of agreement in number.
For some other applications, such as subtitling for
hearing-impaired people or assisted transcription [1], these errors
are more important: in the former case, repeated errors,
even if they do not modify the meaning of a sentence, are very
tiring for the end user; in the latter case, where the goal is to
produce an entirely correct transcription, these errors reduce
the productivity gain provided by the use of ASR.
This research was supported by the ANR (Agence Nationale de la
Recherche) under contract number ANR-06-MDCA-006.
French contains many homophonous words, particularly
among the various inflected forms of the same word. Frequent
ASR errors result from this. In this context, a method to
correct such errors would be very useful.
In the literature, we can find proposals to repair ASR
errors or to make applications robust to these errors [2, 3].
Usually, proposals to repair ASR errors tend to be general
and try to repair every kind of error [4]. They can also take
into consideration some particularities of the target application,
for example the dialog history [4].
In this paper, we propose a different approach consisting
in building a specific correction solution for each specific er-
ror. A similar approach was proposed in [5] for use inside
an ASR decoder to handle different language models. Here,
we propose to use it in order to repair some errors by post-
processing the ASR output.
This approach is very pragmatic: it consists in manually
analyzing the most frequent errors, particularly the most fre-
quent confusion pairs. Errors on different words can either be
treated as a group or be processed confusion pair by confusion
pair. The solutions proposed for the various kinds
of errors can use heterogeneous tools such as formal rules or
stochastic methods. These tools can be built from heteroge-
neous data, such as linguistic knowledge but also acoustic in-
formation provided by the ASR system. The use of acoustic
information to repair some ASR errors at a post-processing
level is another contribution of this paper.
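As an illustration of the error analysis step described above, the following minimal Python sketch counts the most frequent confusion pairs between reference transcriptions and ASR hypotheses. It is only a sketch under stated assumptions, not the tooling used in this work: the function names confusion_pairs and most_frequent_confusions are hypothetical, the word alignment relies on Python's standard difflib.SequenceMatcher rather than on the alignment produced by a scoring tool, and only one-to-one word substitutions are counted.

```python
from collections import Counter
from difflib import SequenceMatcher


def confusion_pairs(reference, hypothesis):
    """Count (reference word, hypothesis word) substitutions for one utterance.

    Hypothetical helper: words are split on whitespace and aligned with
    difflib, which is an assumption made for this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    counts = Counter()
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Keep only substitutions of equal length, so that each reference
        # word maps to exactly one hypothesis word; insertions, deletions
        # and uneven replacements are ignored.
        if tag == "replace" and (i2 - i1) == (j2 - j1):
            counts.update(zip(ref[i1:i2], hyp[j1:j2]))
    return counts


def most_frequent_confusions(pairs, n=10):
    """Return the n most frequent confusion pairs over (reference, hypothesis) pairs."""
    total = Counter()
    for reference, hypothesis in pairs:
        total.update(confusion_pairs(reference, hypothesis))
    return total.most_common(n)


if __name__ == "__main__":
    # Toy data invented for illustration only; not taken from any corpus.
    toy_pairs = [
        ("ils sont arrivés hier", "ils sont arrivé hier"),
        ("elles sont arrivées", "elles sont arrivé"),
        ("ils sont arrivés", "ils sont arrivé"),
    ]
    for pair, count in most_frequent_confusions(toy_pairs):
        print(pair, count)
```

In practice, such a ranked list would be inspected manually to decide which confusion pairs, or groups of pairs, deserve a dedicated correction solution.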
This paper focuses on the French language. Two case
studies are presented, which consist in correcting specific errors
caused by two particularities of French. The next section
presents some peculiarities of French. Then the proposed
approach is detailed in section 3, before the presentation of
the tools used during this work. Finally, experiments are
described and results are presented.
2. SOME PECULIARITIES OF FRENCH
Agreement in gender, number (and/or person) is one
of the most difficult aspects of the French language. French
is an inflected language. A great difficulty for ASR systems
(and for some people) is that in a lot of cases, the various in-
213 978-1-4244-3472-5/08/$25.00 ©2008 IEEE SLT 2008