CORRECTING ASR OUTPUTS: SPECIFIC SOLUTIONS TO SPECIFIC ERRORS IN FRENCH

Richard Dufour, Yannick Estève
LIUM, Université du Maine
Le Mans, France
firstname.lastname@lium.univ-lemans.fr

ABSTRACT

Automatic speech recognition (ASR) systems are used in a large number of applications, in spite of inevitable recognition errors. In this study we propose a pragmatic approach to automatically repair ASR outputs by taking into account linguistic and acoustic information, using formal rules or stochastic methods. The proposed strategy consists in developing a specific correction solution for each specific kind of error. In this paper, we apply this strategy to two case studies specific to the French language. We show that it is possible, on automatic transcriptions of French broadcast news, to decrease the error rate of a specific error by 11.4% in one of the two case studies, and by 86.4% in the other. These results are encouraging and show the value of developing more specific solutions to cover a wider set of errors in future work.

Index Terms— Automatic speech recognition, error correction, homophones, language modeling

1. INTRODUCTION

Automatic speech recognition (ASR) systems are increasingly efficient. Their current performance is sufficient for them to be used in a large number of applications (human-machine dialogue, indexing, information retrieval, etc.).

But ASR errors are inevitable. Errors that change the meaning of a sentence are very harmful for most applications using ASR, because they prevent these applications from giving correct feedback. Other kinds of errors, which do not prevent understanding, are often neglected because they are not critical for the correct operation of such applications: for example, in English, errors of agreement in number.
For some other applications, such as subtitling for hearing-impaired people or assisted transcription [1], these errors are more important: in the former case, the repetition of errors, even if they do not modify the meaning of a sentence, is very tiring for the final user; in the latter case, where the goal is to produce an entirely correct transcription, these errors reduce the productivity gain provided by the use of ASR.

This research was supported by the ANR (Agence Nationale de la Recherche) under contract number ANR-06-MDCA-006.

French contains many homophonous words, particularly among the various inflected forms of the same word. This results in frequent ASR errors. In this context, a method to correct such errors would be very useful.

In the literature, we can find proposals to repair ASR errors or to make applications robust to these errors [2, 3]. Usually, proposals to repair ASR errors tend to be general and try to repair every kind of error [4]. They can also take into consideration some particularities of the targeted application, for example the dialog history [4].

In this paper, we propose a different approach consisting in building a specific correction solution for each specific error. A similar approach was proposed in [5] for use inside an ASR decoder to handle different language models. Here, we propose to use it in order to repair some errors by post-processing the ASR output.

This approach is very pragmatic: it consists in manually analyzing the most frequent errors, particularly the most frequent confusion pairs. Errors on different words can either be treated as a group, or be processed confusion pair by confusion pair.

The various solutions proposed for the various kinds of errors can use heterogeneous tools such as formal rules or stochastic methods. These tools can be built from heterogeneous data, such as linguistic knowledge but also acoustic information provided by the ASR system.
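As an illustration of the analysis step described above, the following minimal sketch tallies confusion pairs (reference word, recognized word) from pairs of reference and automatic transcriptions. It is not the procedure used in this work: the function name `confusion_pairs`, the use of Python's `difflib` as a stand-in for a proper word-level alignment, and the toy sentences are all assumptions made for the example.

```python
from collections import Counter
from difflib import SequenceMatcher


def confusion_pairs(references, hypotheses):
    """Count (reference word, hypothesis word) substitutions.

    Hypothetical helper: difflib's alignment stands in for the
    word-level alignment an ASR scoring tool would provide.
    """
    counts = Counter()
    for ref, hyp in zip(references, hypotheses):
        ref_words, hyp_words = ref.split(), hyp.split()
        matcher = SequenceMatcher(None, ref_words, hyp_words, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Keep only one-to-one substitutions; insertions, deletions
            # and unequal-length replacements are ignored in this sketch.
            if tag == "replace" and (i2 - i1) == (j2 - j1):
                counts.update(zip(ref_words[i1:i2], hyp_words[j1:j2]))
    return counts


# Toy data invented for illustration (French homophones est/et, a/à).
refs = ["il est parti", "elle est là", "il a faim"]
hyps = ["il et parti", "elle et là", "il à faim"]

pairs = confusion_pairs(refs, hyps)
for (ref_word, hyp_word), n in pairs.most_common():
    print(f"{ref_word} -> {hyp_word}: {n}")
```

Sorting the pairs by frequency gives a starting list from which specific errors can be selected for manual analysis, either individually or as a group.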
The use of acoustic information to repair some ASR errors at a post-processing level is another contribution of this paper.

This paper focuses on the French language. Two case studies are presented, consisting in correcting specific errors caused by two particularities of French. The next section presents some peculiarities of French. The proposed approach is then detailed in section 3, followed by a presentation of the tools used in this work. Finally, experiments are described and results are presented.

2. SOME PECULIARITIES OF FRENCH

Agreement in gender, number and/or person is one of the most difficult aspects of the French language. French is an inflected language. A great difficulty for ASR systems (and for some people) is that in many cases, the various in-
213 978-1-4244-3472-5/08/$25.00 ©2008 IEEE SLT 2008