INTERNATIONAL JOURNAL OF TRANSLATION
Vol. XX, No. XX, XXXX XXXX

Automatic Induction of Bilingual Lexicons for Machine Translation

HELENA DE M. CASELI
MARIA DAS GRAÇAS V. NUNES
University of São Paulo, ICMC-NILC, São Carlos, Brazil

ABSTRACT

Translation lexicons are one of the most important linguistic resources for machine translation. However, this bilingual set of word and multiword correspondences requires a lot of manual work to be built. This paper describes a method to automatically build translation lexicons. The lexicons are built by extracting knowledge from PoS-tagged and lexically aligned parallel corpora. Preliminary experiments were carried out on Brazilian Portuguese, Spanish and English parallel texts. The results of a manual analysis showed that 85% of pt-es and 89% of pt-en entries are plausible correspondences. These results were obtained taking into consideration only the classes of entries which achieved the best results. Target sentences were generated using all induced entries. These sentences were compared with target sentences generated by commercial systems. This comparison emphasizes the relevance of translation lexicons in machine translation, mainly in Portuguese-Spanish.

INTRODUCTION

Two of the main challenges of machine translation (MT) and other natural language processing (NLP) applications are (1) the production, maintenance and extension of computational linguistic resources and (2) the integration of these resources into NLP applications.

In an attempt to overcome these challenges, several methods have been proposed to automatically build a variety of linguistic resources such as translation grammars (Menezes & Richardson 2001; Lavoie, White & Korelsky 2001; Carbonell et al. 2002) and translation lexicons (Wu & Xia 1994; Fung 1995; Gómez Guinovart & Sacau Fontenla 2004; Koehn & Knight 2002; Langlais, Foster & Lapalme 2001; Schafer & Yarowsky 2002).