MACHINE TRANSLATION FROM ENGLISH TO ARABIC

Mouiad Alawneh, Nazlia Omar and Tengku Mohd Sembok
Faculty of Information Science and Technology, National University of Malaysia (UKM), Bangi, 43600, Malaysia
m_maradona86@yahoo.com

Abstract. Machine translation (MT) is the process of using computer software to translate text from one natural language to another. It involves accounting for the grammatical structure of each language and using rules, examples and grammars to transfer the grammatical structure of the source language (SL) into the target language (TL). This paper presents an English-to-Arabic approach for translating well-structured English sentences into well-structured Arabic sentences, using grammar-based and example-based translation techniques to handle the problems of word ordering and agreement. The proposed methodology is flexible and scalable. Its main advantages are: first, it is a hybrid approach that combines the advantages of rule-based MT (RBMT) with those of example-based MT (EBMT); second, it can be applied to other languages with minor modifications. The OAK Parser is used to analyze the input English text and obtain the part of speech (POS) of each word as a pre-translation step, using the C# language. Validation rules have been applied in both the database design and the program code to ensure the integrity of the data. A major design goal is that the system can be used as a stand-alone tool and can also be integrated with a general machine translation system for English sentences.

Keywords: MT, Agreement, Word reorder, Rule-based, Example-based, Hybrid-based, OAK Parser, POS

1. Introduction

The machine translation system presented here helps the end user understand English sentences clearly by generating an accurate Arabic equivalent. Agreement is a basic property of language.
In the most basic sense, agreement occurs when two elements in the appropriate configuration exhibit morphology consistent with their co-occurrence. Perhaps the most transparent case of this linguistic mechanism is number agreement between a subject and a verb: a singular noun in the subject position regularly co-occurs with a singular verb (e.g., "the dog runs"), and a plural subject noun regularly co-occurs with a plural verb (e.g., "the dogs run"). If the language has number marking on other elements, such as determiners or adjectives, these should also exhibit morphology that is consistent with their relationship to the subject head noun, and this co-occurrence relationship holds for gender and person agreement as well. The modern Arabic dialects are well known for agreement asymmetries that are sensitive to word order. These asymmetries have been attributed to a variety of causes: first, analysis problems in the source language; second, generation problems in the target language. However, Arabic is not alone in showing word-order asymmetries for agreement; similar asymmetries have been documented in Russian, Hindi, Slovene, French and Italian (Hutchins and Somers 1992). Languages vary in their agreement requirements. Some, like Arabic, require number, gender, person and case agreement, while others require only some of these. Machine translation systems are developed using one of four approaches, which differ in difficulty and complexity.
These approaches are: rule-based, knowledge-based, corpus-based and hybrid MT. Rule-based machine translation approaches can be classified into the following categories: direct machine translation, interlingua machine translation and transfer-based machine translation (Abu Shquier and Sembok, 2008). The purpose of this paper is to design a hybrid (rule-based and example-based) framework, and hence to strike a balance between both approaches in the use of MT for the translation of
2011 International Conference on Biomedical Engineering and Technology, IPCBEE vol. 11 (2011) © (2011) IACSIT Press, Singapore