I.J. Modern Education and Computer Science, 2016, 11, 8-19
Published Online November 2016 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijmecs.2016.11.02
Copyright © 2016 MECS
Development of an English to Yorùbá Machine
Translator
Safiriyu I. Eludiora
Obafemi Awolowo University, Department of Computer Science & Engineering, Ile-Ife, 220005, Nigeria.
Email: sieludiora@oauife.edu.ng or safiriyue@yahoo.com
Odetunji A. Odejobi
Obafemi Awolowo University, Department of Computer Science & Engineering, Ile-Ife, 220005, Nigeria.
Email: oodejobi@oauife.edu.ng or oodejobi@yahoo.com
Abstract—This study formulated a computational model
of the English to Yorùbá text translation process. The
modelled translation process was designed, implemented
and evaluated, with a view to addressing the challenge of
English to Yorùbá text machine translation. The machine
translator can translate modified and non-modified simple
sentences of subject-verb-object (SVO) structure. Digital
resources in English and their Yorùbá equivalents were
collected using home-domain terminologies and lexical
corpus construction techniques.
The English to Yorùbá translation process was modelled
using phrase structure grammar and re-write rules. The
re-write rules were designed and tested using the Natural
Language Toolkit (NLTK). Parse trees and automata-
theoretic techniques were used to analyse the formulated
model. The Unified Modeling Language (UML) was used
for the software design, and the model was implemented
using the Python programming language and PyQt4 tools.
The developed machine translator
was tested with simple sentences. The results for Basic
Subject-Verb-Object (BSVO) and Modified SVO (MSVO)
sentence translation show that the combined average
scores from the Experimental Subject Respondents
(ESRs), the machine translator and the human expert for
word-syllable, word-orthography and sentence-syntax
accuracy were 66.7 percent, 82.3 percent and 100 percent,
respectively. The system's translation accuracy was close
to that of a human expert.
Index Terms—Yorùbá language, simple sentences,
orthography, experimental subject respondents, human
expert, Africa
I. INTRODUCTION
Yorùbá is one of the major languages spoken in Africa.
Other languages in this category include Fulfulde, Hausa,
Lingala, Swahili, and Zulu. Yorùbá has a speaker
population of about 30 million (South-West Nigeria only),
according to the 2006 population census conducted by the
National Population Commission of Nigeria [1]. The
Yorùbá language has many dialects, but all speakers can
communicate effectively using Standard Yorùbá (SY),
which is the language of education, the mass media, and
everyday communication [2].
Yorùbá is a tonal language with three phonologically
contrastive tones: High (H), Mid (M) and Low (L).
Phonetically, however, there are two additional allotones
or tone variants, namely rising (R) and falling (F) [3]
and [4]. The Yorùbá alphabet has twenty-five letters,
comprising eighteen consonants and seven vowels. There
are five nasalised vowels in the language and two purely
syllabic nasals [3] and [5].
Yorùbá has a well-established orthography that has been
in use for over ten decades (since around 1843). Yorùbá
is relatively well studied compared with other African
languages, and there is literature on the grammar of the
language. The present work is one of the studies that
examine machine translation systems in the context of
text-to-text translation technology.
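The rule-based transfer approach summarised in the abstract (phrase structure grammar plus re-write rules for basic and modified SVO sentences) can be sketched as follows. The toy lexicon and the two rewrite rules below are illustrative assumptions for this sketch, not the paper's actual grammar or word list:

```python
# A minimal sketch of rule-based English-to-Yorùbá transfer for simple
# SVO sentences. The lexicon entries are illustrative assumptions only.

LEXICON = {
    "i": "mo", "you": "o",
    "eat": "jẹ", "drink": "mu",
    "rice": "ìrẹsì", "water": "omi",
    "cold": "tútù",
}

ADJECTIVES = {"cold"}  # a hypothetical adjective tag set


def translate_svo(sentence):
    """Translate a basic or adjective-modified English SVO sentence.

    Rewrite rules applied:
      S  -> NP V NP          (both languages are SVO, so order is kept)
      NP -> Adj N  =>  N Adj (Yorùbá adjectives follow the noun)
    """
    words = sentence.lower().rstrip(".").split()
    # NP reordering rule: move each adjective after the noun it modifies.
    reordered, i = [], 0
    while i < len(words):
        if words[i] in ADJECTIVES and i + 1 < len(words):
            reordered += [words[i + 1], words[i]]
            i += 2
        else:
            reordered.append(words[i])
            i += 1
    # Word-for-word lexical transfer; unknown words pass through unchanged.
    return " ".join(LEXICON.get(w, w) for w in reordered)


print(translate_svo("I eat rice"))            # mo jẹ ìrẹsì
print(translate_svo("You drink cold water"))  # o mu omi tútù
```

A full system would parse the sentence with a phrase structure grammar (e.g. via NLTK, as the paper reports) rather than relying on word position alone.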
A. Machine Translation Evaluation Techniques
The output of machine translation systems can be
evaluated along several dimensions: the intended use of
the translation, the characteristics of the MT software,
and the nature of the translation process. There are
various means of evaluating the performance of machine
translation systems. The oldest is the use of human
judges to assess a translation's quality. Though human
evaluation is time-consuming, it is still the most reliable
way to compare MT systems developed using different
translation approaches, such as rule-based and statistical
approaches. Automated evaluation metrics include
Bilingual Evaluation Understudy (BLEU), National
Institute of Standards and Technology (NIST) and Metric
for Evaluation of Translation with Explicit Ordering
(METEOR) [6].
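To make the automated metrics concrete, the following is a simplified sentence-level sketch of the BLEU idea: a geometric mean of clipped n-gram precisions multiplied by a brevity penalty. Real BLEU is corpus-level and supports multiple references; this single-reference version is a sketch, not the standard implementation:

```python
from collections import Counter
import math


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against a single reference.

    Computes clipped n-gram precision for n = 1..max_n, takes their
    geometric mean, and applies a brevity penalty for short candidates.
    """
    cand, ref = candidate.split(), reference.split()
    max_n = min(max_n, len(cand), len(ref))  # cap n for short sentences
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n])
                              for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n])
                             for i in range(len(ref) - n + 1))
        # Clipped counts: each candidate n-gram is credited at most as
        # often as it occurs in the reference.
        overlap = sum((cand_ngrams & ref_ngrams).values())
        precisions.append(overlap / max(sum(cand_ngrams.values()), 1))
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: penalise candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_avg)


print(bleu("mo jẹ ìrẹsì", "mo jẹ ìrẹsì"))  # 1.0 for an identical candidate
```

NLTK also ships a full implementation (`nltk.translate.bleu_score`), which is the natural choice for a Python-based system like the one described here.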
Reference [7] explains that “machine translation at
its best automates the easier part of a translator's job; the
harder and more time-consuming part usually involves
doing extensive research to resolve ambiguities in the
source text, which the grammatical and lexical exigencies
of the target language require to be resolved”. Such
research is a prelude to pre-editing, carried out to provide
input to the machine-translation software such that the
output is not meaningless.