I.J. Modern Education and Computer Science, 2016, 11, 8-19
Published Online November 2016 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijmecs.2016.11.02
Copyright © 2016 MECS
Development of an English to Yorùbá Machine
Translator
Safiriyu I. Eludiora
Obafemi Awolowo University, Department of Computer Science & Engineering, Ile-Ife, 220005, Nigeria.
Email: sieludiora@oauife.edu.ng or safiriyue@yahoo.com
Odetunji A. Odejobi
Obafemi Awolowo University, Department of Computer Science & Engineering, Ile-Ife, 220005, Nigeria.
Email: oodejobi@oauife.edu.ng or oodejobi@yahoo.com
Abstract—This study formulated a computational model
of the English to Yorùbá text translation process. The
modelled translation process was designed, implemented
and evaluated, with a view to addressing the challenge of
English to Yorùbá text machine translation. The machine
translator can translate modified and non-modified simple
sentences of subject-verb-object (SVO) structure. Digital
resources in English and their Yorùbá equivalents were
collected using home-domain terminologies and lexical
corpus construction techniques.
The English to Yorùbá translation process was modelled
using phrase structure grammar and re-write rules. The
re-write rules were designed and tested using the Natural
Language Toolkit (NLTK). Parse trees and automata-
theoretic techniques were used to analyse the formulated
model. The Unified Modeling Language (UML) was used
for the software design, and the model was implemented
using the Python programming language and PyQt4 tools.
The developed machine translator
was tested with simple sentences. The results for Basic
Subject-Verb-Object (BSVO) and Modified SVO (MSVO)
sentence translation show that the combined average
scores from the Experimental Subject Respondents
(ESRs), the machine translator and the human expert for
word-syllable, word-orthography and sentence-syntax
accuracy were 66.7 percent, 82.3 percent and 100 percent,
respectively. The system's translation accuracy was close
to that of a human expert.
Index Terms—Yorùbá language, simple sentences,
orthography, experimental subject respondents, human
expert, Africa
I. INTRODUCTION
Yorùbá is one of the major languages spoken in Africa.
Other languages in this category include Fulfulde, Hausa,
Lingala, Swahili, and Zulu. Yorùbá has a speaker
population of about 30 million (South-West Nigeria only),
according to the 2006 population census conducted by the
National Population Commission of Nigeria [1]. The
Yorùbá language has many dialects, but all speakers can
communicate effectively using Standard Yorùbá (SY),
which is the language of education, the mass media, and
everyday communication [2].
Yorùbá is a tonal language with three phonologically
contrastive tones: High (H), Mid (M) and Low (L).
Phonetically, however, there are two additional allotones
or tone variants, namely rising (R) and falling (F) [3]
and [4]. The Yorùbá alphabet has twenty-five letters,
comprising eighteen consonants and seven vowels. There
are five nasalised vowels in the language and two purely
syllabic nasals [3] and [5].
Yorùbá has a well-established orthography that has been
in use for over ten decades (since around 1843). Yorùbá
is relatively well studied compared with other African
languages, and there is literature on the grammar of the
language. The present work is one of the studies that
examine machine translation systems in the context of
text-to-text translation technology.
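The rule-based transfer approach summarised in the abstract (phrase structure grammar plus re-write rules for basic and modified SVO sentences) can be sketched as follows. The toy lexicon and the two rewrite rules below are illustrative assumptions for this sketch, not the paper's actual grammar or word list:

```python
# A minimal sketch of rule-based English-to-Yorùbá transfer for simple
# SVO sentences. The lexicon entries are illustrative assumptions only.

LEXICON = {
    "i": "mo", "you": "o",
    "eat": "jẹ", "drink": "mu",
    "rice": "ìrẹsì", "water": "omi",
    "cold": "tútù",
}

ADJECTIVES = {"cold"}  # a hypothetical adjective tag set


def translate_svo(sentence):
    """Translate a basic or adjective-modified English SVO sentence.

    Rewrite rules applied:
      S  -> NP V NP          (both languages are SVO, so order is kept)
      NP -> Adj N  =>  N Adj (Yorùbá adjectives follow the noun)
    """
    words = sentence.lower().rstrip(".").split()
    # NP reordering rule: move each adjective after the noun it modifies.
    reordered, i = [], 0
    while i < len(words):
        if words[i] in ADJECTIVES and i + 1 < len(words):
            reordered += [words[i + 1], words[i]]
            i += 2
        else:
            reordered.append(words[i])
            i += 1
    # Word-for-word lexical transfer; unknown words pass through unchanged.
    return " ".join(LEXICON.get(w, w) for w in reordered)


print(translate_svo("I eat rice"))            # mo jẹ ìrẹsì
print(translate_svo("You drink cold water"))  # o mu omi tútù
```

A full system would parse the sentence with a phrase structure grammar (e.g. via NLTK, as the paper reports) rather than relying on word position alone.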
A. Machine Translation Evaluation Techniques
The output of machine translation systems can be
evaluated along several dimensions: the intended use of
the translation, the characteristics of the MT software,
and the nature of the translation process. There are
various means of evaluating the performance of machine
translation systems. The oldest is the use of human
judges to assess a translation's quality. Though human
evaluation is time-consuming, it is still the most reliable
way to compare MT systems developed using different
translation approaches, such as rule-based and statistical
approaches. Automated evaluation metrics include
Bilingual Evaluation Understudy (BLEU), National
Institute of Standards and Technology (NIST) and Metric
for Evaluation of Translation with Explicit Ordering
(METEOR) [6].
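To make the automated metrics concrete, the following is a simplified sentence-level sketch of the BLEU idea: a geometric mean of clipped n-gram precisions multiplied by a brevity penalty. Real BLEU is corpus-level and supports multiple references; this single-reference version is a sketch, not the standard implementation:

```python
from collections import Counter
import math


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU against a single reference.

    Computes clipped n-gram precision for n = 1..max_n, takes their
    geometric mean, and applies a brevity penalty for short candidates.
    """
    cand, ref = candidate.split(), reference.split()
    max_n = min(max_n, len(cand), len(ref))  # cap n for short sentences
    precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = Counter(tuple(cand[i:i + n])
                              for i in range(len(cand) - n + 1))
        ref_ngrams = Counter(tuple(ref[i:i + n])
                             for i in range(len(ref) - n + 1))
        # Clipped counts: each candidate n-gram is credited at most as
        # often as it occurs in the reference.
        overlap = sum((cand_ngrams & ref_ngrams).values())
        precisions.append(overlap / max(sum(cand_ngrams.values()), 1))
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: penalise candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / max(len(cand), 1))
    return bp * math.exp(log_avg)


print(bleu("mo jẹ ìrẹsì", "mo jẹ ìrẹsì"))  # 1.0 for an identical candidate
```

NLTK also ships a full implementation (`nltk.translate.bleu_score`), which is the natural choice for a Python-based system like the one described here.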
Reference [7] explains that “machine translation at
its best automates the easier part of a translator's job; the
harder and more time-consuming part usually involves
doing extensive research to resolve ambiguities in the
source text, which the grammatical and lexical exigencies
of the target language require to be resolved”. Such
research is a prelude to pre-editing, carried out to provide
input to the machine-translation software such that the
output is not meaningless.