ICEMIS2017, Monastir, Tunisia
978-1-5090-6778-7/17/$31.00 ©2017 IEEE
Transformation system to generate derivational forms
of an Arabic verb with HPSG
Samia Ben Ismail
ISITCom Hammam Sousse
MIRACL Laboratory
Sfax, Tunisia
Samia_benismail@yahoo.fr
Sirine Boukédi
National Engineering School University of Gabes
MIRACL Laboratory
Sfax, Tunisia
sirine.boukedi@gmail.com
Kais Haddar
Faculty of Science of Sfax
MIRACL Laboratory
Sfax, Tunisia
kais.haddar@yahoo.fr
Abstract— the automatic construction of linguistic resources has
always been a center of interest in Natural Language Processing
(NLP). The goal is to build automatically, for each word, its diverse
morphological forms (i.e., derivational and inflectional) with a
minimum number of rules. This process facilitates text
analysis, especially morphological analysis, which is fundamental
for several applications such as human-machine dialogue and
grammatical error correction. In this context, the main objective of
this work is to develop a transformation system for the Arabic
language, based on Head-driven Phrase Structure Grammar (HPSG),
that generates the derivational forms of each type of Arabic verb. Based on
a proposed type hierarchy, the derived forms are
specified in the Type Description Language (TDL) and validated with
the Linguistic Knowledge Building (LKB) platform. The rules
added to LKB generate the majority of the derivational forms and
offer reliable results with a short execution time. As an example, we
present in this paper the generation process for the derivational forms
of an Arabic verb.
Keywords— HPSG grammar; LKB morphological analysis; type
description language (TDL); derivational form; type hierarchy.
I. INTRODUCTION
The automatic construction of linguistic resources has
always been a center of interest in the NLP domain, notably within
the HPSG (Head-Driven Phrase Structure Grammar) formalism.
This process facilitates text analysis, in particular
morphological analysis, which is an important step in NLP.
It allows the automatic treatment of
different types of forms, such as inflectional and derivational ones.
Moreover, it makes it possible to transform an intensional lexicon into an
extensional lexicon: it is sufficient to enter just the
canonical form, and all the other forms (inflectional and
derivational) are generated automatically. This automatic
construction is also necessary for several applications such as
human-machine dialogue and grammatical error correction.
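The step from an intensional to an extensional lexicon can be pictured with a minimal sketch. It is not the TDL rule set of this paper: the pattern table, the function names (apply_pattern, expand_entry) and the Latin transliteration are assumptions made only for illustration, and just four of the classical derived verb forms are listed. A root is entered once, and each pattern fills its numbered slots with the root consonants.

```python
# Illustrative only: simplified transliterated patterns, not the paper's TDL rules.
# The digits 1, 2, 3 stand for the first, second and third root consonants.
PATTERNS = {
    "I": "1a2a3a",      # fa'ala
    "II": "1a22a3a",    # fa''ala (doubled second consonant)
    "III": "1aa2a3a",   # faa'ala (long first vowel)
    "X": "ista12a3a",   # istaf'ala
}


def apply_pattern(root, pattern):
    """Fill the numbered slots of a pattern with the root consonants."""
    slots = {"1": root[0], "2": root[1], "3": root[2]}
    return "".join(slots.get(ch, ch) for ch in pattern)


def expand_entry(root):
    """Turn one intensional entry (a root) into its extensional forms."""
    return {form: apply_pattern(root, pattern) for form, pattern in PATTERNS.items()}


forms = expand_entry(("k", "t", "b"))
for form, stem in forms.items():
    print(form, stem)
```

Such a table over-generates, since not every root attests every derived form, and it ignores weak roots; this is one reason a classification of verb types is needed before rules are written.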
Despite this importance, the literature shows a lack of such
resources for the Arabic language, especially at the lexical and
morphological levels. Some works focus
mainly on the syntactic aspect, and others treat only some
morphological forms of Arabic words. However, this treatment
is generally incomplete, principally at the morphological level
(inflectional and derivational). This is due to
the many problems encountered in linguistic
processing. Among these problems, the main recurring
difficulty is to find an adequate classification of
lexical entries. Indeed, inflection and derivation are
grammatically different and require substantial treatment for
each category of words (i.e., verb and noun), since each
category has a variety of regular and irregular forms.
Beyond the problem of classification, another problem concerns
optimization: the automatic construction of
various linguistic resources must be achieved with a minimum number of
rules.
In this context, our first contribution is an adequate
type hierarchy classifying the different derivational forms that
an Arabic word can have. Based on the proposed type hierarchy, the
second objective is to construct an Arabic HPSG grammar
treating the derivational forms. This formalism is based on a set
of principles, essentially inheritance, which favors
optimization during rule construction. Another
contribution of the present work is to validate the proposed
grammar with the LKB platform, which uses reliable algorithms
that have generally been tested by experts.
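The way inheritance reduces the number of rules can be sketched as follows. The type names and features below are hypothetical and are not taken from the hierarchy proposed in this paper; the sketch also assumes single inheritance and merges constraints by simple dictionary update, whereas HPSG combines them by unification. Each constraint is stated once, on the most general type that carries it, and every subtype receives it automatically.

```python
# Hypothetical type names and features, chosen only for illustration.
TYPES = {
    "verb": {"parent": None, "constraints": {"HEAD": "verb"}},
    "triliteral-verb": {"parent": "verb", "constraints": {"ROOT-LENGTH": 3}},
    "sound-triliteral-verb": {
        "parent": "triliteral-verb",
        "constraints": {"WEAK": False},
    },
}


def resolve(type_name):
    """Collect constraints from the top of the hierarchy down to type_name."""
    chain = []
    while type_name is not None:
        chain.append(type_name)
        type_name = TYPES[type_name]["parent"]
    merged = {}
    # Apply the most general type first so that subtypes add to it.
    for name in reversed(chain):
        merged.update(TYPES[name]["constraints"])
    return merged


print(resolve("sound-triliteral-verb"))
```

Adding a new subtype then requires stating only what distinguishes it from its parent, which is the optimization effect referred to above.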
In this paper, we are interested in the derivational forms of
Arabic verbs. We begin by presenting some previous
works. Then, we describe the proposed type hierarchy used to
categorize Arabic verbs. According to this hierarchy, we
present the HPSG representation and the TDL specification of
the elaborated Arabic grammar. Next, we evaluate this grammar
with the LKB system. Finally, we close the paper with a
conclusion and some perspectives.
II. PREVIOUS WORK
The study of previous works shows that few researchers
have constructed Arabic grammars covering
morphological phenomena. For the HPSG formalism,
to our knowledge, only a few Arabic works
treat morphological phenomena, such as [1] and [2].
On the one hand, in [1] Islam proposes a new HPSG
representation for Arabic nominals and various verb-derived
nouns. He captures the morphology of the Arabic verbal
noun by expanding the MORPH, SYN and SEM features. This
work treats verbal nouns derived from triliteral non-sound Form I verbs
and the analysis of verbal nouns based on quadriliteral verbs.
On the other hand, in [2] Alqurashi proposes an HPSG analysis
of simple and construct-state noun phrases in Modern Standard
Arabic (MSA). He provides an account of the definite and
indefinite affixes. He outlines three analyses within HPSG
and finally adopts the third analysis. In fact, this
analysis proposed that nouns appear in head-adjunct