ICEMIS2017, Monastir, Tunisia
978-1-5090-6778-7/17/$31.00 ©2017 IEEE

Transformation system to generate derivational forms of an Arabic verb with HPSG

Samia Ben Ismail, ISITCom Hammam Sousse, MIRACL Laboratory, Sfax, Tunisia, Samia_benismail@yahoo.fr
Sirine Boukédi, National Engineering School, University of Gabes, MIRACL Laboratory, Sfax, Tunisia, sirine.boukedi@gmail.com
Kais Haddar, Faculty of Science of Sfax, MIRACL Laboratory, Sfax, Tunisia, kais.haddar@yahoo.fr

Abstract: The automatic construction of linguistic resources has always been a center of interest in Natural Language Processing (NLP). It automatically constructs, for each word, its diverse morphological forms (i.e., derivational and inflectional) with a minimum number of rules. Indeed, this process facilitates text analysis, especially morphological analysis, which is fundamental for several applications such as human-machine dialogue and grammatical error correction. In this context, the main objective of this work is to develop a transformation system for the Arabic language with Head-driven Phrase Structure Grammar (HPSG) that generates the derivational forms of each type of Arabic verb. Based on a proposed type hierarchy, the derived forms are specified in Type Description Language (TDL) and validated with the Linguistic Knowledge Building (LKB) platform. The rules added to LKB generate the majority of the derivational forms and offer reliable results in a short execution time. As an example, we present in this paper the generation process of the derivational forms of an Arabic verb.

Keywords: HPSG grammar; LKB morphological analysis; type description language (TDL); derivational form; type hierarchy.

I. INTRODUCTION

The automatic construction of linguistic resources has always been a center of interest in the NLP domain, mainly with the HPSG (Head-Driven Phrase Structure Grammar) formalism.
Indeed, this process facilitates text analysis, an important task in the NLP domain, and especially morphological analysis. It allows the automatic treatment of the different types of forms, such as inflectional and derivational ones. Moreover, it makes it possible to transform an intensional lexicon into an extensional lexicon: it is sufficient to enter just the canonical form, and all the other forms (inflectional and derivational) are treated automatically. This automatic construction is also necessary for several applications such as human-machine dialogue and grammatical error correction. Despite this importance, the literature shows a lack of such resources for the Arabic language, especially at the lexical and morphological levels. Some works focused mainly on the syntactic aspect, while others treated some morphological forms of Arabic words. However, this treatment was generally incomplete, principally at the morphological level (inflectional and derivational). This is due to the many problems encountered in linguistic processing. Among these problems, the most frequent difficulty is finding an adequate classification of lexical entries. Indeed, inflection and derivation are grammatically different and require substantial treatment for each category of words (i.e., verb and noun), since each category possesses a diversity of regular and irregular forms. Beyond classification, a second problem concerns optimization: the automatic construction of the various linguistic resources must rely on a minimum number of rules. In this context, our first contribution is to find an adequate type hierarchy classifying the different derivational forms that an Arabic word can have. Based on the proposed type hierarchy, the second objective is to construct an Arabic HPSG grammar treating the derivational forms.
This formalism is based on a set of principles, essentially inheritance, which favors optimization during rule construction. Another contribution of the present work is to validate the proposed grammar with the LKB platform, which uses reliable algorithms that have generally been tested by experts. In this paper, we are interested in the derivational forms of Arabic verbs. We begin by presenting some previous works. Then, we describe the proposed type hierarchy for categorizing the Arabic verb. According to this hierarchy, we present the HPSG representation and the TDL specification of the elaborated Arabic grammar. Then, we evaluate this grammar with the LKB system. Finally, we close the paper with a conclusion and some perspectives.

II. PREVIOUS WORK

A study of previous work shows that few researchers have constructed Arabic grammars, especially for morphological phenomena. For the HPSG formalism, according to our literature review, there are some works on Arabic treating morphological phenomena, such as [1] and [2]. On the one hand, in [1] Islam proposes a new HPSG representation for Arabic nominals and various verb-derived nouns. He captures the morphology of the Arabic verbal noun by expanding the MORPH, SYN and SEM features. This work treats verbal nouns derived from triliteral non-sound Form I verbs and the analysis of verbal nouns based on quadriliteral verbs. On the other hand, in [2] Alqurashi proposes an HPSG analysis for simple and construct-state noun phrases in Modern Standard Arabic (MSA). He provides an account of the definite and indefinite affixes. He outlines three analyses within HPSG and finally adopts the third one. In fact, this analysis proposed that nouns appear in head-adjunct