82 Int. J. Metadata, Semantics and Ontologies, Vol. 12, Nos. 2/3, 2017
Copyright © 2017 Inderscience Enterprises Ltd.
Psychological named entity recognition from
psychological Arabic texts
Kheira Lakel* and Fatima Bendella
Department of Computer Science,
Faculty of Science,
USTO University,
BP 1505, 31000, Algeria
Email: kheira.lakel@univ-usto.dz
Email: fatima.bendella@univ-usto.dz
*Corresponding author
Abstract: One of the most important problems facing the Arabisation of modern science is
terminological inconsistency in translation. This problem becomes more complex in the
medical field, specifically in the psychological sciences, where the English–Arabic translation of
medical terms poses real challenges for researchers eager to analyse and organise this
information. Arabic NER (Named Entity Recognition) systems play a significant role in many
areas of Natural Language Processing (NLP). In this paper, the problem of PsyNER
(Psychological Named Entity Recognition) is tackled by integrating the rule-based and
machine-learning-based approaches into a hybrid approach, in an attempt to enhance the overall
performance of PsyNER. The system is capable of recognising eight types of named entities,
including mental disorders designated by the DSM-IV (Diagnostic and Statistical Manual of the
American Psychiatric Association).
Keywords: NERA; named entity recognition; psychological sciences; Arabic language; Jape;
gazetteers; GATE.
Reference to this paper should be made as follows: Lakel, K. and Bendella, F. (2017)
‘Psychological named entity recognition from psychological Arabic texts’, Int. J. Metadata,
Semantics and Ontologies, Vol. 12, Nos. 2/3, pp.82–89.
Biographical notes: Kheira Lakel is a PhD student at USTO University. Her PhD topic concerns
Ontologies, the Semantic Web and Natural Language Processing. She has published research papers
in national and international conference proceedings.
Fatima Bendella is a Professor at the Computer Science Department of USTO University
(Algeria). Her research interests are related to Multi-Agent Systems (MAS), Natural Language
Processing and Serious Games. She has published research papers in national and international
journals and conference proceedings, as well as book chapters.
1 Introduction
During the Islamic era in the 7th century, Arabic medicine and
pharmacology reached their peak, more specifically during
the Umayyad and Abbasid periods, when movements
of translation into Arabic flourished, followed by a period
of Arabic contributions. The history of Arabic medicine
extended from the 8th century, when Arab intellectuals
started to appear and multiple sciences began to emerge
eastward. This beacon of science remained there until the
beginning of the 13th century. The history of Arabic
medicine can be divided into three main stages: the age of
translation, the age of original Arabic contribution, and the
age of decline and transmission to Europe (Najjar, 2010).
Arabic is one of the six official languages of the United
Nations. It is spoken by 300 million people in the
23 countries of the Arab world. It is one of the
Semitic languages. Processing the Arabic language is
considered complex because of structural and morphological
characteristics such as inflection, polysemy and irregular
word forms. For various reasons, Arabic medicine today
suffers from terminological inconsistency, and several
attempts to establish Arabic as the official language in several
medical institutes have proved futile.
Further evidence of the lack of Arabic in the Semantic
Web world is a statistic provided by the OntoSelect
ontology library (Buitelaar and Eigner, 2008), which shows
that 71% of the ontologies with labels in the library are
created in English. This problem could be attributed to the
lack of tools and software development environments that
can process Arabic script in all steps of the semantic annotation
process.