Journal of Information Technology Research, 5(4), 85-98, October-December 2012
Copyright © 2012, IGI Global. Copying or distributing in print or electronic forms without written permission of IGI Global is prohibited.
Keywords: Formal Concept Analysis, Human Machine Interface, Morpheme Analysis, Natural Language
Processing, Parser Modules
INTRODUCTION
Natural Language Processing (NLP) is an active
area within human-machine interface development.
Processing input sentences given in human
language, or generating sentences in human
language, is still a challenging task in the
IT world. NLP has many problem areas in which
no standard solution covers every related
task. The input sentences are
processed in many different phases, where the
usual process includes tokenization, cleaning,
morpheme analysis, sentence analysis, semantic
graph construction and sentence interpretation.
The goal of the morpheme analysis module is to
determine the stem of the word and to determine
the grammatical role of the word within the
sentence. The stem can be used to determine
the concept related to the given word. Using
an external ontology, domain-specific and
universal knowledge elements can be extracted
from the related knowledge base. Ontology
databases usually contain information on
specific relationships between concepts, such as
specialization, generalization, synonymy and
specific application. The grammatical role of
words can be encoded in many ways. In some
languages, the position of the word conveys the
grammatical role. In some other languages, there
is no dominant word order, so other formal
elements, such as suffixes or prefixes, are used to
describe the role of the word. As a word may
have several grammatical and semantic roles
at the same time, several suffix or prefix parts
can be attached to the stem word. The main
goal of the morpheme analyzer module is to
Classification Method for
Learning Morpheme Analysis
László Kovács, Department of Information Technology, University of Miskolc, Miskolc City,
Hungary
ABSTRACT
The morpheme analysis module is an important component in natural language processing engines. The
parser modules are usually based on rule systems created by human experts. In this paper, a novel approach
to implementing the morpheme analyzer module is tested. The proposed structure is based on the theory
of formal concept analysis. Word inflection can be considered as a classification problem, where the class
label denotes the corresponding transformation rule. The main benefit of the proposed method is the efficient
generalization feature. The proposed morpheme analyzer module was implemented in a prototype question
generation application.
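To illustrate what it means to treat inflection as a classification problem, the following minimal Python sketch derives a transformation rule (characters to strip, characters to add) from each training pair and uses it as the class label. It is not the formal concept analysis structure proposed in the paper: a plain suffix-lookup table stands in for the classifier. The English word pairs and all names (derive_rule, apply_rule, SuffixRuleClassifier, max_suffix_len) are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def derive_rule(inflected, stem):
    """Return the rule (strip, add) that rewrites `inflected` into `stem`."""
    i = 0
    while i < min(len(inflected), len(stem)) and inflected[i] == stem[i]:
        i += 1
    return inflected[i:], stem[i:]


def apply_rule(word, rule):
    """Strip the rule's suffix from `word` and append its replacement."""
    strip, add = rule
    if not word.endswith(strip):
        raise ValueError(f"rule {rule!r} does not apply to {word!r}")
    return word[:len(word) - len(strip)] + add


class SuffixRuleClassifier:
    """Toy classifier: features are word endings, labels are rules."""

    def __init__(self, max_suffix_len=4):
        self.max_suffix_len = max_suffix_len
        self.votes = defaultdict(Counter)  # word ending -> rule counts

    def fit(self, pairs):
        for inflected, stem in pairs:
            rule = derive_rule(inflected, stem)
            longest = min(self.max_suffix_len, len(inflected))
            for k in range(1, longest + 1):
                self.votes[inflected[-k:]][rule] += 1
        return self

    def predict_rule(self, word):
        # Prefer the longest known ending; this is where generalization
        # to unseen words comes from in this sketch.
        for k in range(min(self.max_suffix_len, len(word)), 0, -1):
            counter = self.votes.get(word[-k:])
            if not counter:
                continue
            # Keep only rules whose stripped suffix really ends the word.
            usable = [(n, r) for r, n in counter.items() if word.endswith(r[0])]
            if usable:
                return max(usable, key=lambda item: item[0])[1]
        return ("", "")  # unknown ending: leave the word unchanged

    def stem(self, word):
        return apply_rule(word, self.predict_rule(word))


# Hypothetical training data: (inflected form, stem).
TRAINING_PAIRS = [
    ("cats", "cat"), ("dogs", "dog"),
    ("cities", "city"), ("parties", "party"),
    ("walked", "walk"), ("jumped", "jump"),
]

clf = SuffixRuleClassifier().fit(TRAINING_PAIRS)
print(clf.stem("ladies"))  # lady
print(clf.stem("talked"))  # talk
print(clf.stem("birds"))   # bird
```

None of the three test words occurs in the training pairs; each is stemmed by reusing the rule learned for the longest matching ending, which is the kind of generalization the abstract refers to, here in a deliberately simplified form.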
DOI: 10.4018/jitr.2012100106