SSRG International Journal of Electrical and Electronics Engineering Volume 11 Issue 12, 375-385, December 2024
ISSN: 2348-8379/ https://doi.org/10.14445/23488379/IJEEE-V11I12P134 © 2024 Seventh Sense Research Group
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Original Article
Morphological and Syntactic Challenges in Malayalam:
A Dependency Parsing Perspective
P.V. Ajusha¹, A.P. Ajees²
¹School of Information Science and Technology, Kannur University, Kerala, India.
²Department of Computer Science, Cochin University of Science and Technology, Kerala, India.
¹Corresponding Author : ajusha321@outlook.com
Received: 19 October 2024 Revised: 20 November 2024 Accepted: 18 December 2024 Published: 31 December 2024
Abstract – Natural language processing is the area of study that focuses on how computers and human languages interact; machine translation, sentiment analysis, semantic analysis, and text analysis are a few of its applications. A key component of natural language processing is morphological analysis, which breaks words into their constituent morphemes to determine their structure and meaning. Dependency parsing algorithms use morphological information to determine the syntactic structure of a sentence. This study evaluates the performance of various parsers, including the Turbo parser, Lys-FASTPARSE, the UU parser, and a neural network-based parser, to analyse the dependency parsing methodologies used for the Malayalam language, examining how effectively each parser handles the extensive morphological and syntactic features of Malayalam. Among these parsers, Lys-FASTPARSE performs best in LAS F1 score, MLAS score, and BLEX score, maintaining values of 56.60 and 48.58 before and after optimization. The neural network parser shows minor improvements in
unlabelled attachment scores from 0.72 to 0.73 and labelled attachment scores from 0.46 to 0.47. With an LAS of 66.89% and
UAS of 87.12%, the Turbo parser shows the strongest baseline performance. The precision of 98.81% and recall of 88.42% in binned HEAD directions of the UU parser demonstrates its strength in handling right-direction dependencies. Although lower, the parser's performance on left-direction and root dependencies still reflects its ability to navigate complex syntactic
structures effectively. The results underscore the significance of tailored parsing techniques for morphologically rich languages
like Malayalam and provide insights into optimizing parser performance for improved syntactic analysis.
Keywords - Neural network-based parser, Dependency parsing, Lys-FASTPARSE, UU parser, Transition-based parsing.
1. Introduction
Malayalam belongs to the South Dravidian language
family and is an agglutinative language with rich inflectional
morphology. Dependency parsing is a fundamental task in
Natural Language Processing (NLP) that involves analyzing
the grammatical structure of a sentence by identifying the
relationships between words. In dependency parsing, the
syntactic structure of a sentence is represented as a tree where
each word is connected to a "head" word, forming a directed
relationship known as a dependency [1]. The goal is to
determine which words depend on others and the nature of
these dependencies, such as subject-verb or object-verb
relationships. The resulting dependency tree provides a
compact and informative representation of the sentence's
syntactic structure. Unlike phrase structure parsing, which
represents sentence structure using nested phrases,
dependency parsing focuses on binary relations between
words. This makes it particularly useful for free or flexible
word order languages, where dependencies between words are
more informative than their linear sequence [2]. Dependency
parse trees can be divided into projective and non-projective
trees [3]. Figure 1 illustrates the structure of a sentence as
represented by a dependency graph in projective dependency
parsing. In projective trees, the edges do not cross each other, and a word together with its dependents forms a contiguous substring of the sentence; in non-projective trees, crossing edges occur. Non-projective transition-based parsing has been
actively explored in the last decade. Figure 2 depicts a
dependency graph in non-projective dependency parsing,
where syntactic dependencies between words can cross over
each other, reflecting more complex sentence structures.
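A dependency graph like those in Figures 1 and 2 can be stored compactly by recording, for each word, the index of its head and the relation label. The sketch below is purely illustrative, using a short English stand-in sentence rather than Malayalam data:

```python
# A dependency tree stored as parallel lists: heads[i] gives the
# 1-based index of the head of word i+1, with 0 marking the root;
# labels[i] names the dependency relation.
words  = ["She", "reads", "books"]
heads  = [2, 0, 2]            # "She" <- "reads" -> "books"
labels = ["nsubj", "root", "obj"]

for word, head, label in zip(words, heads, labels):
    head_word = "ROOT" if head == 0 else words[head - 1]
    print(f"{word} --{label}--> {head_word}")
```

Printing the arcs in this form ("She --nsubj--> reads", and so on) makes the head-dependent relationships explicit without any nested phrase structure.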
The success of neural networks and word embeddings
for projective dependency parsing also encouraged research
on neural non-projective models [4]. In a projective dependency
tree, every subtree's yield is a contiguous sentence substring.
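This projectivity condition can be checked directly from the head indices: a tree is projective exactly when no two dependency arcs (including the arc from the artificial root) cross. A minimal sketch, using a hypothetical helper not drawn from the paper:

```python
def is_projective(heads):
    """Check whether a dependency tree is projective.

    heads[i] is the 1-based head of token i+1; 0 marks the root.
    The tree is projective iff no two arcs cross.
    """
    # Each arc is stored as an (left, right) endpoint pair.
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for l1, r1 in arcs:
        for l2, r2 in arcs:
            # Two arcs cross when exactly one endpoint of the second
            # arc lies strictly inside the span of the first.
            if l1 < l2 < r1 < r2:
                return False
    return True

print(is_projective([2, 3, 0]))     # chain 1 <- 2 <- 3: projective
print(is_projective([0, 4, 1, 1]))  # arcs (1,3) and (2,4) cross
```

The O(n²) pairwise check is adequate for illustration; treebank tooling typically uses an equivalent linear-time test.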
Identifying tagging issues and annotation problems is closely connected to dependency parsing in several ways.
These are crucial for improving the accuracy and efficiency of
parsing systems. Dependency parsing relies heavily on
accurate Part-of-Speech (POS) tags to determine the syntactic
structure of sentences [5].
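This reliance on POS tags is visible in the CoNLL-U format commonly used by dependency treebanks, where each token carries a POS column alongside its head and relation. A minimal reader sketch is shown below; the sample sentence is an English stand-in for illustration, not taken from the study's data:

```python
# Minimal reader for the 10-column CoNLL-U dependency format
# (illustrative sketch; handles only plain token lines).
sample = """\
1\tShe\t_\tPRON\t_\t_\t2\tnsubj\t_\t_
2\treads\t_\tVERB\t_\t_\t0\troot\t_\t_
3\tbooks\t_\tNOUN\t_\t_\t2\tobj\t_\t_
"""

def read_conllu(text):
    tokens = []
    for line in text.strip().splitlines():
        cols = line.split("\t")
        tokens.append({
            "id": int(cols[0]),
            "form": cols[1],
            "upos": cols[3],      # the POS tag the parser relies on
            "head": int(cols[6]),
            "deprel": cols[7],
        })
    return tokens

for tok in read_conllu(sample):
    print(tok["form"], tok["upos"], tok["head"], tok["deprel"])
```

An incorrect UPOS value in column 4 propagates directly into the parser's feature representation, which is one reason tagging errors and annotation inconsistencies degrade attachment scores.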