ICI Bucharest © Copyright 2012-2021. All rights reserved
ISSN: 1220-1766 eISSN: 1841-429X
1. Introduction
Sentiment analysis has been an important natural
language processing task in the last decade, with
applications ranging from appraising sentiment
in political speeches to identifying consumer
attitudes in product reviews. Lately, researchers
have focused on analyzing expressions of emotion
in text, thus increasing the number of potential
applications in fields like psychology and human-computer
interaction. Usually, data sets used for
the emotion detection task contain tweets or blog
entries. However, there is another research avenue
that might provide useful insight into emotion
expression, and that is the study of emotions
in literary texts. In this context, unsupervised
learning offers various models that are effective
for mining relevant patterns from text, including
emotional patterns.
The use of literary texts as data has its advantages,
one of which is that some literary texts are
freely available. Moreover, for popular works,
there are vast amounts of literary analyses and of critic
and audience reviews. Therefore, the performance of
computational models can be assessed against these
additional sources, which is valuable in complex
tasks such as emotion studies. In addition, the
exploration of emotion in literary texts can be
relevant to broader studies of emotion, as literature
very often encodes representations of complex
emotional experiences, very similar to the ones
in the real world (Hogan, 2010).
The purpose of this study is to investigate the
association between thematic and emotional
content in a corpus of Romanian poems authored
by Mihai Eminescu using an unsupervised
learning-based analysis. The main research
question asked in this work is whether the
considered literary topics can be characterized by
specific emotional patterns in Mihai Eminescu’s
poetry. The poetic work of this author was
selected because it is a point of reference in
Romanian literature, and therefore any findings
may be of interest to scholars in different domains.
Hierarchical clustering is used, with lexicon-based
emotional features engineered with the
help of RoEmoLex, the Romanian Emotion Lexicon
(Lupea & Briciu, 2019) to uncover emotional
patterns in Mihai Eminescu’s poetry in an
unsupervised manner. Results obtained provide
insight into the emotional patterns associated with
thematic content. On the one hand, it is found that
while the emotional and thematic perspectives are
unquestionably tied together, neither allows a clear
separation into homogeneous, disjoint groups. On
the other hand, change in the poet’s outlook on
the proposed topics is characterized by clear shifts
in emotional content. The evolution of emotions
identified through this computational method is
also described by literary critics, which makes it
the main finding of the present work.
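The pipeline outlined above (lexicon-based emotion features followed by hierarchical clustering) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the toy lexicon and its emotion labels are hypothetical and are not taken from RoEmoLex, the four "poems" are invented word lists, and the choice of Euclidean distance with average linkage is only one plausible configuration, not necessarily the one used in this study.

```python
from itertools import combinations
from math import dist

# Hypothetical toy lexicon (word -> emotions); NOT actual RoEmoLex entries.
TOY_LEXICON = {
    "dor": {"sadness"},
    "lacrimi": {"sadness"},
    "moarte": {"sadness", "fear"},
    "iubire": {"joy"},
    "floare": {"joy"},
    "lumina": {"joy"},
}
EMOTIONS = ("joy", "sadness", "fear")


def emotion_features(text):
    """Return, per emotion, the share of tokens tagged with it in the lexicon."""
    tokens = text.lower().split()
    counts = dict.fromkeys(EMOTIONS, 0)
    for token in tokens:
        for emotion in TOY_LEXICON.get(token, ()):
            counts[emotion] += 1
    total = max(len(tokens), 1)  # avoid division by zero on empty input
    return [counts[e] / total for e in EMOTIONS]


def average_linkage(a, b, vectors):
    """Mean Euclidean distance over all cross-cluster pairs (assumed linkage)."""
    return sum(dist(vectors[i], vectors[j]) for i in a for j in b) / (len(a) * len(b))


def agglomerative(vectors, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], vectors),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Invented word lists standing in for poems.
poems = [
    "iubire floare lumina",
    "floare iubire",
    "moarte lacrimi dor",
    "dor lacrimi",
]
vectors = [emotion_features(p) for p in poems]
clusters = agglomerative(vectors, n_clusters=2)
print(clusters)
```

In this toy setting the two joy-dominated texts and the two sadness-dominated texts end up in separate clusters. With real poems, the resulting groups would then be compared against manually assigned literary themes, as described above.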
The novelty of this approach resides in using
emotion-based features in a clustering approach
Studies in Informatics and Control, 30(1) 109-118, March 2021
https://doi.org/10.24846/v30i1y202110
Emotion-based Hierarchical Clustering
of Romanian Poetry
Mihaiela LUPEA¹, Anamaria BRICIU¹*, Elena BOSTENARU²
¹ Faculty of Mathematics and Computer Science, Babeș-Bolyai University,
1 Mihail Kogălniceanu Street, Cluj-Napoca, 400084, Romania
lupea@cs.ubbcluj.ro, anamaria.briciu@cs.ubbcluj.ro (*Corresponding author)
² Faculty of Letters, Babeș-Bolyai University, 31 Horea Street, Cluj-Napoca, 400202, Romania
elenabostenaru@yahoo.com
Abstract: Emotions play a central role in both writing and understanding literary works, and poetry is a genre rich in
emotional content, vivid imagery and abstract language. This paper proposes an unsupervised, clustering-based approach to
mining emotional patterns in Mihai Eminescu’s poetry. Lexicon-based emotion features serve as input to the clustering algorithm.
Resulting clusters are assessed with regard to manually added characteristics of poems in the form of literary themes.
There is a partial overlap between affective and thematic content, consistent with literary evaluations of the same works.
Computational approaches have the advantage of being objective and replicable, with unsupervised techniques such as
clustering representing a valuable tool in the exploration of literary works. Nonetheless, no specific emotional patterns, as
determined by the proposed method, can be fully associated with particular literary themes.
Keywords: Emotion analysis, Unsupervised learning, Hierarchical clustering, Poetry.