ICI Bucharest © Copyright 2012-2021. All rights reserved. ISSN: 1220-1766 eISSN: 1841-429X

1. Introduction

Sentiment analysis has been an important natural language processing task in the last decade, with applications ranging from appraising sentiment in political speeches to identifying consumer attitudes in product reviews. Lately, researchers have focused on analyzing expressions of emotion in text, thus increasing the number of potential applications in fields like psychology and human-computer interaction. Usually, data sets used for the emotion detection task contain tweets or blog entries. However, there is another research avenue that might provide useful insight into emotion expression, and that is the study of emotions in literary texts. In this context, unsupervised learning offers various models which are effective for mining relevant patterns from text, including emotional patterns.

The use of literary texts as data has its advantages, among them the fact that some literary texts are freely available. Moreover, for popular works, there is a large body of literary analyses and of critic and audience reviews. The performance of computational models can therefore be assessed against these additional sources, which is valuable in complex tasks such as emotion studies. In addition, the exploration of emotion in literary texts can be relevant to broader studies of emotion, as literature very often encodes representations of complex emotional experiences, very similar to the ones in the real world (Hogan, 2010).

The purpose of this study is to investigate the association between thematic and emotional content in a corpus of Romanian poems authored by Mihai Eminescu using an unsupervised learning-based analysis. The main research question asked in this work is whether the considered literary topics can be characterized by specific emotional patterns in Mihai Eminescu’s poetry.
The poetic work of this author was selected because it is a point of reference in Romanian literature, and therefore any findings may be of interest to scholars from different domains. Hierarchical clustering is used, with lexicon-based emotional features engineered with the help of RoEmoLex, the Romanian Emotion Lexicon (Lupea & Briciu, 2019), to uncover emotional patterns in Mihai Eminescu’s poetry in an unsupervised manner. The results provide insight into the emotional patterns associated with thematic content. On the one hand, it is found that while the emotional and thematic perspectives are unquestionably tied together, neither allows a clear separation into homogeneous, disjoint groups. On the other hand, change in the poet’s outlook on the proposed topics is characterized by clear shifts in emotional content. The evolution of emotions identified through this computational method is also described by literary critics, which makes it the main finding of the present work. The novelty of this approach resides in using emotion-based features in a clustering approach

Studies in Informatics and Control, 30(1) 109-118, March 2021
https://doi.org/10.24846/v30i1y202110

Emotion-based Hierarchical Clustering of Romanian Poetry

Mihaiela LUPEA 1, Anamaria BRICIU 1*, Elena BOSTENARU 2

1 Faculty of Mathematics and Computer Science, Babeș-Bolyai University, 1 Mihail Kogălniceanu Street, Cluj-Napoca, 400084, Romania
lupea@cs.ubbcluj.ro, anamaria.briciu@cs.ubbcluj.ro (*Corresponding author)
2 Faculty of Letters, Babeș-Bolyai University, 31 Horea Street, Cluj-Napoca, 400202, Romania
elenabostenaru@yahoo.com

Abstract: Emotions play a central role in both writing and understanding literary works, and poetry is a genre rich in emotional content, vivid imagery and abstract language. This paper proposes a clustering-based approach to mine emotional patterns in Mihai Eminescu’s poetry in an unsupervised manner. Lexicon-based emotion features serve as input to the clustering algorithm.
Resulting clusters are assessed with regard to manually added characteristics of poems in the form of literary themes. There is a partial overlap between affective and thematic content, consistent with literary evaluations of the same works. Computational approaches have the advantage of being objective and replicable, with unsupervised techniques such as clustering representing a valuable tool in the exploration of literary works. Nonetheless, no specific emotional patterns, as determined by the proposed method, can be fully associated with particular literary themes.

Keywords: Emotion analysis, Unsupervised learning, Hierarchical clustering, Poetry.
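
To make the general idea of clustering poems on lexicon-based emotion features concrete, the following is a minimal sketch in Python and not the implementation used in this paper. Several things in it are assumptions made for illustration only: the toy English lexicon and its three emotion labels (these are placeholders, not RoEmoLex entries), the whitespace tokenizer, the choice of token-share features, and the use of average linkage with Euclidean distance. The helper names `emotion_features`, `average_linkage` and `agglomerative` are hypothetical.

```python
from itertools import combinations
from math import dist

# Hypothetical toy lexicon: word -> set of emotion labels.
# These entries are placeholders, NOT actual RoEmoLex content.
TOY_LEXICON = {
    "love": {"joy"},
    "star": {"joy"},
    "light": {"joy"},
    "tears": {"sadness"},
    "grave": {"sadness", "fear"},
    "night": {"fear"},
    "storm": {"fear"},
}
EMOTIONS = ("joy", "sadness", "fear")  # assumed label set for this sketch


def emotion_features(text):
    """Return, per emotion, the share of tokens the lexicon tags with it."""
    tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
    counts = dict.fromkeys(EMOTIONS, 0)
    for token in tokens:
        for emotion in TOY_LEXICON.get(token, ()):
            counts[emotion] += 1
    total = len(tokens) or 1  # avoid division by zero on empty text
    return tuple(counts[e] / total for e in EMOTIONS)


def average_linkage(a, b, vectors):
    """Mean Euclidean distance over all cross-cluster pairs of items."""
    pair_sum = sum(dist(vectors[i], vectors[j]) for i in a for j in b)
    return pair_sum / (len(a) * len(b))


def agglomerative(vectors, n_clusters):
    """Bottom-up clustering: merge the two closest clusters repeatedly."""
    clusters = [[i] for i in range(len(vectors))]  # start with singletons
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_linkage(pair[0], pair[1], vectors),
        )
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


# Four toy "poems" (placeholder strings, not real verses).
poems = [
    "love star light",
    "light love night",
    "tears grave night",
    "storm night tears",
]
features = [emotion_features(p) for p in poems]
clusters = agglomerative(features, n_clusters=2)
print(clusters)  # prints [[0, 1], [2, 3]]
```

On this toy input the two joy-dominated texts end up in one cluster and the two sadness- and fear-dominated texts in the other. Comparing such clusters with manually assigned theme labels would then correspond to the assessment step described in the abstract.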