Automated Identification of Metaphors in Annotated Corpus (Based on Substance Terms)

Olena Levchenko, Oleh Tyshchenko, and Marianna Dilai

Lviv Polytechnic National University, Bandera Str., 12, Lviv, 79000, Ukraine

Abstract
Automatic or automated metaphor identification remains a challenging problem. The methods proposed so far have mostly been developed for the English language and can be roughly divided into two groups: those intended for annotated corpora and those intended for non-annotated corpora. Neural networks are also used. It should also be noted that recently developed methods for measuring the degree of semantic association between collocation components (T-score, MI, logDice, etc.) fail to detect metaphorical expressions. Previously, we presented a method of automated identification of metaphorical expressions (adjective + noun) for non-annotated corpora of Ukrainian prose texts, based on the analysis of dictionary definitions. This paper describes a method of automated identification of metaphors in a semantically annotated corpus of texts. The algorithm is based on the theoretical propositions and readings of metaphor within the framework of Conceptual Metaphor Theory. The methodology includes an empirical stage at which structural-semantic models of metaphors are detected and classified according to the semantic category of the words in the right-hand position. A performance analysis and an evaluation of the method's effectiveness are presented.

Keywords
Metaphor, annotated corpus, substance nouns, automated identification of metaphor

1. Introduction

Automatic or automated metaphor identification remains a challenging problem. The methods introduced so far have mostly been developed for the English language and fall into three groups: those designed for semantically annotated, metaphorically annotated, and non-annotated corpora. A detailed analysis of the approaches used today is presented in [1, 2, 3, 4], among others.
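For orientation, the association measures named in the abstract (T-score, MI, logDice) can be sketched as below. The paper does not give its own formulas, so the definitions used here are the ones common in corpus linguistics and are an assumption; the function name `association_scores` and its parameters are hypothetical.

```python
import math


def association_scores(f_xy, f_x, f_y, n):
    """Return T-score, MI and logDice for one collocation.

    Assumed (not taken from the paper) definitions:
      expected = f_x * f_y / n
      T-score  = (f_xy - expected) / sqrt(f_xy)
      MI       = log2(f_xy / expected)
      logDice  = 14 + log2(2 * f_xy / (f_x + f_y))

    f_xy: co-occurrence frequency of the two words
    f_x, f_y: corpus frequencies of each word
    n: corpus size in tokens
    """
    if min(f_xy, f_x, f_y, n) <= 0:
        raise ValueError("all counts must be positive")
    expected = f_x * f_y / n  # co-occurrences expected under independence
    return {
        "t_score": (f_xy - expected) / math.sqrt(f_xy),
        "mi": math.log2(f_xy / expected),
        "log_dice": 14 + math.log2(2 * f_xy / (f_x + f_y)),
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(association_scores(f_xy=8, f_x=16, f_y=16, n=1024))
```

All three scores depend only on frequency counts, so they carry no information about whether a word pair is used literally or figuratively. This is consistent with the observation that such measures alone do not detect metaphorical expressions.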
It should be noted that different methods of automated metaphor identification are based on different theoretical readings of metaphor; however, the most recent approaches are grounded in Conceptual Metaphor Theory [5, 6]. Given the various interpretations of metaphor, researchers use different terminology: promising metaphorical words [1]; aspect words and abstractness of the aspect words [7, 8]; and others. The VUAMC corpus is an example of a metaphorically annotated corpus of the English language, annotated using the MIPVU methodology (Metaphor Identification Procedure Vrije Universiteit) [9]. This technique involves establishing the basic meaning of a word and then determining the degree of contrast between its basic and contextual meanings. To reduce subjectivity, two or more annotators take part in the procedure and must reach an agreed decision [9].

COLINS-2021: 5th International Conference on Computational Linguistics and Intelligent Systems, April 22-23, 2021, Kharkiv, Ukraine
EMAIL: levchenko.olena@gmail.com (Olena Levchenko); olkotiszczenko@gmail.com (Oleh Tyshchenko); mariannadilai@gmail.com (Marianna Dilai)
ORCID: 0000-0002-7395-3772 (Olena Levchenko); 0000-0001-7255-2742 (Oleh Tyshchenko); 0000-0001-5182-9220 (Marianna Dilai)
© 2021 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).
CEUR Workshop Proceedings (CEUR-WS.org)

Previously, we developed a method of automated identification of metaphorical expressions (adjective + noun) for non-annotated corpora of Ukrainian prose texts, based on the analysis of dictionary definitions [10]. It has been successfully applied in a number of studies [11, 12], which