Proceedings of the Second Workshop on Figurative Language Processing, pages 116–125, July 9, 2020. © 2020 Association for Computational Linguistics. https://doi.org/10.18653/v1/P17

Character-aware Models with Similarity Learning for Metaphor Detection

Tarun Kumar
Department of CSIS, Birla Institute of Technology and Science, Pilani, India
f2016005@pilani.bits-pilani.ac.in

Yashvardhan Sharma
Department of CSIS, Birla Institute of Technology and Science, Pilani, India
yash@pilani.bits-pilani.ac.in

Abstract

Recent work on automatic sequential metaphor detection has involved recurrent neural networks initialized with different pre-trained word embeddings, sometimes combined with hand-engineered features. To capture lexical and orthographic information automatically, in this paper we propose to add a character-based word representation. In addition, to contrast the difference between literal and contextual meaning, we utilize a similarity network. We explore these components via two different architectures, a BiLSTM model and a Transformer Encoder model similar to BERT, to perform metaphor identification. We participate in the Second Shared Task on Metaphor Detection on both the VUA and TOEFL datasets with the above models. The experimental results demonstrate the effectiveness of our method, as it outperforms all the systems that participated in the previous shared task.

1 Introduction

Metaphors are an inherent component of natural language and enrich our day-to-day communication in both verbal and written forms. A metaphoric expression involves the use of one domain or concept to explain or represent another concept (Lakoff and Johnson, 1980). Detecting metaphors is a crucial step in interpreting semantic information and thus in building better representations for natural language understanding (Shutova and Teufel, 2010).
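As a toy illustration (not from the paper), sequential metaphor detection can be framed as assigning a binary label to every token in a sentence, with 1 marking metaphorical usage and 0 literal usage:

```python
# Toy framing of metaphor detection as per-token sequence labeling.
# The sentence and labels are illustrative, not taken from any dataset.
tokens = ["He", "is", "breaking", "the", "habit"]
labels = [0, 0, 1, 0, 0]  # "breaking" is used metaphorically here

def tag(tokens, labels):
    """Pair each token with its metaphoricity label, as a sequence tagger would."""
    return list(zip(tokens, labels))

tagged = tag(tokens, labels)
```

A trained model would predict the `labels` sequence from the `tokens` alone.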
This is beneficial for applications that require inferring the literal or metaphorical usage of words, such as information extraction, conversational systems and sentiment analysis (Tsvetkov et al., 2014).

The detection of metaphorical usage is not a trivial task. For example, in phrases such as breaking the habit and absorption of knowledge, the words breaking and absorption are used metaphorically to mean to destroy/end and to understand/learn, respectively. In the phrase All the world's a stage, the world (abstract) has been portrayed in a more concrete (stage) sense. Thus, computational approaches to metaphor identification need to exploit world knowledge, context and domain understanding (Tsvetkov et al., 2014).

A number of approaches to metaphor detection have been proposed in the last decade. Many of them use explicit hand-engineered lexical and syntactic information (Hovy et al., 2013; Klebanov et al., 2016), higher-level features such as concreteness scores (Turney et al., 2011; Köper and Schulte im Walde, 2017) and WordNet supersenses (Tsvetkov et al., 2014). More recent methods have modeled metaphor detection as a sequence labeling task, and hence have used BiLSTMs (Graves and Schmidhuber, 2005) in different ways (Wu et al., 2018; Gao et al., 2018; Mao et al., 2019; Bizzoni and Ghanimifard, 2018).

In this paper, we use the concatenation of GloVe (Pennington et al., 2014) and ELMo (Peters et al., 2018) vectors augmented with character-level features obtained via a CNN and a highway network (Kim et al., 2016; Srivastava et al., 2015). Such a method of combining pre-trained embeddings with character-level representations has previously been used in several sequence tagging tasks: part-of-speech (POS) tagging (Ma and Hovy, 2016), named entity recognition (NER) (Chiu and Nichols, 2016), question answering (Seo et al., 2016) and multi-task learning (Sanh et al., 2019). This inspires us to explore a similar setting for metaphor identification as well.
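A minimal NumPy sketch of the general character-CNN-plus-highway recipe of Kim et al. (2016) and Srivastava et al. (2015) follows. All sizes, the filter widths, the random initialization and the `ord`-based character indexing are illustrative assumptions, not the paper's actual configuration: characters are embedded, convolved with filters of several widths, max-pooled over time, and passed through one highway layer.

```python
import numpy as np

rng = np.random.default_rng(0)

CHAR_VOCAB = 64   # assumed character-vocabulary size (toy value)
CHAR_DIM = 16     # assumed character embedding size
FILTERS = 32      # assumed feature maps per filter width
WIDTHS = (2, 3)   # assumed convolution filter widths

char_emb = rng.normal(scale=0.1, size=(CHAR_VOCAB, CHAR_DIM))
# One weight matrix per filter width, applied to flattened character windows.
conv_w = {w: rng.normal(scale=0.1, size=(FILTERS, w * CHAR_DIM)) for w in WIDTHS}

OUT_DIM = FILTERS * len(WIDTHS)
# Highway layer parameters: transform gate t and candidate h.
W_h = rng.normal(scale=0.1, size=(OUT_DIM, OUT_DIM))
W_t = rng.normal(scale=0.1, size=(OUT_DIM, OUT_DIM))
b_t = np.full(OUT_DIM, -2.0)  # negative bias favors the carry path initially

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def char_word_vector(word):
    """CNN-over-characters word representation followed by one highway layer."""
    ids = [ord(c) % CHAR_VOCAB for c in word]  # toy character indexing
    chars = char_emb[ids]                      # (len(word), CHAR_DIM)
    feats = []
    for w, W in conv_w.items():
        # All windows of width w, flattened, convolved, then max-over-time pooled.
        windows = np.stack([chars[i:i + w].ravel()
                            for i in range(len(ids) - w + 1)])
        feats.append(np.tanh(windows @ W.T).max(axis=0))
    y = np.concatenate(feats)                  # (OUT_DIM,)
    t = sigmoid(W_t @ y + b_t)                 # transform gate
    h = np.tanh(W_h @ y)
    return t * h + (1.0 - t) * y               # highway output

vec = char_word_vector("metaphor")
```

The resulting `vec` would be concatenated with the word's GloVe and ELMo vectors before being fed to the sequence model; handling of words shorter than the largest filter width is omitted for brevity.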
We propose two models for metaphor detection [1] with the input prepared as above: a vanilla BiLSTM model and a vanilla Transformer Encoder (Vaswani et al., 2017) model similar to BERT (Devlin et al., 2019), but without pre-training. To contrast the difference between a word's literal and contextual representation (Mao et al., 2019) con-

[1] Our code is available at: https://github.com/Kumar-Tarun/metaphor-detection
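One common way to operationalize this contrast, sketched here under assumed feature choices (cosine similarity plus element-wise difference; the paper's exact similarity network may differ), is to compare a word's context-independent vector (e.g., GloVe) with its in-context vector (e.g., an ELMo or BiLSTM hidden state):

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity with a small epsilon for numerical safety."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def similarity_features(literal_vec, contextual_vec):
    """Features contrasting a word's literal and in-context representations:
    cosine similarity plus the element-wise difference, to be concatenated
    with the word's input representation before classification."""
    return np.concatenate([[cosine(literal_vec, contextual_vec)],
                           literal_vec - contextual_vec])

lit = np.array([1.0, 0.0, 0.0])  # stand-in for a literal (GloVe-like) vector
ctx = np.array([0.0, 1.0, 0.0])  # stand-in for a contextual vector
feats = similarity_features(lit, ctx)
```

A low similarity between the two representations signals that the word's in-context meaning departs from its literal one, which is a useful cue for metaphoricity.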