SiSP: Japanese Situation-dependent Sentiment Polarity Dictionary

Atsushi Takada
a_takada@hal.t.u-tokyo.ac.jp
Dept. of Info. and Comm. Eng., The University of Tokyo
Tokyo, Japan

Yoshinobu Kano
kano@inf.shizuoka.ac.jp
Fac. of Info., Shizuoka University
Shizuoka, Japan

Toshihiko Yamasaki
yamasaki@cvm.t.u-tokyo.ac.jp
Dept. of Info. and Comm. Eng., The University of Tokyo
Tokyo, Japan

ABSTRACT

To deal with the variety of meanings and contexts of words, we created a Japanese Situation-dependent Sentiment Polarity Dictionary (SiSP), in which sentiment values are labeled for 20 different situations. The dictionary covers 25,520 Japanese words annotated by crowdworkers, with 10 responses for each situation of each word. Using SiSP, we predicted the polarity of each word in the dictionary, and the polarity of dictionary words appearing in sentences, taking the context into account. In both experiments, situation-dependent prediction showed superior results in determining emotional polarity.

CCS CONCEPTS

Computing methodologies → Language resources.

KEYWORDS

Datasets, Sentiment, Dictionary, Situation

ACM Reference Format:
Atsushi Takada, Yoshinobu Kano, and Toshihiko Yamasaki. 2022. SiSP: Japanese Situation-dependent Sentiment Polarity Dictionary. In Proceedings of the 2022 International Joint Workshop on Multimedia Artworks Analysis and Attractiveness Computing in Multimedia (MMArt-ACM ’22), June 27–30, 2022, Newark, NJ, USA. ACM, New York, NY, USA, 6 pages. https://doi.org/10.1145/3512730.3533716

1 INTRODUCTION

Understanding human emotions from facial images, voice, text, and other signals is becoming very important in both academia and industry. Emotion polarity dictionaries are used to analyze emotions in text. Most existing emotion polarity dictionaries label each word simply as positive or negative, or only classify words into a number of categories.
However, even a single word can have many different meanings and give a different impression when used in different contexts and situations. For example, the word ‘fast’ can be positive when it describes a racing car, but negative when you are walking with a friend and want to complain that the friend is walking too fast. Many current emotion polarity dictionaries have only a single label per word and cannot handle such a variety of situations and meanings. Meanwhile, emotion polarity dictionaries that consider various categories are annotated only with class labels and ignore the strength of the emotion polarity of words in each category.

In this study, we developed a Situation-dependent Sentiment Polarity Dictionary (SiSP) with individual numerical labels for 20 different situations. To the best of our knowledge, SiSP is the first situation-dependent sentiment polarity dictionary. We will make it open source upon acceptance. In addition, we demonstrate baseline performance for word polarity prediction in two scenarios: for an individual word, and for a word in context.

© 2022 Association for Computing Machinery. ACM ISBN 978-1-4503-9240-2/22/06.
2 RELATED WORKS

2.1 Sentiment lexicon

Most sentiment lexicons are lists of words labeled in a positive or negative direction. They are often created manually due to the subjective nature of sentiment labels. Linguistic Inquiry and Word Count (LIWC) [9] is a dictionary of over 6,000 words classified into 125 categories. This dictionary has been used to extract political sentiments from tweets and to predict the onset of depression from SNS text. The Affective Norms for English Words (ANEW) lexicon [3] consists of 1,024 English words labeled from 1 to 9 in terms of the Valence-Arousal-Dominance (VAD) model. SentiWordNet [5][1] is an extension of WordNet [8] that scores words on a scale of 0.0 to 1.0 for positive, negative, and neutral, normalized so that the three scores sum to 1. SentiWordNet is labeled in a semi-supervised manner; many words are classified as neutral, with no polarity, and the labels contain a very high level of noise. The SiSP created in this study has a numerical value from 0 to 1 for each of the 20 different situations, with labels of positive, negative, neutral (between positive and negative), irrelevant (the word has nothing to do with the situation), and unintelligible.

2.2 Named Entity Recognition

Named Entity Recognition (NER) is the task of extracting unique expressions contained in sentences. It extracts named entities from sentences and classifies them into proper nouns, such as names of people, organizations, and places, and predefined expressions, such as dates, time expressions, quantities, and amounts. Within each entity, a distinction is made between beginning (B) for the first token and inside (I) for the tokens that follow. Tokens that do not belong to any entity are assigned outside (O). This scheme is called BIO notation. For example, in the sentence ‘Mark Watney visited Mars’, if the person tag is ‘Person’ and the location tag is ‘Location’, Mark is a B-Person, Watney is an I-Person, visited is an O because it does not belong to any entity, and Mars is a B-Location.
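The BIO scheme can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `bio_tags` and the representation of entities as (start, end, label) token spans with an exclusive end index are assumptions made purely for illustration. The first token of each entity receives a B- tag, the remaining tokens of that entity receive I- tags, and all other tokens receive O.

```python
from typing import List, Tuple


def bio_tags(tokens: List[str], entities: List[Tuple[int, int, str]]) -> List[str]:
    """Assign a BIO tag to every token.

    `entities` holds hypothetical (start, end, label) spans over token
    indices, where `end` is exclusive.
    """
    tags = ["O"] * len(tokens)  # tokens outside any entity stay "O"
    for start, end, label in entities:
        tags[start] = f"B-{label}"  # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"  # subsequent tokens of the same entity
    return tags


tokens = ["Mark", "Watney", "visited", "Mars"]
entities = [(0, 2, "Person"), (3, 4, "Location")]
tags = bio_tags(tokens, entities)
for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

For the example sentence, this yields B-Person, I-Person, O, and B-Location for the four tokens, in that order.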
Some tasks classify place names into detailed locations such as cities, states, countries, etc., and some