2020 IEEE International Conference on Big Data (Big Data)
978-1-7281-6251-5/20/$31.00 ©2020 IEEE 3099
Interpretation of Sentiment Analysis with
Human-in-the-Loop
Vijaya Kumari Yeruva
Dept. of CSEE
University of Missouri-Kansas City
Kansas City, USA
vyq4b@mail.umkc.edu
Mayanka Chandrashekar
Dept. of CSEE
University of Missouri-Kansas City
Kansas City, USA
mckw9@mail.umkc.edu
Yugyung Lee
Dept. of CSEE
University of Missouri-Kansas City
Kansas City, USA
leeyu@umkc.edu
Jeff Rydberg-Cox
Dept. of English
University of Missouri-Kansas City
Kansas City, USA
rydbergcoxj@umkc.edu
Virginia Blanton
Dept. of English
University of Missouri-Kansas City
Kansas City, USA
blantonv@umkc.edu
Nathan A Oyler
Dept. of Chemistry
University of Missouri-Kansas City
Kansas City, USA
oylern@umkc.edu
Abstract—Human-in-the-Loop has been receiving special
attention from the data science and machine learning community.
It is essential to realize the advantages of human feedback and
the pressing need for manual annotation to improve machine
learning performance. Recent advancements in natural language
processing (NLP) and machine learning have created unique
challenges and opportunities for digital humanities research. In
particular, there are ample opportunities for NLP and machine
learning researchers to analyze data from literary texts and use
these complex source texts to broaden our understanding of
human sentiment using the human-in-the-loop approach. This
paper presents our understanding of how human annotators
differ from machine annotators in sentiment analysis tasks and
how these differences can inform the design of human-in-the-loop
systems for sentiment analysis in complex, unstructured
texts. We further explore the challenges and benefits of the
human-machine collaboration for sentiment analysis using a case
study in Greek tragedy and address some open questions about
collaborative annotation for sentiments in literary texts. We focus
primarily on (i) an analysis of the challenges in sentiment analysis
tasks for humans and machines, and (ii) whether consistent
annotation results are generated from multiple human annotators
and multiple machine annotators. For human annotators, we
have used a survey-based approach with about 60 college
students. We have selected six popular sentiment analysis tools
for machine annotators, including VADER, CoreNLP’s sentiment
annotator, TextBlob, LIME, GloVe+LSTM, and RoBERTa. We
have conducted a qualitative and quantitative evaluation with
the human-in-the-loop approach and confirmed our observations
on sentiment tasks using the Greek tragedy case study.
Index Terms—Human-in-the-Loop, Natural Language
Processing (NLP), Sentiment Analysis, Greek Tragedy, Machine
and Human Annotations, Interactive Machine Learning
I. INTRODUCTION
Human-in-the-Loop has been receiving special attention
from the data science and machine learning community [1], [2].
It is essential to realize the advantages of human
feedback and the pressing need for manual annotation to
improve machine learning performance. The emergence of
these human-in-the-loop methodologies has created interesting
new opportunities for digital humanities research. In particular,
there are ample opportunities for NLP and machine learning
researchers to analyze data from literary texts and use these
complex sources to broaden our understanding of human
sentiment using the human-in-the-loop approach.
Storytelling and literary texts are built around formal genre
conventions that are essential for effective communication
[3]. This emphasizes the importance of understanding the
conventions of literature and its broader cultural contexts.
Human-in-the-loop workflows allow us to iteratively
incorporate human feedback that accounts for these
considerations, improving the ability of computational tools to
interpret a broader range of texts.
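As a rough illustration of such a workflow, the sketch below keeps confident machine labels and routes uncertain items to a human annotator whose answers can later serve as training data. The `predict` and `ask_human` functions, the confidence threshold, and the toy inputs are hypothetical stand-ins for this paper's actual pipeline, not part of it.

```python
import random

def hitl_round(items, predict, ask_human, threshold=0.7):
    """One human-in-the-loop pass: accept confident machine labels,
    defer low-confidence items to a human annotator."""
    accepted, queued = {}, []
    for item in items:
        label, confidence = predict(item)
        if confidence >= threshold:
            accepted[item] = label       # trust the machine annotator
        else:
            queued.append(item)          # route to the human annotator
    for item in queued:
        accepted[item] = ask_human(item) # human feedback fills the gap
    return accepted, queued

# Toy stand-ins for a sentiment model and a human annotator.
random.seed(0)
predict = lambda s: ("pos" if "joy" in s else "neg", random.random())
ask_human = lambda s: "neu"
labels, sent_to_human = hitl_round(["joy is here", "alas", "woe"],
                                   predict, ask_human)
```

In a real deployment, the items sent to the human would be added to the training set and the model retrained before the next round, which is the iterative loop the paragraph above describes.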
Recent advancements in NLP and deep learning, such
as GloVe+LSTM [4] and RoBERTa [5], have created
opportunities to integrate human annotation with machine
learning for digital humanities research. These tools make
it possible to conduct a systematic analysis of sentiments
and emotions in large collections of unstructured texts. Our
study compares the results from social media-trained sentiment
analysis packages with those provided by human annotators.
This comparison provides well-annotated data that can be used
to improve the computational models for sentiment analysis
and emotion detection tasks, which are complicated and hard
for machines without human input.
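A standard way to quantify how far machine labels diverge from human ones is a chance-corrected agreement statistic such as Cohen's kappa, sketched below in plain Python. The label lists are invented for illustration and do not come from this study's data.

```python
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two annotators' label lists."""
    assert len(a) == len(b) and a
    n = len(a)
    # Observed agreement: fraction of items with identical labels.
    po = sum(x == y for x, y in zip(a, b)) / n
    ca, cb = Counter(a), Counter(b)
    # Expected agreement if both annotators labeled independently
    # according to their own label frequencies.
    pe = sum(ca[c] / n * cb[c] / n for c in set(a) | set(b))
    return (po - pe) / (1 - pe)

# Hypothetical sentiment labels for six passages (illustration only).
human   = ["pos", "neg", "neu", "pos", "neg", "pos"]
machine = ["pos", "neg", "pos", "pos", "neu", "pos"]
print(round(cohens_kappa(human, machine), 3))  # → 0.429
```

A kappa near 0 means the annotators agree no more than chance would predict, while values toward 1 indicate genuine agreement; raw percent agreement alone would overstate consensus here.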
For this study, our work has focused on two primary
research questions:
• RQ1: What is the level of agreement between
multiple human and machine annotators when evaluating
sentiment? If the agreement is low, what are the reasons
behind it?
• RQ2: What are the primary topics associated with the
sentiment identified by either humans or machines and