Research Article
Multimodal Sarcasm Detection: A Deep Learning Approach
Santosh Kumar Bharti,¹ Rajeev Kumar Gupta,¹ Prashant Kumar Shukla,² Wesam Atef Hatamleh,³ Hussam Tarazi,⁴ and Stephen Jeswinde Nuagah⁵

¹Pandit Deendayal Energy University, Gandhinagar, Gujarat, India
²Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, 522502 Andhra Pradesh, India
³Department of Computer Science, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia
⁴Department of Computer Science and Informatics, School of Engineering and Computer Science, Oakland University, Rochester Hills, MI, USA
⁵Department of Electrical Engineering, Tamale Technical University, Ghana
Correspondence should be addressed to Stephen Jeswinde Nuagah; jeswinde@tatu.edu.gh
Received 26 February 2022; Revised 23 March 2022; Accepted 28 March 2022; Published 23 May 2022
Academic Editor: Mohammad Farukh Hashmi
Copyright © 2022 Santosh Kumar Bharti et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
In the modern era, posting sarcastic comments on social media has become a common trend. Sarcasm is often used by people to taunt or pester others. It is frequently expressed through inflexion and tonal stress in speech, or through lexical, pragmatic, and hyperbolic features present in text. Most existing work has focused on detecting sarcasm either in textual data using text features or in audio data using audio features. This article proposes a novel approach that combines textual and audio features to detect sarcasm in conversational data. The hybrid method takes as input a combined vector of the audio and text features extracted by their respective models. The combined features compensate for the shortcomings of text-only features and vice versa. The hybrid model significantly outperforms both individual models.
1. Introduction
With the growing number of virtual assistants offering voice-to-
voice interaction, detecting sarcasm has become a crucial task.
Artificial intelligence (AI) assistants like Siri, Alexa, and
Cortana should be able to identify sarcasm in the user’s
speech. In 2018, Google used an AI assistant to book a haircut
appointment over a voice call [1]. In such scenarios, it becomes
very important that the assistant can recognize whether a
human is speaking sarcastically or not.
Companies like Amazon and Flipkart rely heavily on cus-
tomer feedback and reviews in order to improve their prod-
ucts and services. Because they make data-driven decisions
based on these reviews, it becomes very important that, while
analyzing them, they can identify the exact intent behind
each review. In such scenarios, they should be able to detect
whether a given text review is sarcastic or not.
The Macmillan English Dictionary defines sarcasm as “the
activity of saying or writing the opposite of what you mean,
or of speaking in a way intended to make someone else feel
stupid or show them that you are angry.” For example, in
the statement “I love the pain present in the breakups,” a
shift in sentiment can be observed. Although the sentence
literally says that one loves the pain present in breakups,
the actual meaning it tries to convey is the exact
opposite. While such patterns are an indication of the pres-
ence of sarcasm in a statement, other lexical and pragmatic
features, as shown by [2, 3], also play an important role in
the detection of sarcasm.
In some cases, text alone will not suffice to detect sarcasm,
and additional cues are required to detect it accurately.
For example, the phrase “yeah right” can have different
meanings depending on how the person says it and
what context lies behind it [4]. In such cases, factors like
Hindawi
Wireless Communications and Mobile Computing
Volume 2022, Article ID 1653696, 10 pages
https://doi.org/10.1155/2022/1653696