Research Article
Multimodal Sarcasm Detection: A Deep Learning Approach
Santosh Kumar Bharti,¹ Rajeev Kumar Gupta,¹ Prashant Kumar Shukla,² Wesam Atef Hatamleh,³ Hussam Tarazi,⁴ and Stephen Jeswinde Nuagah⁵

¹Pandit Deendayal Energy University, Gandhinagar, Gujarat, India
²Department of Computer Science and Engineering, Koneru Lakshmaiah Education Foundation, Vaddeswaram, Guntur, 522502 Andhra Pradesh, India
³Department of Computer Science, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia
⁴Department of Computer Science and Informatics, School of Engineering and Computer Science, Oakland University, Rochester Hills, MI, USA
⁵Department of Electrical Engineering, Tamale Technical University, Ghana
Correspondence should be addressed to Stephen Jeswinde Nuagah; jeswinde@tatu.edu.gh
Received 26 February 2022; Revised 23 March 2022; Accepted 28 March 2022; Published 23 May 2022
Academic Editor: Mohammad Farukh Hashmi
Copyright © 2022 Santosh Kumar Bharti et al. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is
properly cited.
In the modern era, posting sarcastic comments on social media has become a common trend. Sarcasm is often used by people to taunt or pester others. It is frequently expressed through inflexion and tonal stress in speech, or through lexical, pragmatic, and hyperbolic features present in text. Most existing work has focused on detecting sarcasm either in textual data using text features or in audio data using audio features. This article proposes a novel approach that combines textual and audio features to detect sarcasm in conversational data. The hybrid method takes as input a combined vector of the audio and text features extracted by their respective models. The combined features compensate for the shortcomings of text-only features and vice versa. The hybrid model significantly outperforms both individual models.
1. Introduction
With the growing number of virtual assistants offering voice-to-
voice interaction, detecting sarcasm has become a crucial task.
Artificial intelligence (AI) assistants like Siri, Alexa, and
Cortana should be able to identify sarcasm in the user’s
speech. In 2018, Google used an AI assistant to book a haircut
appointment over a voice call [1]. In such scenarios, it becomes
very important that the assistant can recognize whether a
human is speaking sarcastically or not.
Companies like Amazon and Flipkart rely heavily on cus-
tomer feedback and reviews in order to improve their prod-
ucts and services. Because they make data-driven decisions
based on these reviews, it becomes very important that, while
analyzing them, they can identify the exact intent behind
each review. In such scenarios, they should be able to detect
whether a given text review is sarcastic or not.
The Macmillan English Dictionary defines sarcasm as “the
activity of saying or writing the opposite of what you mean,
or of speaking in a way intended to make someone else feel
stupid or show them that you are angry.” For example, in
the statement “I love the pain present in the breakups,” a
shift in sentiment can be observed. Although the sentence
literally says that one loves the pain present in breakups,
the actual meaning it tries to convey is the exact
opposite. While such patterns are an indication of the pres-
ence of sarcasm in a statement, other lexical and pragmatic
features, as shown by [2, 3], also play an important role in
the detection of sarcasm.
In some cases, text alone will not suffice to detect sarcasm,
and additional cues are required to detect it accurately.
For example, the phrase “yeah right” can have different
meanings depending on how the person says it and
what context lies behind it [4]. In such cases, factors like
Hindawi
Wireless Communications and Mobile Computing
Volume 2022, Article ID 1653696, 10 pages
https://doi.org/10.1155/2022/1653696