Scientific Journal of Informatics Vol. 10, No. 2, May 2023
p-ISSN 2407-7658 http://journal.unnes.ac.id/nju/index.php/sji e-ISSN 2460-0040

A Systematic Literature Review of Multimodal Emotion Recognition

Yeni Dwi Rahayu 1*, Lutfi Ali Muharrom 2, Ika Safitri Windiarti 3, Auratania Hadisah Sugianto 4
1,2,4 Department of Informatics Engineering, Faculty of Engineering, Universitas Muhammadiyah Jember, Indonesia
3 Information Technology Study Programme, Universiti Muhammadiyah Malaysia

Abstract.
Purpose: This literature review aims to examine Multimodal Emotion Recognition (MER) in depth and breadth by analyzing the topics, trends, modalities, and other supporting sources discussed in research published between 2010 and 2022. Based on the screening analysis, a total of 14,533 articles were analyzed to achieve this goal.
Methods: This research was conducted in three phases: Planning, Conducting, and Reporting. The first phase defined the research objectives by searching for systematic reviews on similar topics and reviewing them to develop the research questions and the systematic review protocol for this study. The second phase collected articles according to the predetermined protocol, screened them, and analyzed the filtered articles to answer the research questions. The final phase summarized the results of the analysis so that the new findings of this research could be reported.
Result: In general, the focus of MER research can be grouped into two issues: the object background and the source or modality of emotion recognition. With respect to object background, the majority of studies (55%) support emotion recognition with a health background, especially brain function decline; 34% are based on age, 10% on gender, 1% on the data collection situation, and a small portion of less than 1% on ethnic culture.
In terms of the source of emotion recognition, research is divided into electromagnetic signals, voice signals, text, photo/video, and the development of wearable devices. Based on these results, at least seven scientific fields discuss MER research: health, psychology, electronics, grammar, communication, socio-culture, and computer science.
Novelty: MER research has the potential to develop further. Many areas have still received little attention, even though the ecosystem that uses them has grown massively. Emotion recognition modalities are numerous and diverse, but research still focuses on validating the emotions of each modality rather than exploring the strengths of each modality to improve the quality of recognition results.
Keywords: Emotion Recognition, Modalities, Research Topics
Received April 2023 / Revised April 2023 / Accepted May 2023

This work is licensed under a Creative Commons Attribution 4.0 International License.

INTRODUCTION
Emotion recognition may help people understand themselves and others, and improve their overall quality of life [1]. Consequently, the study of emotion recognition continues to grow significantly [2]. Emotion recognition itself is a complex phenomenon that involves psychological, physiological, and social aspects. The psychological aspects concern how emotions are processed and interpreted by the brain, while the physiological aspects concern the changes that occur in the body when a person experiences emotions, such as changes in heart rate and endocrine gland activity. The social aspect, in turn, relates to the way emotions are expressed and influenced by the social and cultural environment. Understanding emotions well therefore requires a holistic approach that draws on various fields of science, such as psychology, neuroscience, biology, anthropology, and sociology [3].
* Corresponding author.
Email addresses: yenidwirahayu@unmuhjember.ac.id (Rahayu), lutfi.muharom@unmuhjember.ac.id (Muharom), ika.windiarti@umam.edu.my (Windiarti), auratania036@gmail.com (Sugianto)
DOI: 10.15294/sji.v10i2.43792

The complexity of emotion recognition demands that computational emotion recognition research also draw on multiple modalities to improve the quality of recognition. This need is reflected in the emergence of various studies on multimodal emotion recognition [4], [5], with challenges that remain wide open. Ya Li defines the challenges of multimodal emotion recognition around datasets, variations in recognition sources (audio, image, video), cultural influences and