Classification of Echocardiographic Standard Views Using a Hybrid Attention-based Approach

Zi Ye 1, Yogan Jaya Kumar 2, Goh Ong Sing 2, Fengyan Song 3 and Xianda Ni 4,*

1 School of Artificial Intelligence, Wenzhou Polytechnic, Wenzhou, 325035, China
2 Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Melaka, 76100, Malaysia
3 Shanghai Gen Cong Information Technology Co. Ltd., Shanghai, 201300, China
4 Department of Ultrasonography, The First Affiliated Hospital of Wenzhou Medical University, Wenzhou, 325003, China
*Corresponding Author: Xianda Ni. Email: xianda.ni@gmail.com

Received: 12 September 2021; Accepted: 26 November 2021

Abstract: The determination of the probe viewpoint forms an essential step in automatic echocardiographic image analysis. However, classifying echocardiograms at the video level is complicated, and previous observations concluded that the most significant challenge lies in distinguishing among the various adjacent views. To this end, we propose an ECHO-Attention architecture consisting of two parts. We first design an ECHO-ACTION block, which efficiently encodes spatio-temporal features, channel-wise features, and motion features. We then insert this block into existing ResNet architectures, combined with a self-attention module to ensure its task-related focus, to form an effective ECHO-Attention network. The experimental results are confirmed on a dataset of 2693 videos acquired from 267 patients and manually labeled by a trained cardiologist. Our methods provide comparable classification performance (overall accuracy of 94.81%) on the entire video sample and achieve significant improvements in the classification of anatomically similar views (precision of 88.65% and 81.70% for the parasternal short-axis apical view and the parasternal short-axis papillary view on 30-frame clips, respectively).
Keywords: Artificial intelligence; attention mechanism; classification; echocardiogram views

Intelligent Automation & Soft Computing, DOI: 10.32604/iasc.2022.023555, Article, Tech Science Press

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

Echocardiography plays a vital role in diagnosing and treating cardiovascular diseases. It is the only imaging method that allows real-time, dynamic observation of the heart and immediate detection of various cardiac abnormalities [1]. However, accurate quantitative evaluation of cardiac structure remains a problem because it depends on the operators' manipulation and interpretation of echocardiography. For example, there are considerable differences among operators, especially for poor-quality images [2]. It has been shown that the differences between operators and within operators can be reduced with deep learning-based methods. We humans usually perform specific steps subconsciously during each examination. The first critical preprocessing stage is the mode or view classification. Automating this task provides two