Optimization Technique for Deep Learning Methodology on Power Side Channel Attacks

Amjed Abbas Ahmed
Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
Department of Computer Techniques Engineering, Imam Al-Kadhum College (IKC), Baghdad 10011, Iraq
amjedabbas@alkadhum-col.edu.iq

Azana Hafizah Aman
Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
azana@ukm.edu.my

Mohammad Kamrul Hasan
Center for Cyber Security, Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Malaysia
mkhasan@ukm.edu.my

Shayla Islam
Institute of Computer Science and Digital Innovation, UCSI University, Kuala Lumpur 56000, Malaysia
shayla@ucsiuniversity.edu.my

Nazmus Shaker Nafi
GS-APAC, Kellogg Brown and Root, Southbank, Melbourne, Australia
nazmus.nafi@kbr.com

Mohammad Siab Nahi
Department of Computer Techniques Engineering, Imam Al-Kadhum College (IKC), Baghdad 10011, Iraq
mohammed.saib@alkadhum-col.edu.iq

Abstract— The first non-profiled side-channel attack (SCA) method using deep learning is Timon's Differential Deep Learning Analysis (DDLA). This method retrieves the secret key effectively with the help of deep learning metrics. However, the neural network (NN) has to be trained many times, so the learning cost grows with the key size, and it is hard to assess the results at intermediate stages. In this research, we provide three possible solutions to these issues and discuss the challenges that could arise in solving them. First, we offer a modified algorithm that can keep track of the metrics during the intermediate stages. Next, we provide a parallel NN structure and a training technique for a single network.
This saves a lot of time by eliminating the need to repeatedly retrain the same model. The newly designed algorithm sped up attacks significantly compared with the previous one. Finally, we propose employing shared layers to overcome the memory challenges of the parallel structure and to improve performance. We evaluated our approaches by mounting non-profiled attacks on the ASCAD dataset and on a ChipWhisperer-Lite power consumption dataset. Power consumption was studied using both datasets. The shared layers strategy we created was up to 134 times more successful than the prior technique when applied to the ASCAD database.

Index Terms— Optimization Technique, Deep Learning Methodology, Side Channel Attacks, CNN, Power Analysis.

I. INTRODUCTION

Cryptographic methods [1] implemented in hardware have to protect against both mathematical cryptanalysis and physical attacks, because attackers can access the physical equipment. Side-channel attacks are physical attacks that use information leaked from cryptographic hardware to render a security system ineffective [2]. Examples of such leaked information include running time, noise, temperature, power consumption, and electromagnetic (EM) radiation. Side-channel attacks on real-world products such as phones and transit cards are becoming more common and more effective.

This paper has been supported by the Universiti Kebangsaan Malaysia (UKM), under grant scheme DIP-2022-021.

Side-channel attacks fall into two types, profiled and non-profiled [3], [4], which are distinguished by the environment in which the attacker works. Examples of profiled attacks are the Template Attack [5] and the Stochastic Attack [6]. A profiled attack involves a fixed secret key and a profiling device that is structurally similar to the device that will be the target of the attack. Attackers first use the profiling device to characterise the target device's leakage, and then use the knowledge they have gained to analyse the target device.
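
The two-phase flow of a profiled attack described above can be illustrated with a minimal Python sketch. It is a toy simulation, not the Template Attack of [5] as published: the leakage model (a single sample per trace equal to the Hamming weight of plaintext XOR key plus Gaussian noise), the function names, the key bytes, and the trace counts are all assumptions made for illustration. A real attack would profile multi-sample traces of an actual cipher operation.

```python
import numpy as np

def hamming_weight(x):
    """Number of set bits in each byte of an integer array."""
    return np.unpackbits(x.astype(np.uint8)[:, None], axis=1).sum(axis=1)

def simulate(key, n, rng, noise=1.0):
    """Toy leakage (assumption): HW(plaintext XOR key) + Gaussian noise."""
    pt = rng.integers(0, 256, size=n, dtype=np.uint8)
    leak = hamming_weight(pt ^ np.uint8(key)) + rng.normal(0.0, noise, size=n)
    return pt, leak

def build_templates(pt, leak, known_key):
    """Profiling phase: leakage mean and variance per Hamming-weight class 0..8."""
    cls = hamming_weight(pt ^ np.uint8(known_key))
    means = np.array([leak[cls == c].mean() for c in range(9)])
    varis = np.array([leak[cls == c].var() for c in range(9)])
    return means, varis

def attack(pt, leak, means, varis):
    """Attack phase: return the key guess with the highest Gaussian log-likelihood."""
    scores = np.empty(256)
    for guess in range(256):
        cls = hamming_weight(pt ^ np.uint8(guess))
        mu, var = means[cls], varis[cls]
        scores[guess] = np.sum(
            -0.5 * np.log(2 * np.pi * var) - (leak - mu) ** 2 / (2 * var)
        )
    return int(np.argmax(scores))

rng = np.random.default_rng(0)

# Profiling device: the attacker knows the key (hypothetical value 0x3C).
prof_pt, prof_leak = simulate(key=0x3C, n=20000, rng=rng)
means, varis = build_templates(prof_pt, prof_leak, known_key=0x3C)

# Target device: the key (hypothetical value 0xA7) is unknown to the attacker.
tgt_pt, tgt_leak = simulate(key=0xA7, n=500, rng=rng)
recovered = attack(tgt_pt, tgt_leak, means, varis)
```

In this toy setting `recovered` should match the simulated target key, because the templates learned on the profiling device transfer directly to the target. Without a profiling device, the `build_templates` step is not available, which is the situation a non-profiled attack has to handle.
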
A non-profiled attack, on the other hand, is a side-channel attack that takes place in a setting without any profiling device. To evaluate secret key candidates, attackers apply statistical methods to the measurements they have collected from the target device [7]. For these attacks to succeed, the secret key must be recovered from scratch, using only the side-channel data gathered from the target device and none of the previously established templates. Differential Power Analysis [8] and Correlation Power Analysis [9] are two examples of attacks that do not rely on profiles [10].

II. LITERATURE REVIEW

Differential deep learning analysis (DDLA) [11] is the first kind of DL-SCA to be used in non-profiled attacks, that is, in circumstances where an attacker is unable to obtain a template device. Deep learning-based profiled side-channel attacks are hard to apply in this setting, since the labels for the side-channel measurements cannot be gathered without a template device. DDLA gets around this limitation by providing a method for estimating labels. In correlation power analysis, a conventional SCA technique, the intermediate values calculated with the right key are the ones most closely related to the measurements. The same is true in deep learning, where the single network trained with the correct labels outperforms all those trained with wrong labels. Different metrics are used to quantify deep learning's effectiveness, but the

2023 33rd International Telecommunication Networks and Applications Conference (ITNAC) | 979-8-3503-1713-8/23/$31.00 ©2023 IEEE | DOI: 10.1109/ITNAC59571.2023.10368481