RLENS: RL-based Energy-Efficient Network Selection Framework for IoMT

Amr Abo-eleneen 1, Alaa Awad Abdellatif 1, Amr Mohamed 1 and Aiman Erbad 2
1 College of Engineering, Qatar University, Qatar. 2 College of Science and Engineering, Hamad Bin Khalifa University, Qatar.
Email: {aa1405465, aawad, amrm}@qu.edu.qa and {aerbad}@ieee.org.

Abstract—With the emergence of smart health (s-health) applications and services, several quality requirements have arisen to foresee and react instantaneously to emergency circumstances. Such requirements demand fast-acting wireless networks that adapt to various types of applications and environment dynamics, encouraging network operators to leverage the spectrum of wireless signals across various radio access networks. Yet, this requires implementing intelligent network selection schemes that account for heterogeneous network characteristics and applications' QoS requirements. Thus, this paper tackles this problem by adopting an intelligent Reinforcement Learning (RL)-based network selection scheme. Specifically, we leverage edge computing capabilities to implement an efficient user-centric network selection algorithm at the Internet of Medical Things (IoMT) level that adjusts the compression ratio and selects the most suitable radio access network (RAN) to transfer the acquired data while considering patient state, battery life, and network dynamics. Our results demonstrate the efficiency of the proposed approach, outperforming state-of-the-art techniques in terms of battery life by more than 500% while reaching almost 85-90% of the optimal algorithm's performance in delay and distortion.

Keywords—Internet of Things, reinforcement learning, smart health, energy efficiency, network selection.

I. INTRODUCTION

The rapid evolution of Artificial Intelligence (AI), the Internet of Medical Things (IoMT), and Big Data is paving the way towards a plethora of smart-health (s-health) applications.
S-health is considered the next evolution of healthcare systems towards the Health 4.0 revolution [1]. However, to enable such applications and build real-time interactive systems, the underlying network must support ultra-reliable and low-latency services. This calls for exploiting the expansion of the fifth-generation (5G) network towards a diversified and heterogeneous network (HetNet). Utilizing a HetNet with multiple Radio Access Networks (RANs) enables every device to leverage the feasible radio resources among different frequency ranges to connect with the network's infrastructure. Thus, cutting-edge devices equipped with multiple interfaces (e.g., Bluetooth, 3G, WiFi, 4G) will be capable of accessing the available networks simultaneously. Yet, this requires the design of ingenious network selection schemes that satisfy the strict demands of s-health while providing reasonably high efficiency across the spectrums of different RANs.

Several methodologies have been adopted in the literature for solving the network selection problem, including: optimization techniques [2], [3], game theory [4]–[6], Markov decision processes (MDPs) [7], [8], and multi-attribute decision making [9], [10]. However, most of these approaches build on complex mathematical models and instantaneous channel information. Indeed, guaranteeing optimality while considering diverse networks, applications, and power constraints usually results in an NP-hard problem [2]. Moreover, game theory, MDPs, and multi-attribute decision-making approaches are computationally intensive, especially for large networks, and their convergence is not guaranteed; even when they converge, convergence to an optimal solution is not assured.
Accordingly, traditional network selection methodologies, which heavily rely on mathematical models and consider only the instantaneous network state, cannot cope with highly dynamic environments or the next-generation network demand for swift connectivity and quick responsiveness. To address these challenges, in this paper we leverage the potential of Reinforcement Learning (RL) [11] to develop an intelligent, user-centric network selection scheme for s-health systems. Although a few studies have applied Q-learning [12] or RL [13], [14] to network selection, this research direction is still in its early stages. Thus, this paper aims at advancing the state of the art by:

1) Defining a holistic network selection problem that optimally selects the adequate RAN and compression ratio at the patient level while considering the data distortion, patients' state, and end-to-end delay, i.e., due to processing, transmission, and queuing.

2) Leveraging an efficient RL-based algorithm that considers all system dynamics to solve the formulated problem. Indeed, we formulate a multi-objective reward function that captures the trade-off between energy consumption, delay, and distortion.

3) Conducting comparative experiments to demonstrate the performance of the proposed scheme against two baselines, namely, energy-greedy and quality-greedy, in addition to a state-of-the-art algorithm.

4) Demonstrating the adaptiveness of our approach to swift network dynamics by introducing disturbance to the converged RAN. Our results depict how the proposed scheme can adapt to diverse network dynamics while changing the action distributions within a reasonable number of episodes.

2022 Wireless Telecommunications Symposium (WTS) | 978-1-7281-8678-8/22/$31.00 ©2022 IEEE | DOI: 10.1109/WTS53620.2022.9768166
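To make the flavor of the second contribution concrete, a multi-objective reward combined with a tabular Q-learning update can be sketched as below. This is an illustrative sketch only: the weights, normalization constants, state/action space sizes, and function names are hypothetical placeholders chosen for the example, not the parameters used in the paper.

```python
import numpy as np

# Hypothetical trade-off weights for energy, delay, and distortion
# (placeholders, not the paper's actual values):
W_E, W_D, W_Q = 0.5, 0.3, 0.2

def reward(energy_j, delay_s, distortion, e_max=2.0, d_max=1.0, q_max=1.0):
    """Multi-objective reward: normalize each cost to [0, 1] and return
    the negative weighted sum, so lower total cost => higher reward."""
    return -(W_E * energy_j / e_max
             + W_D * delay_s / d_max
             + W_Q * distortion / q_max)

# Tabular Q-learning over patient states x (RAN, compression-ratio) actions.
# Sizes are illustrative: e.g., 3 patient states, 4 RANs x 2 ratios = 8 actions.
n_states, n_actions = 3, 8
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1   # learning rate, discount, exploration

def select_action(state, rng):
    """Epsilon-greedy action selection over the Q-table."""
    if rng.random() < eps:
        return int(rng.integers(n_actions))   # explore a random (RAN, ratio)
    return int(np.argmax(Q[state]))           # exploit the best-known action

def update(state, action, r, next_state):
    """Standard one-step Q-learning update toward the TD target."""
    td_target = r + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (td_target - Q[state, action])
```

In this sketch the agent observes the patient state, picks a (RAN, compression ratio) pair, receives the reward built from measured energy, delay, and distortion, and updates its Q-table; over episodes the action distribution shifts toward the RAN/ratio pairs that balance the three objectives.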