Dynamic Systems and Applications 30 (2021) No. 11, 1719 - 1732
REINFORCEMENT LEARNING BASED HANDOFF MECHANISM IN
COOPERATIVE COGNITIVE RADIO NETWORKS
Vineetha Mathai 1* and P. Indumathi 2
*1,2 Department of Electronics Engineering, MIT Campus, Anna University, Chennai.
Email: vineethamathai@gmail.com
ABSTRACT. Spectrum handoff (SH) is a dynamic spectrum access technique that ensures effective
channel utilization, fair resource allocation, and uninterrupted real-time connections. Facilitating SH across
traffic of dissimilar characteristics in Cognitive Radio Networks (CRNs) is difficult due to manifold
interventions from Primary Users (PUs), contention among Secondary Users (SUs), and diversified Quality of
Experience (QoE) demands. Here, we consider an effective channel selection strategy (CSS) and put forward a
learning-based handoff scheme that enhances users' QoE through the introduction of the docition idea. A PU-
prioritized Markov model is introduced to represent the interactions between PUs and SUs for fair channel
access. Reinforcement learning (RL) is applied to the CSS to carry out proper channel selection. Numerical
results show that the proposed queueing model, the suggested learning-based handoff scheme and docitive learning
enhance the quality of service, maintaining an average MOS of 3.6.
Keywords: Cognitive radio network, Spectrum handoff, Queuing Model, Reinforcement Learning, QoE.
1 INTRODUCTION
The progression of wireless communication towards 5G includes changes in the network
model and in the assessment of QoE provisioning for multimedia applications. The term CRN was coined
to mitigate the underutilization of spectrum resources [1],[2]. In a CRN, unlicensed
users (SUs) have the chance to access the spectrum only when it is not engaged by licensed
users (PUs). If a PU returns on a channel, an SU can either stay on it or shift (i.e., handoff) to
another one until the completion of the PU's data transmission. If a cognitive radio is shadowed by
a high building on the sensing channel, then a cooperative sensing mechanism is included.
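The cooperative mechanism mentioned above can be illustrated with a simple sketch (ours, not from the paper): each SU reports a binary local sensing decision to a fusion point, and a "k-out-of-n" majority rule declares the channel busy, so a single SU shadowed by a building cannot cause a missed detection on its own. The function name and threshold are illustrative assumptions.

```python
# Illustrative sketch (not the paper's scheme): cooperative spectrum
# sensing with a k-out-of-n fusion rule. Each SU reports a binary local
# decision (1 = PU detected); the channel is declared busy when at least
# k of the n cooperating SUs agree, mitigating shadowing at any one SU.

def fuse_decisions(local_decisions, k):
    """Return True (channel busy) if at least k SUs report PU activity."""
    return sum(local_decisions) >= k

# A shadowed SU (third report) misses the PU, but cooperation recovers it.
reports = [1, 1, 0, 1]               # local decisions from four SUs
print(fuse_decisions(reports, k=3))  # majority still detects the PU
```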
Proactive, reactive and hybrid handoff [10] are the methods available in the
literature. In the proactive method, SUs use information from the PU traffic model to
characterize PU activity, identify channels, and accomplish switching on the return of a PU.
The handoff delay of this scheme is therefore low, but obtaining a precise PU traffic model is
difficult. In the reactive mode, an SU performs spectrum sensing to identify vacant channels
only when a PU interruption happens, so the channel status for handoff can be found without
difficulty; however, it may introduce delay. The hybrid method combines the benefits of the
earlier methods by pairing proactive sensing with reactive handoff action [3-5].
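The hybrid idea above can be sketched in a few lines (a minimal illustration under our own assumptions; the class and method names are hypothetical, not the paper's): channel idleness is estimated continually by background sensing (the proactive part), while the actual switch is made only when a PU actually returns (the reactive part), choosing the channel with the best current estimate.

```python
# Minimal sketch of hybrid handoff: proactive sensing maintains running
# idle-probability estimates; the reactive step uses them on PU return.

class HybridHandoff:
    def __init__(self, n_channels):
        self.idle_est = [0.5] * n_channels  # running idle-probability estimates
        self.counts = [1] * n_channels      # observations per channel

    def proactive_sense(self, channel, observed_idle):
        """Periodic background sensing: fold one observation (0/1) into the estimate."""
        self.counts[channel] += 1
        self.idle_est[channel] += (observed_idle - self.idle_est[channel]) / self.counts[channel]

    def reactive_handoff(self, current):
        """On PU return, switch to the candidate channel with the highest idle estimate."""
        candidates = [c for c in range(len(self.idle_est)) if c != current]
        return max(candidates, key=lambda c: self.idle_est[c])
```

Because the target channel is pre-ranked, the reactive step needs no fresh sensing sweep, which is exactly the delay advantage the hybrid method aims for.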
Multimedia applications [12],[15] are difficult to introduce in CRNs due to the intervention
of PUs and their differing QoE requirements. To tackle these problems, we select a
mixed preemptive and non-preemptive resume priority (PRP/NPRP) M/G/1 [31] queueing
model to describe the behavior of PUs and SUs on spectrum usage. The former model describes
the queueing between PUs and SUs and ensures that PUs retain control; to prevent an
SU from intruding on the ongoing communication of other SUs, the queueing among SUs is
modeled with the latter. When picking channels for SH, it is important to consider the
transmission delay, channel quality and channel conditions.
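The mixed priority discipline above can be illustrated with a short discrete-event sketch (an assumption-laden toy model of ours, not the paper's PRP/NPRP M/G/1 analysis): an arriving PU preempts an SU in service, whose remaining work resumes later (preemptive resume), while an arriving SU never interrupts another SU and simply waits (non-preemptive).

```python
# Toy sketch of mixed PRP/NPRP priority on one channel: PUs preempt SUs
# (preemptive resume); SUs are non-preemptive toward each other.
from collections import deque

class Channel:
    def __init__(self):
        self.in_service = None   # (user_type, remaining_work) or None
        self.pu_queue = deque()  # waiting PUs
        self.su_queue = deque()  # waiting SUs (FIFO among themselves)

    def arrive(self, user_type, work):
        if user_type == "PU":
            if self.in_service and self.in_service[0] == "SU":
                # Preempt the SU; it resumes later with its remaining work.
                self.su_queue.appendleft(self.in_service)
                self.in_service = ("PU", work)
            elif self.in_service is None:
                self.in_service = ("PU", work)
            else:
                self.pu_queue.append(("PU", work))
        else:
            # SUs never interrupt anyone (non-preemptive).
            if self.in_service is None:
                self.in_service = ("SU", work)
            else:
                self.su_queue.append(("SU", work))

    def depart(self):
        """Job in service finishes; any queued PU is served before queued SUs."""
        self.in_service = self.pu_queue.popleft() if self.pu_queue else (
            self.su_queue.popleft() if self.su_queue else None)
```

In this sketch the PU's control of the channel is immediate on arrival, which is the PRP side; the FIFO `su_queue` among SUs captures the NPRP side.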
Given the varying channel conditions and traffic loads, and drawing on the knowledge gained
from prior SHs and earlier channel environments, a reinforcement learning-based [18]-[20],[22]
SH scheme is proposed to achieve SH adaptively [7],[15-16]. The main parameter of QoE [30] is mean
Received JUL 12, 2021. ISSN 1056-2176 (Print); ISSN 2693-5295 (Online)
www.dynamicpublishers.com; $15.00 ©Dynamic Publishers, Inc.
www.dynamicpublishers.org; https://doi.org/10.46719/dsa202130.11.03