Handover Control in Wireless Systems via Asynchronous Multi-User Deep Reinforcement Learning

156 0 0.0 ( 0 )

Download Cite

Added by Zhi Wang

Publication date 2018

fields Informatics Engineering

and research's language is English

Authors Zhi Wang - Lihua Li - Yue Xu

Networking and Internet Architecture

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In this paper, we propose a two-layer framework to learn the optimal handover (HO) controllers in possibly large-scale wireless systems supporting mobile Internet-of-Things (IoT) users or traditional cellular users, where the user mobility patterns could be heterogeneous. In particular, our proposed framework first partitions the user equipments (UEs) with different mobility patterns into clusters, where the mobility patterns are similar in the same cluster. Then, within each cluster, an asynchronous multi-user deep reinforcement learning scheme is developed to control the HO processes across the UEs in each cluster, in the goal of lowering the HO rate while ensuring certain system throughput. In this scheme, we use a deep neural network (DNN) as an HO controller learned by each UE via reinforcement learning in a collaborative fashion. Moreover, we use supervised learning in initializing the DNN controller before the execution of reinforcement learning to exploit what we already know with traditional HO schemes and to mitigate the negative effects of random exploration at the initial stage. Furthermore, we show that the adopted global-parameter-based asynchronous framework enables us to train faster with more UEs, which could nicely address the scalability issue to support large systems. Finally, simulation results demonstrate that the proposed framework can achieve better performance than the state-of-art on-line schemes, in terms of HO rates.

rate research

Scheduling and Power Control for Wireless Multicast Systems via Deep Reinforcement Learning

143 - Ramkumar Raghu , Mahadesh Panju , Vaneet Aggarwal 2020

Multicasting in wireless systems is a natural way to exploit the redundancy in user requests in a Content Centric Network. Power control and optimal scheduling can significantly improve the wireless multicast networks performance under fading. However, the model based approaches for power control and scheduling studied earlier are not scalable to large state space or changing system dynamics. In this paper, we use deep reinforcement learning where we use function approximation of the Q-function via a deep neural network to obtain a power control policy that matches the optimal policy for a small network. We show that power control policy can be learnt for reasonably large systems via this approach. Further we use multi-timescale stochastic optimization to maintain the average power constraint. We demonstrate that a slight modification of the learning algorithm allows tracking of time varying system statistics. Finally, we extend the multi-timescale approach to simultaneously learn the optimal queueing strategy along with power control. We demonstrate scalability, tracking and cross layer optimization capabilities of our algorithms via simulations. The proposed multi-timescale approach can be used in general large state space dynamical systems with multiple objectives and constraints, and may be of independent interest.

Networking and Internet Architecture Machine Learning Machine Learning

Deep Reinforcement Learning Based Power control for Wireless Multicast Systems

117 - Ramkumar Raghu , Pratheek Upadhyaya , Mahadesh Panju 2019

We consider a multicast scheme recently proposed for a wireless downlink in [1]. It was shown earlier that power control can significantly improve its performance. However for this system, obtaining optimal power control is intractable because of a very large state space. Therefore in this paper we use deep reinforcement learning where we use function approximation of the Q-function via a deep neural network. We show that optimal power control can be learnt for reasonably large systems via this approach. The average power constraint is ensured via a Lagrange multiplier, which is also learnt. Finally, we demonstrate that a slight modification of the learning algorithm allows the optimal control to track the time varying system statistics.

Networking and Internet Architecture Machine Learning Machine Learning

QoS and Jamming-Aware Wireless Networking Using Deep Reinforcement Learning

126 - Nof Abuzainab , Tugba Erpek , Kemal Davaslioglu 2019

The problem of quality of service (QoS) and jamming-aware communications is considered in an adversarial wireless network subject to external eavesdropping and jamming attacks. To ensure robust communication against jamming, an interference-aware routing protocol is developed that allows nodes to avoid communication holes created by jamming attacks. Then, a distributed cooperation framework, based on deep reinforcement learning, is proposed that allows nodes to assess network conditions and make deep learning-driven, distributed, and real-time decisions on whether to participate in data communications, defend the network against jamming and eavesdropping attacks, or jam other transmissions. The objective is to maximize the network performance that incorporates throughput, energy efficiency, delay, and security metrics. Simulation results show that the proposed jamming-aware routing approach is robust against jamming and when throughput is prioritized, the proposed deep reinforcement learning approach can achieve significant (measured as three-fold) increase in throughput, compared to a benchmark policy with fixed roles assigned to nodes.

Networking and Internet Architecture Machine Learning

Reinforcement Learning Random Access for Delay-Constrained Heterogeneous Wireless Networks: A Two-User Case

216 - Danzhou Wu , Lei Deng , Zilong Liu 2021

In this paper, we investigate the random access problem for a delay-constrained heterogeneous wireless network. As a first attempt to study this new problem, we consider a network with two users who deliver delay-constrained traffic to an access point (AP) via a common unreliable collision wireless channel. We assume that one user (called user 1) adopts ALOHA and we optimize the random access scheme of the other user (called user 2). The most intriguing part of this problem is that user 2 does not know the information of user 1 but needs to maximize the system timely throughput. Such a paradigm of collaboratively sharing spectrum is envisioned by DARPA to better dynamically match the supply and demand in the future [1], [2]. We first propose a Markov Decision Process (MDP) formulation to derive a modelbased upper bound, which can quantify the performance gap of any designed schemes. We then utilize reinforcement learning (RL) to design an R-learning-based [3]-[5] random access scheme, called TSRA. We finally carry out extensive simulations to show that TSRA achieves close-to-upper-bound performance and better performance than the existing baseline DLMA [6], which is our counterpart scheme for delay-unconstrained heterogeneous wireless network. All source code is publicly available in https://github.com/DanzhouWu/TSRA.

Networking and Internet Architecture

Handover Management for mmWave Networks with Proactive Performance Prediction Using Camera Images and Deep Reinforcement Learning

79 - Yusuke Koda , Kota Nakashima , Koji Yamamoto 2019

For millimeter-wave networks, this paper presents a paradigm shift for leveraging time-consecutive camera images in handover decision problems. While making handover decisions, it is important to predict future long-term performance---e.g., the cumulative sum of time-varying data rates---proactively to avoid making myopic decisions. However, this study experimentally notices that a time-variation in the received powers is not necessarily informative for proactively predicting the rapid degradation of data rates caused by moving obstacles. To overcome this challenge, this study proposes a proactive framework wherein handover timings are optimized while obstacle-caused data rate degradations are predicted before the degradations occur. The key idea is to expand a state space to involve time consecutive camera images, which comprises informative features for predicting such data rate degradations. To overcome the difficulty in handling the large dimensionality of the expanded state space, we use a deep reinforcement learning for deciding the handover timings. The evaluations performed based on the experimentally obtained camera images and received powers demonstrate that the expanded state space facilitates (i) the prediction of obstacle-caused data rate degradations from 500 ms before the degradations occur and (ii) superior performance to a handover framework without the state space expansion

Networking and Internet Architecture