Deep Reinforcement Learning-Aided RAN Slicing Enforcement for B5G Latency Sensitive Services

109 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Sergio Martiradonna

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Sergio Martiradonna - Andrea Abrardo - Marco Moretti

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The combination of cloud computing capabilities at the network edge and artificial intelligence promise to turn future mobile networks into service- and radio-aware entities, able to address the requirements of upcoming latency-sensitive applications. In this context, a challenging research goal is to exploit edge intelligence to dynamically and optimally manage the Radio Access Network Slicing (that is a less mature and more complex technology than fifth-generation Network Slicing) and Radio Resource Management, which is a very complex task due to the mostly unpredictably nature of the wireless channel. This paper presents a novel architecture that leverages Deep Reinforcement Learning at the edge of the network in order to address Radio Access Network Slicing and Radio Resource Management optimization supporting latency-sensitive applications. The effectiveness of our proposal against baseline methodologies is investigated through computer simulation, by considering an autonomous-driving use-case.

قيم البحث

363 - Rongpeng Li , Zhifeng Zhao , Qi Sun 2018

Network slicing is born as an emerging business to operators, by allowing them to sell the customized slices to various tenants at different prices. In order to provide better-performing and cost-efficient services, network slicing involves challengi ng technical issues and urgently looks forward to intelligent innovations to make the resource management consistent with users activities per slice. In that regard, deep reinforcement learning (DRL), which focuses on how to interact with the environment by trying alternative actions and reinforcing the tendency actions producing more rewarding consequences, is assumed to be a promising solution. In this paper, after briefly reviewing the fundamental concepts of DRL, we investigate the application of DRL in solving some typical resource management for network slicing scenarios, which include radio resource slicing and priority-based core network slicing, and demonstrate the advantage of DRL over several competing schemes through extensive simulations. Finally, we also discuss the possible challenges to apply DRL in network slicing from a general perspective.

بنية الشبكات والإنترنت التعلم الآلي

RAN Resource Slicing in 5G Using Multi-Agent Correlated Q-Learning

120 - Hao Zhou , Medhat Elsayed , Melike Erol-Kantarci 2021

5G is regarded as a revolutionary mobile network, which is expected to satisfy a vast number of novel services, ranging from remote health care to smart cities. However, heterogeneous Quality of Service (QoS) requirements of different services and li mited spectrum make the radio resource allocation a challenging problem in 5G. In this paper, we propose a multi-agent reinforcement learning (MARL) method for radio resource slicing in 5G. We model each slice as an intelligent agent that competes for limited radio resources, and the correlated Q-learning is applied for inter-slice resource block (RB) allocation. The proposed correlated Q-learning based interslice RB allocation (COQRA) scheme is compared with Nash Q-learning (NQL), Latency-Reliability-Throughput Q-learning (LRTQ) methods, and the priority proportional fairness (PPF) algorithm. Our simulation results show that the proposed COQRA achieves 32.4% lower latency and 6.3% higher throughput when compared with LRTQ, and 5.8% lower latency and 5.9% higher throughput than NQL. Significantly higher throughput and lower packet drop rate (PDR) is observed in comparison to PPF.

بنية الشبكات والإنترنت أنظمة وتحكم أنظمة وتحكم

Deep Reinforcement Learning-Based Beam Tracking for Low-Latency Services in Vehicular Networks

77 - Yan Liu , Zhiyuan Jiang , Shunqing Zhang 2020

Ultra-Reliable and Low-Latency Communications (URLLC) services in vehicular networks on millimeter-wave bands present a significant challenge, considering the necessity of constantly adjusting the beam directions. Conventional methods are mostly base d on classical control theory, e.g., Kalman filter and its variations, which mainly deal with stationary scenarios. Therefore, severe application limitations exist, especially with complicated, dynamic Vehicle-to-Everything (V2X) channels. This paper gives a thorough study of this subject, by first modifying the classical approaches, e.g., Extended Kalman Filter (EKF) and Particle Filter (PF), for non-stationary scenarios, and then proposing a Reinforcement Learning (RL)-based approach that can achieve the URLLC requirements in a typical intersection scenario. Simulation results based on a commercial ray-tracing simulator show that enhanced EKF and PF methods achieve packet delay more than $10$ ms, whereas the proposed deep RL-based method can reduce the latency to about $6$ ms, by extracting context information from the training data.

نظرية المعلومات التعلم الآلي معالجة الإشارات

Robust and Scalable Routing with Multi-Agent Deep Reinforcement Learning for MANETs

183 - Saeed Kaviani , Bo Ryu , Ejaz Ahmed 2021

Highly dynamic mobile ad-hoc networks (MANETs) are continuing to serve as one of the most challenging environments to develop and deploy robust, efficient, and scalable routing protocols. In this paper, we present DeepCQ+ routing which, in a novel ma nner, integrates emerging multi-agent deep reinforcement learning (MADRL) techniques into existing Q-learning-based routing protocols and their variants, and achieves persistently higher performance across a wide range of MANET configurations while training only on a limited range of network parameters and conditions. Quantitatively, DeepCQ+ shows consistently higher end-to-end throughput with lower overhead compared to its Q-learning-based counterparts with the overall gain of 10-15% in its efficiency. Qualitatively and more significantly, DeepCQ+ maintains remarkably similar performance gains under many scenarios that it was not trained for in terms of network sizes, mobility conditions, and traffic dynamics. To the best of our knowledge, this is the first successful demonstration of MADRL for the MANET routing problem that achieves and maintains a high degree of scalability and robustness even in the environments that are outside the trained range of scenarios. This implies that the proposed hybrid design approach of DeepCQ+ that combines MADRL and Q-learning significantly increases its practicality and explainability because the real-world MANET environment will likely vary outside the trained range of MANET scenarios.

بنية الشبكات والإنترنت الذكاء الاصطناعي التعلم الآلي

GAN-powered Deep Distributional Reinforcement Learning for Resource Management in Network Slicing

251 - Yuxiu Hua , Rongpeng Li , Zhifeng Zhao 2019

Network slicing is a key technology in 5G communications system. Its purpose is to dynamically and efficiently allocate resources for diversified services with distinct requirements over a common underlying physical infrastructure. Therein, demand-aw are resource allocation is of significant importance to network slicing. In this paper, we consider a scenario that contains several slices in a radio access network with base stations that share the same physical resources (e.g., bandwidth or slots). We leverage deep reinforcement learning (DRL) to solve this problem by considering the varying service demands as the environment state and the allocated resources as the environment action. In order to reduce the effects of the annoying randomness and noise embedded in the received service level agreement (SLA) satisfaction ratio (SSR) and spectrum efficiency (SE), we primarily propose generative adversarial network-powered deep distributional Q network (GAN-DDQN) to learn the action-value distribution driven by minimizing the discrepancy between the estimated action-value distribution and the target action-value distribution. We put forward a reward-clipping mechanism to stabilize GAN-DDQN training against the effects of widely-spanning utility values. Moreover, we further develop Dueling GAN-DDQN, which uses a specially designed dueling generator, to learn the action-value distribution by estimating the state-value distribution and the action advantage function. Finally, we verify the performance of the proposed GAN-DDQN and Dueling GAN-DDQN algorithms through extensive simulations.

التعلم الآلي الذكاء الاصطناعي بنية الشبكات والإنترنت