Radio access network (RAN) slicing is an important part of network slicing in 5G. The evolving network architecture requires the orchestration of multiple network resources such as radio and cache resources. In recent years, machine learning (ML) techniques have been widely applied to network slicing. However, most existing works do not take advantage of the knowledge transfer capability of ML. In this paper, we propose a transfer reinforcement learning (TRL) scheme for joint radio and cache resource allocation to serve 5G RAN slicing. We first define a hierarchical architecture for the joint resource allocation. Then we propose two TRL algorithms: Q-value transfer reinforcement learning (QTRL) and action selection transfer reinforcement learning (ASTRL). In the proposed schemes, learner agents utilize the expert agents' knowledge to improve their performance on target tasks. The proposed algorithms are compared with both the model-free Q-learning and the model-based priority proportional fairness and time-to-live (PPF-TTL) algorithms. Compared with Q-learning, QTRL and ASTRL achieve 23.9% lower delay for the Ultra-Reliable Low-Latency Communications (URLLC) slice and 41.6% higher throughput for the enhanced Mobile Broadband (eMBB) slice, while converging significantly faster. Moreover, 40.3% lower URLLC delay and almost twice the eMBB throughput are observed with respect to PPF-TTL.
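The Q-value transfer idea behind QTRL can be pictured in a few lines: the learner's temporal-difference target is blended with the expert agent's Q-value for the same state-action pair. The sketch below is a minimal tabular illustration, not the paper's implementation; the blending weight transfer_weight and the toy dimensions are assumptions.

```python
import numpy as np

def q_value_transfer_update(learner_Q, expert_Q, s, a, r, s_next,
                            alpha=0.1, gamma=0.9, transfer_weight=0.5):
    # One QTRL-style update: the temporal-difference target is blended with the
    # expert agent's Q-value for the same state-action pair, so the learner
    # reuses expert knowledge while still learning from its own reward signal.
    td_target = r + gamma * np.max(learner_Q[s_next])
    blended = (1 - transfer_weight) * td_target + transfer_weight * expert_Q[s, a]
    learner_Q[s, a] += alpha * (blended - learner_Q[s, a])
    return learner_Q

# Tiny usage example on a 4-state, 2-action toy problem (illustrative only).
expert_Q = np.ones((4, 2))
learner_Q = np.zeros((4, 2))
learner_Q = q_value_transfer_update(learner_Q, expert_Q, s=0, a=1, r=0.5, s_next=2)
```

In practice transfer_weight would typically decay over time so that the learner gradually relies on its own estimates rather than the expert's.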
The paper presents a reinforcement learning solution to dynamic resource allocation for 5G radio access network slicing. Available communication resources (frequency-time blocks and transmit powers) and computational resources (processor usage) are allocated to stochastic arrivals of network slice requests. Each request arrives with priority (weight), throughput, computational resource, and latency (deadline) requirements, and if feasible, it is served with available communication and computational resources allocated over its requested duration. Because each resource allocation decision makes some of the resources temporarily unavailable in the future, a myopic solution that optimizes only the current allocation is ineffective for network slicing. Therefore, a Q-learning solution is presented to maximize the network utility, defined as the total weight of granted network slicing requests over a time horizon, subject to communication and computational constraints. Results show that reinforcement learning provides major improvements in the 5G network utility relative to myopic, random, and first-come-first-served solutions. While reinforcement learning sustains scalable performance as the number of served users increases, it can also be used effectively to assign resources to network slices when 5G needs to share the spectrum with incumbent users that may dynamically occupy some of the frequency-time blocks.
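As a rough illustration of such a Q-learning formulation, the sketch below maintains a tabular Q-function over discretized resource states and candidate allocation actions, where action 0 rejects the current request and the reward is the weight of a granted request. The toy environment, dimensions, and hyperparameters are assumptions for illustration, not the paper's simulator.

```python
import numpy as np

# Minimal tabular Q-learning sketch for admitting slice requests: states encode
# residual resources, actions pick an allocation (0 = reject). Illustrative only.
rng = np.random.default_rng(0)
n_states, n_actions = 20, 4
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.95, 0.1

def toy_env_step(state, action):
    # Reward is the (random) weight of a granted request, 0 if rejected/infeasible;
    # the next state reflects new arrivals and released resources.
    reward = 0.0 if action == 0 else float(rng.random() < 0.7) * action
    next_state = rng.integers(n_states)
    return reward, next_state

state = rng.integers(n_states)
for t in range(10_000):
    action = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[state]))
    reward, next_state = toy_env_step(state, action)
    Q[state, action] += alpha * (reward + gamma * Q[next_state].max() - Q[state, action])
    state = next_state
```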
Network slicing has emerged as a new business opportunity for operators, allowing them to sell customized slices to various tenants at different prices. In order to provide better-performing and cost-efficient services, network slicing involves challenging technical issues and calls for intelligent innovations that keep resource management consistent with users' activities in each slice. In that regard, deep reinforcement learning (DRL), which learns how to interact with the environment by trying alternative actions and reinforcing those that produce more rewarding consequences, is regarded as a promising solution. In this paper, after briefly reviewing the fundamental concepts of DRL, we investigate the application of DRL to typical resource management problems in network slicing scenarios, including radio resource slicing and priority-based core network slicing, and demonstrate the advantage of DRL over several competing schemes through extensive simulations. Finally, we discuss the challenges of applying DRL to network slicing from a general perspective.
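For the radio resource slicing case, a deep Q-network style agent can map per-slice demands to values of discrete bandwidth splits. The small PyTorch sketch below is only an illustration of that mapping; the architecture, dimensions, and state encoding are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

# Minimal DQN-style value network for bandwidth slicing: the state is a per-slice
# demand vector, each action is one discrete bandwidth split. Illustrative sizes.
class SliceDQN(nn.Module):
    def __init__(self, n_slices=3, n_splits=21):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_slices, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_splits),           # one Q-value per candidate split
        )

    def forward(self, demand):
        return self.net(demand)

q_net = SliceDQN()
demand = torch.tensor([[12.0, 3.0, 7.0]])      # e.g., packets per slice in the last window
best_split = q_net(demand).argmax(dim=1)       # greedy choice of bandwidth allocation
```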
Reinforcement learning (RL) for network slicing is considered in the 5G radio access network, where the base station, gNodeB, allocates resource blocks (RBs) to the requests of user equipments and maximizes the total reward of accepted requests over time. Based on adversarial machine learning, a novel over-the-air attack is introduced to manipulate the RL algorithm and disrupt 5G network slicing. Subject to an energy budget, the adversary observes the spectrum and builds its own RL-based surrogate model that selects which RBs to jam, with the objective of maximizing the number of network slicing requests that fail due to jammed RBs. By jamming the RBs, the adversary reduces the RL algorithm's reward. As this reward is used as the input to update the RL algorithm, the performance does not recover even after the adversary stops jamming. This attack is evaluated in terms of the recovery time and the (maximum and total) reward loss, and it is shown to be much more effective than benchmark (random and myopic) jamming attacks. Different reactive and proactive defense mechanisms (protecting the RL algorithm's updates or misleading the adversary's learning process) are introduced to show that it is viable to defend 5G network slicing against this attack.
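The adversary's surrogate model can be pictured as an ordinary Q-table over observed RB-occupancy patterns whose actions either jam one RB or stay idle, with jamming disabled once the energy budget runs out. The sketch below is an assumption-laden illustration of that idea, not the attack implementation from the paper; the pattern encoding, costs, and reward signal are placeholders.

```python
import numpy as np

# Illustrative surrogate jammer: Q-values over observed RB-occupancy patterns,
# actions 0..n_rbs-1 jam the corresponding RB, action n_rbs stays idle.
n_patterns, n_rbs = 16, 8
Q_adv = np.zeros((n_patterns, n_rbs + 1))
energy_budget, jam_cost = 100.0, 1.0

def adversary_action(pattern, energy_left, eps=0.1):
    if energy_left < jam_cost:                 # budget exhausted: forced idle
        return n_rbs
    if np.random.rand() < eps:                 # exploration over RBs and idling
        return np.random.randint(n_rbs + 1)
    return int(np.argmax(Q_adv[pattern]))      # jam the RB expected to fail most requests
```

The surrogate's reward (e.g., inferred failed requests) would then update Q_adv with a standard Q-learning rule, exactly as in the earlier tabular sketch.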
In this paper, we propose a joint radio and core resource allocation framework for network function virtualization (NFV)-enabled networks. In the proposed system model, the goal is to maximize energy efficiency (EE) while guaranteeing end-to-end (E2E) quality of service (QoS) for different service types. To this end, we formulate an optimization problem in which power and spectrum resources are allocated in the radio part. In the core part, the chaining, placement, and scheduling of functions are performed to ensure the QoS of all users. This joint optimization problem is modeled as a Markov decision process (MDP), considering the time-varying characteristics of the available resources and wireless channels. A soft actor-critic deep reinforcement learning (SAC-DRL) algorithm based on the maximum entropy framework is subsequently utilized to solve the above MDP. Numerical results reveal that the proposed joint approach based on the SAC-DRL algorithm significantly reduces energy consumption compared to the case in which the radio and NFV resource allocation (R-RA and NFV-RA) problems are optimized separately.
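The maximum entropy framework behind SAC can be illustrated by its actor objective, which trades off expected soft Q-value against policy entropy. The discrete-action sketch below only illustrates that objective; the temperature alpha and the dimensions are assumptions, and a full SAC-DRL agent additionally trains critics and, typically, the temperature itself.

```python
import torch
import torch.nn.functional as F

# SAC-style actor loss for a discrete action space: minimize
# E_{a~pi}[ alpha * log pi(a|s) - Q(s,a) ], i.e., prefer high soft Q-values
# while the entropy bonus (weighted by alpha) keeps exploration alive.
def sac_discrete_actor_loss(logits, q_values, alpha=0.2):
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    return (probs * (alpha * log_probs - q_values)).sum(dim=-1).mean()

logits = torch.randn(4, 6, requires_grad=True)   # batch of 4 states, 6 actions
q_values = torch.randn(4, 6)                     # soft Q estimates from the critic
loss = sac_discrete_actor_loss(logits, q_values)
loss.backward()                                  # gradients flow into the policy logits
```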
Network slicing is a key technology in 5G communication systems. Its purpose is to dynamically and efficiently allocate resources for diversified services with distinct requirements over a common underlying physical infrastructure. Therein, demand-aware resource allocation is of significant importance to network slicing. In this paper, we consider a scenario that contains several slices in a radio access network with base stations that share the same physical resources (e.g., bandwidth or slots). We leverage deep reinforcement learning (DRL) to solve this problem by treating the varying service demands as the environment state and the allocated resources as the environment action. In order to reduce the effects of the randomness and noise embedded in the received service level agreement (SLA) satisfaction ratio (SSR) and spectrum efficiency (SE), we first propose a generative adversarial network-powered deep distributional Q network (GAN-DDQN), which learns the action-value distribution by minimizing the discrepancy between the estimated and target action-value distributions. We put forward a reward-clipping mechanism to stabilize GAN-DDQN training against the effects of widely-spanning utility values. Moreover, we further develop Dueling GAN-DDQN, which uses a specially designed dueling generator to learn the action-value distribution by estimating the state-value distribution and the action advantage function. Finally, we verify the performance of the proposed GAN-DDQN and Dueling GAN-DDQN algorithms through extensive simulations.
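The distributional idea in GAN-DDQN can be sketched as a generator that maps a state plus noise to samples of the return for each action, with the greedy action taken with respect to the sample mean. The network sizes, noise dimension, and state encoding below are illustrative assumptions rather than the paper's architecture or training procedure.

```python
import torch
import torch.nn as nn

# Generator producing return samples per action from (state, noise); the greedy
# action is the one whose sampled return distribution has the highest mean.
class ReturnGenerator(nn.Module):
    def __init__(self, state_dim=3, n_actions=5, noise_dim=8, n_samples=32):
        super().__init__()
        self.n_samples, self.noise_dim = n_samples, noise_dim
        self.net = nn.Sequential(
            nn.Linear(state_dim + noise_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state):
        # Draw several noise vectors per state to obtain return samples per action.
        s = state.repeat_interleave(self.n_samples, dim=0)
        z = torch.randn(s.size(0), self.noise_dim)
        samples = self.net(torch.cat([s, z], dim=-1))
        return samples.view(-1, self.n_samples, samples.size(-1))  # (batch, samples, actions)

gen = ReturnGenerator()
state = torch.tensor([[0.4, 0.7, 0.2]])          # e.g., normalized per-slice demand
action = gen(state).mean(dim=1).argmax(dim=1)    # greedy w.r.t. the estimated mean return
```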