In this paper, we design a novel scheduling and resource allocation algorithm for a smart mobile edge computing (MEC) assisted radio access network. Unlike previous network designs based on energy efficiency (EE) or the average age of information (AAoI), we propose a unified metric for simultaneously optimizing ESE and AAoI of the network. To further improve the system capacity, non-orthogonal multiple access (NOMA) is considered, as it is a candidate multiple access scheme for future cellular networks. Our main aim is to maximize the long-term objective function under AoI, NOMA, and resource capacity constraints using stochastic optimization. To overcome the complexity and unknown dynamics of the network parameters (e.g., wireless channel and interference), we apply reinforcement learning and implement a deep Q-network (DQN). Simulation results illustrate the effectiveness of the proposed framework and show how different parameters affect network performance. According to the results, training with the proposed reward function converges quickly with a negligible loss value. The results also show that our work outperforms existing state-of-the-art baselines by, for example, up to 64% in the objective function and 51% in AAoI.
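As a rough illustration of the AAoI quantity mentioned above, the following minimal sketch computes the time-averaged age of information of a single source over discrete slots. The slotted model, the reset-to-one rule on a successful delivery, and the function name `average_aoi` are assumptions made for illustration; they are not taken from the paper.

```python
def average_aoi(delivered, initial_age=1):
    """Average age of information over a sequence of time slots.

    delivered: iterable of booleans, True if a fresh update is received
               in that slot (assumed one-slot transmission delay).
    initial_age: age at the start of the horizon (hypothetical default).
    """
    delivered = list(delivered)
    if not delivered:
        raise ValueError("need at least one time slot")
    age = initial_age
    total = 0
    for ok in delivered:
        # A delivered update resets the age to one slot; otherwise it grows.
        age = 1 if ok else age + 1
        total += age
    return total / len(delivered)
```

For example, `average_aoi([False, False, True, False, True])` averages the per-slot ages 2, 3, 1, 2, 1.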
This letter investigates the application of deep deterministic policy gradient (DDPG) to intelligent reflecting surface (IRS) based unmanned aerial vehicle (UAV) assisted non-orthogonal multiple access (NOMA) downlink networks. Deploying a UAV equipped with an IRS is important because the UAV significantly increases the flexibility of the IRS, especially for users who have no line-of-sight (LoS) path to the base station (BS). Therefore, the aim of this letter is to maximize the sum rate by jointly optimizing the power allocation of the BS, the phase shifts of the IRS, and the horizontal position of the UAV. Because the formulated problem is nonconvex, the DDPG algorithm is used to solve it. Computer simulation results show the superior performance of the proposed DDPG-based algorithm.
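To make the sum-rate objective concrete, here is a minimal sketch of the textbook two-user downlink NOMA rate expressions with perfect successive interference cancellation (SIC) at the strong user. It ignores the IRS phase shifts and the UAV position entirely, and the function name, the parameter names, and the two-user restriction are assumptions for illustration only.

```python
import math

def noma_two_user_rates(power, alpha_strong, gain_strong, gain_weak, noise):
    """Rates (bit/s/Hz) of a two-user downlink NOMA pair.

    power: total BS transmit power.
    alpha_strong: fraction of power given to the strong user, in (0, 1).
    gain_strong, gain_weak: channel power gains |h|^2 of the two users.
    noise: receiver noise power.
    """
    if not 0.0 < alpha_strong < 1.0:
        raise ValueError("alpha_strong must lie strictly between 0 and 1")
    p_strong = alpha_strong * power
    p_weak = (1.0 - alpha_strong) * power
    # Weak user decodes its own signal, treating the strong user's as noise.
    sinr_weak = p_weak * gain_weak / (p_strong * gain_weak + noise)
    # Strong user removes the weak user's signal via SIC (assumed perfect).
    sinr_strong = p_strong * gain_strong / noise
    return math.log2(1.0 + sinr_strong), math.log2(1.0 + sinr_weak)
```

The sum rate is the sum of the two returned values; a learning agent such as DDPG would treat a quantity like this as (part of) its reward.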
This paper investigates the problem of age of information (AoI) aware radio resource management for a platooning system. Multiple autonomous platoons exploit the cellular vehicle-to-everything (C-V2X) communication technology to disseminate cooperative awareness messages (CAMs) to their followers while ensuring timely delivery of safety-critical messages to the Road-Side Unit (RSU). Because channel conditions are dynamic, centralized resource management schemes that require global information are inefficient and lead to large signaling overheads. Hence, we exploit a distributed resource allocation framework based on multi-agent reinforcement learning (MARL), where each platoon leader (PL) acts as an agent and interacts with the environment to learn its optimal policy. Existing MARL algorithms consider a holistic reward function for the group's collective success, which often yields unsatisfactory results and cannot guarantee an optimal policy for each agent. Consequently, motivated by the existing literature in RL, we propose a novel MARL framework that trains two critics with the following goals: a global critic that estimates the global expected reward and motivates the agents toward cooperative behavior, and an exclusive local critic for each agent that estimates the local individual reward. Furthermore, based on the tasks each agent has to accomplish, the individual reward of each agent is decomposed into multiple sub-reward functions, for which task-wise value functions are learned separately. Numerical results indicate the effectiveness of our proposed algorithm compared with the conventional RL methods applied in this area.
The combination of non-orthogonal multiple access (NOMA) and mobile edge computing (MEC) can significantly improve the spectrum efficiency beyond the fifth-generation network. In this paper, we mainly focus on energy-efficient resource allocation for a multi-user, multi-BS NOMA-assisted MEC network with imperfect channel state information (CSI), in which each user can upload its tasks to multiple base stations (BSs) for remote execution. To minimize the energy consumption, we jointly optimize the task assignment, power allocation, and user association. As the main contribution, the optimal closed-form expressions of task assignment and power allocation are analytically derived for the two-BS case under imperfect CSI. Specifically, the original formulated problem is nonconvex. We first transform the probabilistic problem into a non-probabilistic one. Subsequently, a bilevel programming method is proposed to derive the optimal solution. In addition, by combining a matching algorithm with the optimal task and power allocation, we propose a low-complexity algorithm to efficiently optimize user association for the multi-user, multi-BS case. Simulations demonstrate that the proposed algorithm not only yields much better performance than the conventional orthogonal multiple access (OMA) scheme but also achieves results identical to those of exhaustive search, with lower complexity, when the number of BSs is small.
Grant-free non-orthogonal multiple access (GF-NOMA) is a potential technique to support massive Ultra-Reliable and Low-Latency Communication (mURLLC) service. However, dynamic resource configuration in GF-NOMA systems is challenging due to random traffic and collisions, which are unknown at the base station (BS). Meanwhile, joint consideration of the latency and reliability requirements makes the resource configuration of GF-NOMA for mURLLC more complex. To address this problem, we develop a general learning framework for signature-based GF-NOMA in mURLLC service, taking into account the multiple access signature collision, the UE detection, and the data decoding procedures for the K-repetition GF and the Proactive GF schemes. The goal of our learning framework is to maximize the long-term average number of successfully served users (UEs) under the latency constraint. We first perform a real-time repetition value configuration based on a double deep Q-Network (DDQN) and then propose a Cooperative Multi-Agent learning technique based on the DQN (CMA-DQN) to optimize the configuration of both the repetition values and the contention-transmission unit (CTU) numbers. Our results show that, under the same latency constraint, the number of successfully served UEs in our proposed learning framework is up to ten times that obtained with fixed repetition values and CTU numbers for the K-repetition scheme, and up to two times for the Proactive scheme. In addition, the superior performance of CMA-DQN over the conventional load estimation-based approach (LE-URC) demonstrates its capability for dynamic configuration in the long term. Importantly, our general learning framework can be used to optimize the resource configuration problems in all signature-based GF-NOMA schemes.
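For readers unfamiliar with the double DQN idea referred to above, the following minimal sketch computes the standard double DQN bootstrap targets for a batch of transitions: the online network selects the next action and the target network evaluates it. The array layout, the function name, and the default discount factor are assumptions for illustration and are not the paper's implementation.

```python
import numpy as np

def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Standard double DQN targets for a batch of transitions.

    rewards, dones: arrays of shape (batch,); dones is 1.0 for terminal steps.
    q_online_next, q_target_next: arrays of shape (batch, n_actions) holding
        Q-values of the next state from the online and target networks.
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)
    # Action selection uses the online network ...
    best_actions = np.argmax(q_online_next, axis=1)
    # ... while action evaluation uses the target network.
    evaluated = q_target_next[np.arange(len(best_actions)), best_actions]
    # Terminal transitions do not bootstrap from the next state.
    return rewards + gamma * (1.0 - dones) * evaluated
```

Decoupling selection from evaluation in this way is what distinguishes double DQN from the plain DQN target, which takes the maximum of the target network's own Q-values.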
Multi-access edge computing (MEC) can enhance the computing capability of mobile devices, while non-orthogonal multiple access (NOMA) can provide high data rates. Combining these two strategies can improve both the spectrum efficiency and the energy efficiency of the network. In this paper, we investigate task delay minimization in multi-user NOMA-MEC networks, where multiple users can offload their tasks simultaneously over the same frequency band. We adopt the partial offloading policy, in which each user can partition its computation task into an offloaded part and a locally computed part. We aim to minimize the task delay among users by optimizing their task partition ratios and offloading transmit power. The delay minimization problem is first formulated and shown to be nonconvex. By carefully investigating its structure, we transform the original problem into an equivalent quasi-convex one. A bisection search iterative algorithm is then proposed to achieve the minimum task delay. To reduce the complexity of the proposed algorithm and evaluate its optimality, we further derive closed-form expressions for the optimal task partition ratio and offloading power for the case of two-user NOMA-MEC networks. Simulations demonstrate the convergence and optimality of the proposed algorithm and the effectiveness of the closed-form analysis.
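The bisection idea can be illustrated on a deliberately simplified toy model: a single user with a fixed uplink rate, where the delay is the larger of the local computing time and the offloading time, and edge computing time is neglected. This is not the multi-user NOMA formulation of the paper; the model, the function name `min_delay_bisection`, and all parameter names are assumptions made for illustration.

```python
def min_delay_bisection(bits, cycles_per_bit, f_local, rate, tol=1e-9):
    """Bisection on the delay target for a toy partial-offloading model.

    bits: task size in bits; cycles_per_bit: CPU cycles needed per bit.
    f_local: local CPU frequency (cycles/s); rate: uplink rate (bit/s).
    Returns (minimum delay in seconds, fraction of the task offloaded).
    """
    def feasible(t):
        # Local part must finish within t: (1 - a) * bits * cycles / f <= t.
        lowest_ratio = max(0.0, 1.0 - t * f_local / (bits * cycles_per_bit))
        # Offloaded part must be transmitted within t: a * bits / rate <= t.
        highest_ratio = min(1.0, t * rate / bits)
        return lowest_ratio <= highest_ratio

    low = 0.0
    high = bits * cycles_per_bit / f_local  # all-local delay is always feasible
    while high - low > tol:
        mid = 0.5 * (low + high)
        if feasible(mid):
            high = mid  # target achievable, try a smaller delay
        else:
            low = mid   # target too ambitious, relax it
    return high, min(1.0, high * rate / bits)
```

In this toy model the optimum can also be checked by hand: both parts finish simultaneously, giving a delay of `bits * cycles_per_bit / (f_local + cycles_per_bit * rate)`, which plays the same verification role that the closed-form two-user expressions play in the abstract.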