A novel non-orthogonal multiple access (NOMA) based cache-aided mobile edge computing (MEC) framework is proposed. To efficiently allocate communication and computation resources to users' computation task requests, we propose a long short-term memory (LSTM) network to predict the task popularity. Based on the predicted task popularity, a long-term reward maximization problem is formulated that involves a joint optimization of the task offloading decisions, computation resource allocation, and caching decisions. To tackle this challenging problem, a single-agent Q-learning (SAQ-learning) algorithm is invoked to learn a long-term resource allocation strategy. Furthermore, a Bayesian learning automata (BLA) based multi-agent Q-learning (MAQ-learning) algorithm is proposed for the task offloading decisions. More specifically, a BLA-based action selection scheme is proposed for the agents in MAQ-learning to select the optimal action in every state. We prove that the BLA-based action selection scheme is instantaneously self-correcting and that the selected action is optimal for each state. Extensive simulation results demonstrate that: 1) the prediction error of the proposed LSTM-based task popularity prediction decreases with an increasing learning rate; 2) the proposed framework significantly outperforms benchmarks such as all-local computing, all-offloading computing, and non-cache computing; and 3) the proposed BLA-based MAQ-learning achieves improved performance compared to conventional reinforcement learning algorithms.
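As a rough illustration of the idea, the sketch below combines a tabular Q-learning update with a BLA-style action selector that keeps a Beta posterior per state-action pair and acts greedily on posterior samples. All names and dimensions (n_states, n_actions, the binary improved feedback) are illustrative assumptions, not the paper's actual design.

```python
# Minimal sketch: tabular Q-learning whose per-state action selection follows
# a Bayesian learning automaton (BLA). Interfaces are assumptions for
# illustration, not the paper's implementation.
import numpy as np

class BLAQAgent:
    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9):
        self.Q = np.zeros((n_states, n_actions))
        # Beta(a, b) posterior per (state, action) on "this action is good".
        self.a = np.ones((n_states, n_actions))
        self.b = np.ones((n_states, n_actions))
        self.alpha, self.gamma = alpha, gamma

    def select_action(self, s):
        # BLA step: draw one sample from each action's Beta posterior and
        # act greedily on the samples (self-correcting exploration).
        samples = np.random.beta(self.a[s], self.b[s])
        return int(np.argmax(samples))

    def update(self, s, act, r, s_next, improved):
        # Standard Q-learning temporal-difference update.
        td = r + self.gamma * self.Q[s_next].max() - self.Q[s, act]
        self.Q[s, act] += self.alpha * td
        # BLA posterior update from binary feedback on the chosen action.
        if improved:
            self.a[s, act] += 1
        else:
            self.b[s, act] += 1
```

Here, the binary feedback improved could, for instance, indicate whether the temporal-difference error was positive, so that consistently helpful actions accumulate evidence and the selector self-corrects toward them.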
Given the proliferation of wireless sensors and smart mobile devices, an explosive escalation of data volume is anticipated. However, restricted by their limited physical sizes and low manufacturing costs, these wireless devices tend to have limited computational capabilities and battery lives. To overcome this limitation, wireless devices may offload their computational tasks to nearby computing nodes at the network edge in mobile edge computing (MEC). At the time of writing, the benefits of MEC systems have not been fully exploited, predominantly because the computation offloading link is still far from perfect. In this article, we propose to enhance MEC systems by exploiting the emerging technique of reconfigurable intelligent surfaces (RISs), which are capable of reconfiguring the wireless propagation environment and hence enhancing the offloading links. The benefits of RISs can be maximized by jointly optimizing the RISs as well as the communication and computing resource allocation of MEC systems. Unfortunately, this joint optimization imposes new research challenges on the system design. Against this background, this article provides an overview of RIS-assisted MEC systems and highlights four use cases as well as their design challenges and solutions. Finally, their performance is characterized with the aid of a specific case study, followed by a range of future research ideas.
This letter investigates a sum rate maximization problem in an intelligent reflective surface (IRS) assisted non-orthogonal multiple access (NOMA) downlink network. Specifically, the sum rate of all the users is maximized by jointly optimizing the beams at the base station and the phase shifts at the IRS. Deep reinforcement learning (DRL), which has achieved massive successes, is applied to solve this sum rate maximization problem. In particular, an algorithm based on the deep deterministic policy gradient (DDPG) is proposed. Both the random channel case and the fixed channel case are studied in this letter. The simulation results illustrate that the DDPG-based algorithm achieves competitive performance in both cases.
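For concreteness, the following is a minimal sketch of the DDPG actor-critic pair such a design might use, with the actor emitting the base station beams and the IRS phase shifts, and the critic scoring state-action pairs. The layer sizes, state dimension, and single-beam action layout are assumptions made purely for illustration.

```python
# Sketch of a DDPG actor-critic pair for joint beamforming / IRS phase-shift
# control. Dimensions and the environment interface are assumptions.
import torch
import torch.nn as nn

class Actor(nn.Module):
    def __init__(self, state_dim, n_antennas, n_elements):
        super().__init__()
        # Action = real/imag parts of a BS beam plus one phase per IRS element.
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, 2 * n_antennas + n_elements), nn.Tanh(),
        )
        self.n_elements = n_elements

    def forward(self, state):
        out = self.net(state)
        beams = out[..., :-self.n_elements]   # in [-1, 1]; scale to the power budget downstream
        phases = torch.pi * (out[..., -self.n_elements:] + 1)  # map [-1, 1] to [0, 2*pi]
        return beams, phases

class Critic(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
            nn.Linear(256, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))  # Q(s, a)

def soft_update(target, source, tau=0.005):
    # Polyak averaging of target-network weights, DDPG's stabilizer.
    for t, s in zip(target.parameters(), source.parameters()):
        t.data.mul_(1 - tau).add_(tau * s.data)
```

A full agent would add a replay buffer, exploration noise on the actor's output, and per-step gradient updates of both networks, with soft_update keeping the target networks slowly tracking the learned ones.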
In this paper, we explore optimization-based and data-driven solutions in a reconfigurable intelligent surface (RIS)-aided multi-user mobile edge computing (MEC) system, where the user equipments (UEs) can partially offload their computation tasks to the access point (AP). We aim at maximizing the total completed task-input bits (TCTB) of all UEs with limited energy budgets during a given time slot, through jointly optimizing the RIS reflecting coefficients, the AP's receive beamforming vectors, and the UEs' energy partition strategies for local computing and offloading. A three-step block coordinate descent (BCD) algorithm is first proposed to effectively solve the non-convex TCTB maximization problem with guaranteed convergence. In order to reduce the computational complexity and facilitate lightweight online implementation of the optimization algorithm, we further construct two deep learning architectures. The first one takes channel state information (CSI) as input, while the second one exploits the UEs' locations only for online inference. The two data-driven approaches are trained via supervised learning using data samples generated by the BCD algorithm. Our simulation results reveal a close match between the performance of the optimization-based BCD algorithm and the low-complexity learning-based architectures, all with superior performance to existing schemes under both perfect and imperfect input features. Importantly, the location-only deep learning method is shown to offer a particularly practical and robust solution that alleviates the need for CSI estimation and feedback when line-of-sight (LoS) direct links exist between the UEs and the AP.
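As an illustration of the second, location-only architecture, the sketch below trains a small fully connected network to map UE coordinates to the flattened BCD solution via supervised regression; the layer sizes, output encoding, and data loader interface are assumptions for illustration rather than the paper's exact design.

```python
# Sketch of a location-only network trained on (UE locations -> BCD solution)
# pairs. Architecture and data interface are illustrative assumptions.
import torch
import torch.nn as nn

def make_location_net(n_ues, out_dim):
    # Input: 2-D coordinates of every UE; output: flattened RIS coefficients,
    # receive beamformers, and energy-partition ratios (the BCD labels).
    return nn.Sequential(
        nn.Linear(2 * n_ues, 128), nn.ReLU(),
        nn.Linear(128, 128), nn.ReLU(),
        nn.Linear(128, out_dim),
    )

def train(net, loader, epochs=50, lr=1e-3):
    # Supervised regression on samples generated offline by the BCD solver.
    opt = torch.optim.Adam(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for locations, bcd_solution in loader:
            opt.zero_grad()
            loss = loss_fn(net(locations), bcd_solution)
            loss.backward()
            opt.step()
```

At inference time the network replaces the iterative BCD solver: a single forward pass on the UE locations yields the resource allocation decisions.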
In this paper, a joint task, spectrum, and transmit power allocation problem is investigated for a wireless network in which the base stations (BSs) are equipped with mobile edge computing (MEC) servers to jointly provide computational and communication services to users. Each user can request one of three types of computational tasks. Since the data size of each computational task is different, the BSs must adjust their resource (subcarrier and transmit power) and task allocation schemes as the requested computational tasks vary, in order to effectively serve the users. This problem is formulated as an optimization problem whose goal is to minimize the maximal computational and transmission delay among all users. A multi-stack reinforcement learning (RL) algorithm is developed to solve this problem. Using the proposed algorithm, each BS can record the historical resource allocation schemes and users' information in its multiple stacks to avoid learning the same resource allocation scheme and users' states, thus improving the convergence speed and learning efficiency. Simulation results illustrate that the proposed algorithm can reduce the number of iterations needed for convergence and the maximal delay among all users by up to 18% and 11.1%, respectively, compared to the standard Q-learning algorithm.
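A minimal sketch of how such multi-stack bookkeeping could be wired into tabular Q-learning is given below: the learner pushes each tried (state, action, reward) record onto one of several stacks and biases exploration away from schemes it has already evaluated. The concrete data structures and the epsilon-greedy rule are assumptions for illustration, not the paper's algorithm.

```python
# Sketch: Q-learning with per-BS stacks recording tried allocation schemes,
# so exploration skips already-evaluated (state, action) pairs.
import numpy as np

class MultiStackQLearner:
    def __init__(self, n_states, n_actions, n_stacks=4, eps=0.1,
                 alpha=0.1, gamma=0.9):
        self.Q = np.zeros((n_states, n_actions))
        self.stacks = [[] for _ in range(n_stacks)]  # one stack per record type
        self.seen = set()                            # fast membership test
        self.eps, self.alpha, self.gamma = eps, alpha, gamma

    def select_action(self, s):
        # Epsilon-greedy, but random exploration only among untried schemes.
        fresh = [a for a in range(self.Q.shape[1]) if (s, a) not in self.seen]
        if fresh and np.random.rand() < self.eps:
            return int(np.random.choice(fresh))
        return int(np.argmax(self.Q[s]))

    def update(self, s, a, r, s_next, stack_id=0):
        # Standard temporal-difference update.
        td = r + self.gamma * self.Q[s_next].max() - self.Q[s, a]
        self.Q[s, a] += self.alpha * td
        self.stacks[stack_id].append((s, a, r))  # historical record per stack
        self.seen.add((s, a))
```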
Mobile networks are experiencing a tremendous increase in data volume and user density. An efficient technique to alleviate this issue is to bring the data closer to the users by exploiting the caches of edge network nodes, such as fixed or mobile access points and even user devices. Meanwhile, the fusion of machine learning and wireless networks offers a viable way of performing network optimization, as opposed to traditional optimization approaches which incur high complexity or fail to provide optimal solutions. Among the various machine learning categories, reinforcement learning operates in an online and autonomous manner without relying on large sets of historical data for training. In this survey, reinforcement learning-aided mobile edge caching is presented, aiming at highlighting the achieved network gains over conventional caching approaches. Taking into account the heterogeneity of sixth generation (6G) networks in various wireless settings, such as fixed, vehicular, and flying networks, learning-aided edge caching is presented, departing from traditional architectures. Furthermore, a categorization according to the desired performance metric, such as spectral, energy, and caching efficiency, average delay, and backhaul and fronthaul offloading, is provided. Finally, several open issues are discussed, in order to stimulate further interest in this important research field.