Cache Placement Optimization in Mobile Edge Computing Networks with Unaware Environment -- An Extended Multi-armed Bandit Approach

Added by Yuqi Han

Publication date 2021

Field Informatics Engineering

Research language English

Authors Yuqi Han - Rui Wang - Jun Wu

Networking and Internet Architecture

Caching frequently reused content at the edge servers of a mobile edge computing (MEC) network removes part of the backhaul transmission and thereby relieves data-traffic pressure. However, how to efficiently decide which contents each edge server should cache is still an open problem; the decision depends on the cache capacity of the edge servers, the popularity of each content, and the wireless channel quality during transmission. In this paper, we discuss the influence of unknown user density and content popularity on the cache placement solution at the edge server. Specifically, implementing a cache placement solution in a practical network requires solving two problems. First, estimating unknown user preferences requires a large number of records of users' previous requests. Second, the overlapping serving regions among edge servers lead to wrong estimates of user preferences, which hinders each server from deciding its cache placement individually. To address the first problem, we propose a learning-based solution that adaptively optimizes the cache placement policy. We develop the extended multi-armed bandit (Extended MAB), which combines the generalized global bandit (GGB) and the standard multi-armed bandit (MAB). For the second problem, we present a multi-agent Extended MAB-based solution that avoids the mis-estimation of parameters and achieves a decentralized cache placement policy. The proposed solution determines a primary time slot and a secondary time slot for each edge server. Mathematical analysis proves that the proposed strategies achieve bounded regret. Extensive simulations verify the optimality of the proposed strategies in comparison with baselines.
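To make the bandit view of cache placement concrete, the following is a minimal sketch of a standard UCB1-style learner, not the Extended MAB of the paper. It assumes a single edge server, a fixed cache capacity, and a hypothetical reward model in which each cached content yields a Bernoulli hit with an unknown popularity. The function name `ucb_cache_placement` and all parameters are illustrative assumptions.

```python
import math
import random


def ucb_cache_placement(popularity, capacity, rounds, seed=0):
    """Cache the `capacity` contents with the highest UCB index in each slot.

    `popularity` holds the hit probabilities, which are unknown to the learner
    and used only to simulate requests (an assumption of this sketch).
    Returns how often each content was cached and its estimated hit rate.
    """
    rng = random.Random(seed)
    n = len(popularity)
    counts = [0] * n      # number of slots in which each content was cached
    means = [0.0] * n     # running estimate of each content's hit rate

    for t in range(1, rounds + 1):

        def index(i):
            # Contents that were never cached get priority (forced exploration).
            if counts[i] == 0:
                return float("inf")
            return means[i] + math.sqrt(2.0 * math.log(t) / counts[i])

        # Fill the cache with the contents that have the largest indices.
        cached = sorted(range(n), key=index, reverse=True)[:capacity]

        for i in cached:
            # Simulated feedback: 1 if the cached content was requested.
            hit = 1.0 if rng.random() < popularity[i] else 0.0
            counts[i] += 1
            means[i] += (hit - means[i]) / counts[i]

    return counts, means


if __name__ == "__main__":
    counts, means = ucb_cache_placement([0.9, 0.8, 0.2, 0.1, 0.05],
                                        capacity=2, rounds=5000)
    print("times cached:", counts)
    print("estimated hit rates:", [round(m, 2) for m in means])
```

In this simplified setting the learner only observes feedback for the contents it caches, which is why the exploration bonus is needed. The overlapping coverage among edge servers described in the abstract is not modeled here.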

Spatio-temporal Edge Service Placement: A Bandit Learning Approach

Lixing Chen, Jie Xu, Shaolei Ren (2018)

Shared edge computing platforms deployed at the radio access network are expected to significantly improve the quality of service delivered by Application Service Providers (ASPs) in a flexible and economic way. However, placing edge service in every possible edge site is practically infeasible for an ASP due to the prohibitive budget requirement. In this paper, we investigate the edge service placement problem of an ASP under a limited budget, where the ASP dynamically rents computing/storage resources in edge sites to host its applications in close proximity to end users. Since the benefit of placing edge service in a specific site is usually unknown to the ASP a priori, optimal placement decisions must be made while learning this benefit. We pose this problem as a novel combinatorial contextual bandit learning problem. It is combinatorial because only a limited number of edge sites can be rented to provide the edge service given the ASP's budget. It is contextual because we utilize user context information to enable finer-grained learning and decision making. To solve this problem and optimize the edge computing performance, we propose SEEN, a Spatial-temporal Edge sErvice placemeNt algorithm. Furthermore, SEEN is extended to scenarios with overlapping service coverage by incorporating a disjunctively constrained knapsack problem. In both cases, we prove that our algorithm achieves a sublinear regret bound when compared to an oracle algorithm that knows the exact benefit information. Simulations are carried out on a real-world dataset, and the results show that SEEN significantly outperforms benchmark solutions.

Networking and Internet Architecture Artificial Intelligence

Outdoor mmWave Base Station Placement: A Multi-Armed Bandit Learning Approach

Fatih Erden, Chethan K. Anjinappa, Ender Ozturk (2020)

Base station (BS) placement in mobile networks is critical to the efficient use of resources in any communication system and one of the main factors that determines the quality of communication. Although there is ample literature on the optimum placement of BSs for sub-6 GHz bands, channel propagation characteristics, such as penetration loss, are notably different in millimeter-wave (mmWave) bands than in sub-6 GHz bands. Therefore, designated solutions are needed for mmWave systems to have reliable quality of service (QoS) assessment. This article proposes a multi-armed bandit (MAB) learning approach for the mmWave BS placement problem. The proposed solution performs viewshed analysis to identify the areas that are visible to a given BS location by considering the 3D geometry of the outdoor environments. Coverage probability, which is used as the QoS metric, is calculated using the appropriate path loss model depending on the viewshed analysis and a probabilistic blockage model and then fed to the MAB learning mechanism. The optimum BS location is then determined based on the expected reward that the candidate locations attain at the end of the training process. Unlike the optimization-based techniques, this method can capture the time-varying behavior of the channel and find the optimal BS locations that maximize long-term performance.

Signal Processing

Collaborative Multi-bitrate Video Caching and Processing in Mobile-Edge Computing Networks

Tuyen X. Tran, Parul Pandey, Abolfazl Hajisami (2016)

Recently, Mobile-Edge Computing (MEC) has arisen as an emerging paradigm that extends cloud-computing capabilities to the edge of the Radio Access Network (RAN) by deploying MEC servers right at the Base Stations (BSs). In this paper, we envision a collaborative joint caching and processing strategy for on-demand video streaming in MEC networks. Our design aims at enhancing the widely used Adaptive BitRate (ABR) streaming technology, where multiple bitra

Networking and Internet Architecture

Distributed Learning in Ad-Hoc Networks: A Multi-player Multi-armed Bandit Framework

Sumit J. Darak, Manjesh K. Hanawal (2020)

Next-generation networks are expected to be ultra-dense with a very high peak rate but relatively lower expected traffic per user. For such a scenario, existing central-controller-based resource allocation may incur substantial signaling (control communications), leading to a negative effect on the quality of service (e.g., dropped calls), energy efficiency, and spectrum efficiency. To overcome this problem, cognitive ad-hoc networks (CAHN) that share spectrum with other networks are being envisioned. They allow some users to identify and communicate in "free" slots, thereby reducing the signaling load and allowing a higher number of users per base station (dense networks). Such networks open up many interesting challenges, such as resource identification, coordination, and dynamic and context-aware adaptation, for which Machine Learning and Artificial Intelligence frameworks offer novel solutions. In this paper, we discuss state-of-the-art multi-player multi-armed bandit based distributed learning algorithms that allow users to adapt to the environment and coordinate with other players/users. We also discuss various open research problems for the feasible realization of CAHN and interesting applications in other domains such as energy harvesting, the Internet of Things, and smart grids.

Networking and Internet Architecture Machine Learning Signal Processing

Cache-Aided NOMA Mobile Edge Computing: A Reinforcement Learning Approach

Zhong Yang, Yuanwei Liu, Yue Chen (2019)

A novel non-orthogonal multiple access (NOMA) based cache-aided mobile edge computing (MEC) framework is proposed. To efficiently allocate communication and computation resources to users' computation task requests, we propose a long short-term memory (LSTM) network to predict the task popularity. Based on the predicted task popularity, a long-term reward maximization problem is formulated that involves a joint optimization of the task offloading decisions, computation resource allocation, and caching decisions. To tackle this challenging problem, a single-agent Q-learning (SAQ-learning) algorithm is invoked to learn a long-term resource allocation strategy. Furthermore, a Bayesian learning automata (BLA) based multi-agent Q-learning (MAQ-learning) algorithm is proposed for task offloading decisions. More specifically, a BLA based action selection scheme is proposed for the agents in MAQ-learning to select the optimal action in every state. We prove that the BLA based action selection scheme is instantaneously self-correcting and that the selected action is an optimal solution for each state. Extensive simulation results demonstrate that: 1) The prediction error of the proposed LSTM based task popularity prediction decreases with increasing learning rate. 2) The proposed framework significantly outperforms benchmarks such as all-local computing, all-offloading computing, and non-cache computing. 3) The proposed BLA based MAQ-learning achieves improved performance compared to conventional reinforcement learning algorithms.
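For readers unfamiliar with the Q-learning update that the abstract builds on, here is a minimal sketch of plain tabular Q-learning on a toy problem. It is not the SAQ-learning or BLA-based MAQ-learning algorithm of the paper. The chain environment, the function name `q_learning_chain`, and all hyperparameters are hypothetical choices made only for illustration.

```python
import random


def q_learning_chain(n_states=4, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.3, max_steps=100, seed=0):
    """Tabular Q-learning on a toy chain: start at state 0, goal at the end.

    Actions: 0 moves left (and stays put at state 0), 1 moves right.
    Reaching the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            s_next = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0

            # Q-learning target: reward plus discounted best next value;
            # the terminal state contributes no future value.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])

            s = s_next
            if s == goal:
                break
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(q_learning_chain()):
        print(state, round(left, 3), round(right, 3))
```

In the framework described in the abstract, the states, actions, and rewards would instead encode offloading, resource allocation, and caching decisions; the update rule itself has the same form.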

Signal Processing