When Multiple Agents Learn to Schedule: A Distributed Radio Resource Management Framework


Published by Navid Naderializadeh

Publication date: 2019

Research field: Informatics Engineering

Paper language: English

Authors: Navid Naderializadeh, Jaroslaw Sydir, Meryem Simsek



Interference among concurrent transmissions in a wireless network is a key factor limiting the system performance. One way to alleviate this problem is to manage the radio resources in order to maximize either the average or the worst-case performance. However, joint consideration of both metrics is often neglected as they are competing in nature. In this article, a mechanism for radio resource management using multi-agent deep reinforcement learning (RL) is proposed, which strikes the right trade-off between maximizing the average and the $5^{th}$ percentile user throughput. Each transmitter in the network is equipped with a deep RL agent, receiving partial observations from the network (e.g., channel quality, interference level, etc.) and deciding whether to be active or inactive at each scheduling interval for given radio resources, a process referred to as link scheduling. Based on the actions of all agents, the network emits a reward to the agents, indicating how good their joint decisions were. The proposed framework enables the agents to make decisions in a distributed manner, and the reward is designed in such a way that the agents strive to guarantee a minimum performance, leading to a fair resource allocation among all users across the network. Simulation results demonstrate the superiority of our approach compared to decentralized baselines in terms of average and $5^{th}$ percentile user throughput, while achieving performance close to that of a centralized exhaustive search approach. Moreover, the proposed framework is robust to mismatches between training and testing scenarios. In particular, it is shown that an agent trained on a network with low transmitter density maintains its performance and outperforms the baselines when deployed in a network with a higher transmitter density.
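The two metrics that the abstract trades off can be illustrated with a small sketch. It assumes unit transmit power, a Shannon-rate model, and a hand-written on/off schedule; the function names `link_rates` and `summarize`, the noise level, and the gain matrix layout are illustrative assumptions, not the paper's simulator or reward function.

```python
import math
from statistics import mean, quantiles

def link_rates(gains, active, noise=1e-3):
    """Rate of each link in one scheduling interval (assumed Shannon model).

    gains[i][j] is the channel gain from transmitter j to receiver i;
    active[j] is 1 if transmitter j is on in this interval, else 0.
    """
    n = len(gains)
    rates = []
    for i in range(n):
        signal = gains[i][i] * active[i]
        interference = sum(gains[i][j] * active[j] for j in range(n) if j != i)
        rates.append(math.log2(1.0 + signal / (noise + interference)))
    return rates

def summarize(schedule, gains, noise=1e-3):
    """Average and 5th percentile of per-user throughput over a schedule.

    schedule is a list of on/off vectors, one per scheduling interval.
    """
    per_interval = [link_rates(gains, a, noise) for a in schedule]
    per_user = [mean(col) for col in zip(*per_interval)]
    fifth = quantiles(per_user, n=20, method="inclusive")[0]
    return mean(per_user), fifth

# Two strongly interfering links: compare "always on" with time sharing.
gains = [[1.0, 0.9], [0.9, 1.0]]
always_on = summarize([[1, 1], [1, 1]], gains, noise=0.01)
time_shared = summarize([[1, 0], [0, 1]], gains, noise=0.01)
```

In this toy case, switching the links on alternately improves both metrics over leaving both transmitters on, which is the kind of joint on/off decision the agents are described as learning.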


76 - Matteo Pagin, Tommaso Zugno, Michele Polese 2021

The next generations of mobile networks will be deployed as ultra-dense networks, to match the demand for increased capacity and the challenges that communications in the higher portion of the spectrum (i.e., the mmWave band) introduce. Ultra-dense networks, however, require pervasive, high-capacity backhaul solutions, and deploying fiber optic to all base stations is generally considered to be too expensive for network operators. The 3rd Generation Partnership Project (3GPP) has thus introduced Integrated Access and Backhaul (IAB), a wireless backhaul solution in which the access and backhaul links share the same hardware, protocol stack, and also spectrum. The multiplexing of different links in the same frequency bands, however, introduces interference and capacity sharing issues, thus calling for the introduction of advanced scheduling and coordination schemes. This paper proposes a semi-centralized resource allocation scheme for IAB networks, designed to be flexible, with low complexity, and compliant with the 3GPP IAB specifications. We develop a version of the Maximum Weighted Matching (MWM) problem that can be applied on a spanning tree that represents the IAB network and whose complexity is linear in the number of IAB-nodes. The proposed solution is compared with state-of-the-art distributed approaches through end-to-end, full-stack system-level simulations with a 3GPP-compliant channel model, protocol stack, and a diverse set of user applications. The results show that our scheme can increase the throughput of cell-edge users up to 5 times, while decreasing the overall network congestion with an end-to-end delay reduction of up to 25 times.
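To make the linear-complexity claim concrete, the following is a minimal sketch of the textbook dynamic program for maximum weighted matching on a rooted tree. It is a generic illustration, not the scheme proposed in the paper above; the `children` and `weight` structures and the function name are assumptions chosen for the example.

```python
def max_weight_matching_on_tree(children, weight, root=0):
    """Total weight of a maximum weighted matching in a rooted tree.

    children maps a node to the list of its child nodes;
    weight maps a (parent, child) pair to the weight of that link.
    Each link is examined once, so the cost is linear in the number of nodes.
    """
    # Pre-order traversal; reversing it visits children before parents.
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))

    free = {}  # best value in v's subtree when v is not matched to a child
    best = {}  # best value in v's subtree overall
    for v in reversed(order):
        kids = children.get(v, [])
        free[v] = sum(best[c] for c in kids)
        best[v] = free[v]
        for c in kids:
            # Match v with c: c must be free below, other children keep their best.
            best[v] = max(best[v], free[v] - best[c] + weight[(v, c)] + free[c])
    return best[root]

# A chain 0-1-2-3: picking the two outer links beats picking the middle one.
chain = {0: [1], 1: [2], 2: [3]}
chain_weights = {(0, 1): 3, (1, 2): 4, (2, 3): 3}
total = max_weight_matching_on_tree(chain, chain_weights)
```

The iterative traversal avoids recursion limits on deep trees; a scheduler would additionally need to recover which links were selected, which this sketch omits.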

Networking and Internet Architecture, Information Theory

Heterogeneously-Distributed Joint Radar Communications: Bayesian Resource Allocation

259 - Linlong Wu, Kumar Vijay Mishra, Bhavani Shankar M. R. 2021

Due to spectrum scarcity, the coexistence of radar and wireless communication has gained substantial research interest recently. Among many scenarios, the heterogeneously-distributed joint radar-communication system is promising due to its flexibility and compatibility with existing architectures. In this paper, we focus on a heterogeneous radar and communication network (HRCN), which consists of various generic radars for multiple target tracking (MTT) and wireless communications for multiple users. We aim to improve the MTT performance and maintain good throughput levels for communication users by a well-designed resource allocation. The problem is formulated as a Bayesian Cramér-Rao bound (CRB) based minimization subject to resource budgets and throughput constraints. The formulated nonconvex problem is solved using an alternating descent-ascent approach. Numerical results demonstrate the efficacy of the proposed allocation scheme for this heterogeneous network.

Signal Processing, Information Theory

Distributed learning of deep neural network over multiple agents

99 - Otkrist Gupta, Ramesh Raskar 2018

In domains such as health care and finance, shortage of labeled data and computational resources is a critical issue while developing machine learning algorithms. To address the issue of labeled data scarcity in training and deployment of neural network-based systems, we propose a new technique to train deep neural networks over several data sources. Our method allows for deep neural networks to be trained using data from multiple entities in a distributed fashion. We evaluate our algorithm on existing datasets and show that it obtains performance which is similar to a regular neural network trained on a single machine. We further extend it to incorporate semi-supervised learning when training with few labeled samples, and analyze any security concerns that may arise. Our algorithm paves the way for distributed training of deep neural networks in data sensitive applications when raw data may not be shared directly.

Machine Learning

SURPRISE! and When to Schedule It

74 - Zhihuan Huang, Shengwei Xu, You Shan 2021

Information flow measures, over the duration of a game, the audience's belief of who will win, and thus can reflect the amount of surprise in a game. To quantify the relationship between information flow and the audience's perceived quality, we conduct a case study where subjects watch one of the world's biggest esports events, LOL S10. In addition to eliciting information flow, we also ask subjects to report their rating for each game. We find that the amount of surprise at the end of the game plays a dominant role in predicting the rating. This suggests the importance of incorporating when the surprise occurs, in addition to the amount of surprise, in perceived quality models. For content providers, it implies that everything else being equal, it is better for twists to be more likely to happen toward the end of a show rather than uniformly throughout.

Multiagent Systems, Physics and Society

When should agents explore?

141 - Miruna Pîslar, David Szepesvari, Georg Ostrovski 2021

Exploration remains a central challenge for reinforcement learning (RL). Virtually all existing methods share the feature of a monolithic behaviour policy that changes only gradually (at best). In contrast, the exploratory behaviours of animals and humans exhibit a rich diversity, namely including forms of switching between modes. This paper presents an initial study of mode-switching, non-monolithic exploration for RL. We investigate different modes to switch between, at what timescales it makes sense to switch, and what signals make for good switching triggers. We also propose practical algorithmic components that make the switching mechanism adaptive and robust, which enables flexibility without an accompanying hyper-parameter-tuning burden. Finally, we report a promising and detailed analysis on Atari, using two-mode exploration and switching at sub-episodic time-scales.

Machine Learning, Artificial Intelligence