
Distributed Learning in Ad-Hoc Networks: A Multi-player Multi-armed Bandit Framework

Published by Dr. Sumit Darak
Publication date: 2020
Research field: Informatics Engineering
Research language: English





Next-generation networks are expected to be ultra-dense, with a very high peak rate but relatively low expected traffic per user. In such a scenario, existing resource allocation based on a central controller may incur substantial signaling (control communications), which degrades quality of service (e.g., dropped calls) as well as energy and spectrum efficiency. To overcome this problem, cognitive ad-hoc networks (CAHN) that share spectrum with other networks are being envisioned. They allow some users to identify and communicate in "free" slots, thereby reducing the signaling load and allowing a higher number of users per base station (dense networks). Such networks open up many interesting challenges, such as resource identification, coordination, and dynamic and context-aware adaptation, for which machine learning and artificial intelligence frameworks offer novel solutions. In this paper, we discuss state-of-the-art multi-player multi-armed bandit based distributed learning algorithms that allow users to adapt to the environment and coordinate with other players/users. We also discuss various open research problems for the feasible realization of CAHN, and interesting applications in other domains such as energy harvesting, the Internet of Things, and smart grids.
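To make the multi-player multi-armed bandit setting concrete, the following is a minimal sketch under stated assumptions, not one of the algorithms surveyed in the paper. It assumes that each channel (arm) is free with a fixed, unknown probability, that every user independently runs the classical UCB1 index policy, and that users who pick the same channel in a slot collide and earn zero reward. The names `ucb1_choice`, `simulate`, and `free_prob` are hypothetical and chosen only for illustration.

```python
import math
import random


def ucb1_choice(counts, means, t, rng):
    """Pick an arm by the UCB1 index; untried arms are sampled first, at random."""
    untried = [arm for arm, n in enumerate(counts) if n == 0]
    if untried:
        # Random order of initial exploration keeps identical users
        # from staying synchronized and colliding in every slot.
        return rng.choice(untried)
    scores = [m + math.sqrt(2.0 * math.log(t) / n) for m, n in zip(means, counts)]
    return max(range(len(scores)), key=scores.__getitem__)


def simulate(free_prob, n_players, horizon, seed=0):
    """Run independent UCB1 users on shared channels (assumed collision model)."""
    rng = random.Random(seed)
    n_arms = len(free_prob)
    counts = [[0] * n_arms for _ in range(n_players)]
    means = [[0.0] * n_arms for _ in range(n_players)]
    total = 0.0
    for t in range(1, horizon + 1):
        picks = [ucb1_choice(counts[p], means[p], t, rng) for p in range(n_players)]
        free = [rng.random() < q for q in free_prob]
        for p, arm in enumerate(picks):
            alone = picks.count(arm) == 1
            # Assumption: reward is 1 only if the channel is free and uncontended.
            reward = 1.0 if (alone and free[arm]) else 0.0
            counts[p][arm] += 1
            means[p][arm] += (reward - means[p][arm]) / counts[p][arm]
            total += reward
    return total, counts


if __name__ == "__main__":
    total, counts = simulate([0.9, 0.8, 0.3], n_players=2, horizon=2000)
    print("total successful transmissions:", total)
    print("per-user channel counts:", counts)
```

Because collisions are only observed as zero reward and never explicitly resolved, this naive baseline can waste many slots when users contend for the same channel, which is the kind of coordination problem that dedicated multi-player bandit algorithms aim to address.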




Read also

We consider optimal resource allocation problems in an asynchronous wireless network setting. Without explicit model knowledge, we design an unsupervised learning method based on Aggregation Graph Neural Networks (Agg-GNNs). Depending on the localized aggregated information structure on each network node, the method can be learned globally and asynchronously while being implemented locally. We capture the asynchrony by modeling the activation pattern as a characteristic of each node and train a policy-based resource allocation method. We also propose a permutation invariance property which indicates the transferability of the trained Agg-GNN. Finally, we verify our strategy through numerical simulations, comparing it with baseline methods.
Setting up future Internet of Things (IoT) networks will require supporting more and more communicating devices. We prove that intelligent devices in unlicensed bands can use Multi-Armed Bandit (MAB) learning algorithms to improve resource exploitation. We evaluate the performance of two classical MAB learning algorithms, UCB1 and Thompson Sampling, in handling the decentralized decision-making of spectrum access in IoT networks, as well as their learning performance with a growing number of intelligent end-devices. We show that using learning algorithms does help to fit more devices in such networks, even when all end-devices are intelligent and are dynamically changing channels. In the studied scenario, stochastic MAB learning provides a gain of up to 16% in terms of successful transmission probability, and has near-optimal performance even in non-stationary and non-i.i.d. settings with a majority of intelligent devices.
This paper investigates learning-based caching in small-cell networks (SCNs) when user preference is unknown. The goal is to optimize the cache placement in each small base station (SBS) to minimize the long-term transmission delay of the system. We model this sequential multi-agent decision-making problem from a multi-agent multi-armed bandit (MAMAB) perspective. Rather than estimating user preference first and then optimizing the cache strategy, we propose several MAMAB-based algorithms to directly learn the cache strategy online in both stationary and non-stationary environments. In the stationary environment, we first propose two high-complexity agent-based collaborative MAMAB algorithms with performance guarantees. Then we propose a low-complexity distributed MAMAB which ignores the SBS coordination. To achieve a better balance between SBS coordination gain and computational complexity, we develop an edge-based collaborative MAMAB with a coordination-graph, edge-based reward assignment method. In the non-stationary environment, we modify the MAMAB-based algorithms proposed for the stationary environment by proposing a practical initialization method and designing new perturbed terms to adapt to the dynamic environment. Simulation results are provided to validate the effectiveness of our proposed algorithms. The effects of different parameters on caching performance are also discussed.
In this paper, we propose and evaluate a distributed protocol to manage trust diffusion in ad hoc networks. In this protocol, each node i maintains a trust value about another node j, which is computed both as a result of the exchanges with node j itself and as a function of the opinion that other nodes have about j. These two aspects are weighted, respectively, by a trust index that measures the trust the node has in its own experiences and by a trust index representing the trust the node has in the opinions of the other nodes. Simulations were carried out to validate the robustness of this protocol against three kinds of attacks: simple coalitions, Trojan attacks, and detonator attacks.
We study a decentralized channel allocation problem in an ad-hoc Internet of Things network underlaid on the spectrum licensed to a primary cellular network. In the considered network, the limited channel sensing/probing capability and computational resources of the IoT devices make it difficult for them to acquire detailed Channel State Information (CSI) for the shared multiple channels. In practice, the unknown patterns of the primary users' transmission activities and the time-varying CSI (e.g., due to small-scale fading or device mobility) also cause stochastic changes in the channel quality. Decentralized IoT links are thus expected to learn channel conditions online based on partial observations, while acquiring no information about the channels that they are not operating on. They also have to reach an efficient, collision-free channel allocation with limited coordination. Our study maps this problem into a contextual multi-player, multi-armed bandit game, and proposes a purely decentralized, three-stage policy learning algorithm based on trial and error. Theoretical analysis shows that the proposed scheme guarantees that the IoT links jointly converge to the socially optimal channel allocation with a sub-linear (i.e., polylogarithmic) regret with respect to the operational time. Simulations demonstrate that it strikes a good balance between efficiency and network scalability when compared with other state-of-the-art decentralized bandit algorithms.
