A Reinforcement Learning Approach to Age of Information in Multi-User Networks with HARQ


Published by Elif Tugce Ceran

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Elif Tugce Ceran, Deniz Gunduz, Andras Gyorgy


Scheduling the transmission of time-sensitive information from a source node to multiple users over error-prone communication channels is studied with the goal of minimizing the long-term average age of information (AoI) at the users. A long-term average resource constraint is imposed on the source, which limits the average number of transmissions. The source can transmit only to a single user at each time slot, and after each transmission, it receives instantaneous ACK/NACK feedback from the intended receiver, and decides when and to which user to transmit the next update. Assuming the channel statistics are known, the optimal scheduling policy is studied for both the standard automatic repeat request (ARQ) and hybrid ARQ (HARQ) protocols. Then, a reinforcement learning (RL) approach is introduced to find a near-optimal policy, which does not assume any a priori information on the random processes governing the channel states. Different RL methods including average-cost SARSA with linear function approximation (LFA), upper confidence reinforcement learning (UCRL2), and deep Q-network (DQN) are applied and compared through numerical simulations.
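The AoI dynamics described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: it assumes that a user's age resets to 1 on an ACK and otherwise grows by 1 per slot, it models only standard ARQ with a fixed error probability (HARQ, where the error probability would depend on the retransmission count, is left out), and it uses a hypothetical "serve the oldest user" rule with a random transmit probability `tx_prob` to keep the average number of transmissions bounded. All function and parameter names are illustrative.

```python
import random

def step_aoi(ages, user, success):
    """Advance every user's age by one slot; reset the served user's age on an ACK.

    `user` is None when the source stays idle in this slot.
    """
    new_ages = [a + 1 for a in ages]
    if user is not None and success:
        new_ages[user] = 1  # assumed convention: a fresh update has age 1
    return new_ages

def simulate(num_users, error_prob, tx_prob, num_slots, seed=0):
    """Return (average AoI per user, average transmissions per slot).

    Hypothetical policy: with probability `tx_prob` transmit to the user
    with the largest age, otherwise stay idle.
    """
    rng = random.Random(seed)
    ages = [1] * num_users
    total_age = 0
    transmissions = 0
    for _ in range(num_slots):
        user = None
        success = False
        if rng.random() < tx_prob:
            user = max(range(num_users), key=lambda i: ages[i])
            success = rng.random() >= error_prob  # ACK if the packet is decoded
            transmissions += 1
        ages = step_aoi(ages, user, success)
        total_age += sum(ages)
    return total_age / (num_slots * num_users), transmissions / num_slots
```

Calling `simulate(3, 0.3, 0.5, 100000)` gives an estimate of the average AoI and the realized transmission rate for three users; an RL agent would replace the fixed rule with a learned mapping from ages (and feedback history) to actions.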


293 - Mohamed A. Abd-Elmagid, Aidin Ferdowsi, Harpreet S. Dhillon 2019

Unmanned aerial vehicles (UAVs) are expected to be a key component of the next-generation wireless systems. Due to their deployment flexibility, UAVs are being considered as an efficient solution for collecting data from ground nodes and transmitting it wirelessly to the network. In this paper, a UAV-assisted wireless network is studied, in which energy-constrained ground nodes are deployed to observe different physical processes. In this network, a UAV, whose operation time is constrained by its limited battery, moves towards the ground nodes to receive status update packets about their observed processes. The flight trajectory of the UAV and scheduling of status update packets are jointly optimized with the objective of achieving the minimum weighted sum for the age-of-information (AoI) values of different processes at the UAV, referred to as weighted sum-AoI. The problem is modeled as a finite-horizon Markov decision process (MDP) with finite state and action spaces. Since the state space is extremely large, a deep reinforcement learning (RL) algorithm is proposed to obtain the optimal policy that minimizes the weighted sum-AoI, referred to as the age-optimal policy. Several simulation scenarios are considered to showcase the convergence of the proposed deep RL algorithm. Moreover, the results demonstrate that the proposed deep RL approach can significantly improve the achievable sum-AoI per process compared to the baseline policies, such as the distance-based and random walk policies. The impact of various system design parameters on the optimal achievable sum-AoI per process is also shown through extensive simulations.

Information Theory, Networking and Internet Architecture

A Reinforcement Learning Approach for Scheduling in mmWave Networks

241 - Mine Gokce Dogan, Yahya H. Ezzeldin, Christina Fragouli 2021

We consider a source that wishes to communicate with a destination at a desired rate, over a mmWave network where links are subject to blockage and nodes to failure (e.g., in a hostile military environment). To achieve resilience to link and node failures, we here explore a state-of-the-art Soft Actor-Critic (SAC) deep reinforcement learning algorithm, that adapts the information flow through the network, without using knowledge of the link capacities or network topology. Numerical evaluations show that our algorithm can achieve the desired rate even in dynamic environments and it is robust against blockage.

Information Theory, Machine Learning

Distributed Reinforcement Learning for Age of Information Minimization in Real-Time IoT Systems

119 - Sihua Wang, Mingzhe Chen, Zhaohui Yang 2021

In this paper, the problem of minimizing the weighted sum of age of information (AoI) and total energy consumption of Internet of Things (IoT) devices is studied. In the considered model, each IoT device monitors a physical process that follows nonlinear dynamics. As the dynamics of the physical process vary over time, each device must find an optimal sampling frequency to sample the real-time dynamics of the physical system and send sampled information to a base station (BS). Due to limited wireless resources, the BS can only select a subset of devices to transmit their sampled information. Thus, edge devices must cooperatively sample their monitored dynamics based on the local observations and the BS must collect the sampled information from the devices immediately, hence avoiding the additional time and energy used for sampling and information transmission. To this end, it is necessary to jointly optimize the sampling policy of each device and the device selection scheme of the BS so as to accurately monitor the dynamics of the physical process using minimum energy. This problem is formulated as an optimization problem whose goal is to minimize the weighted sum of AoI cost and energy consumption. To solve this problem, we propose a novel distributed reinforcement learning (RL) approach for the sampling policy optimization. The proposed algorithm enables edge devices to cooperatively find the global optimal sampling policy using their own local observations. Given the sampling policy, the device selection scheme can be optimized thus minimizing the weighted sum of AoI and energy consumption of all devices. Simulations with real data of PM 2.5 pollution show that the proposed algorithm can reduce the sum of AoI by up to 17.8% and 33.9% and the total energy consumption by up to 13.2% and 35.1%, compared to a conventional deep Q network method and a uniform sampling policy.

Information Theory, Machine Learning

Deep Reinforcement Learning for IoT Networks: Age of Information and Energy Cost Tradeoff

441 - Xiongwei Wu, Xiuhua Li, Jun Li 2020

In most Internet of Things (IoT) networks, edge nodes are commonly used as relays to cache sensing data generated by IoT sensors as well as provide communication services for data consumers. However, a critical issue of IoT sensing is that data are usually transient, which necessitates temporal updates of caching content items, while frequent cache updates could lead to considerable energy cost and challenge the lifetime of IoT sensors. To address this issue, we adopt the Age of Information (AoI) to quantify data freshness and propose an online cache update scheme to obtain an effective tradeoff between the average AoI and energy cost. Specifically, we first develop a characterization of transmission energy consumption at IoT sensors by incorporating a successful transmission condition. Then, we model cache updating as a Markov decision process to minimize average weighted cost with judicious definitions of state, action, and reward. Since user preference towards content items is usually unknown and often temporally evolving, we develop a deep reinforcement learning (DRL) algorithm to enable intelligent cache updates. Through trial-and-error explorations, an effective caching policy can be learned without requiring exact knowledge of content popularity. Simulation results demonstrate the superiority of the proposed framework.

Information Theory
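The AoI versus energy tradeoff in the cache-update abstract above can be made concrete with a toy one-step model in Python. This is only a sketch under stated assumptions: the cost form (mean AoI plus `weight` times energy), the reset-to-1 convention, and every name here (`cache_step`, `energy_per_update`, `weight`) are hypothetical and not taken from the paper.

```python
def cache_step(ages, update, energy_per_update, weight):
    """One slot of a toy cache-update model.

    ages: current AoI of each cached item.
    update: set of item indices refreshed in this slot.
    Returns (next_ages, cost), where the assumed cost is
    mean AoI after the slot + weight * energy spent on updates.
    """
    # Refreshed items restart at age 1; all others grow one slot older.
    next_ages = [1 if i in update else a + 1 for i, a in enumerate(ages)]
    energy = energy_per_update * len(update)
    cost = sum(next_ages) / len(next_ages) + weight * energy
    return next_ages, cost
```

For example, `cache_step([3, 4], {1}, 2.0, 0.5)` refreshes only the second item; a larger `weight` makes updates more expensive relative to staleness, which is the tradeoff a learned policy would have to balance.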

A Regression Approach to Certain Information Transmission Problems

88 - Wenyi Zhang, Yizhu Wang, Cong Shen 2019

A general information transmission model, under independent and identically distributed Gaussian codebook and nearest neighbor decoding rule with processed channel output, is investigated using the performance metric of generalized mutual information. When the encoder and the decoder know the statistical channel model, it is found that the optimal channel output processing function is the conditional expectation operator, thus hinting at a potential role of regression, a classical topic in machine learning, for this model. Without utilizing the statistical channel model, a problem formulation inspired by machine learning principles is established, with suitable performance metrics introduced. A data-driven inference algorithm is proposed to solve the problem, and the effectiveness of the algorithm is validated via numerical experiments. Extensions to more general information transmission models are also discussed.

Information Theory, Machine Learning
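The link between conditional expectation and regression mentioned in the abstract above can be illustrated with a small Python sketch. It assumes a scalar additive Gaussian noise channel Y = X + Z with X ~ N(0, 1) and Z ~ N(0, sigma^2), and takes the conditional expectation to be that of the input given the output, E[X | Y = y] = y / (1 + sigma^2). The channel, the through-the-origin least-squares fit, and all names are illustrative assumptions, not the paper's algorithm.

```python
import random

def fit_slope(xs, ys):
    """Least-squares slope for predicting x from y through the origin."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(y * y for y in ys)
    return num / den

def demo(noise_std=1.0, n=20000, seed=0):
    """Compare a slope learned from samples with the known conditional mean."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]          # channel inputs
    ys = [x + rng.gauss(0.0, noise_std) for x in xs]      # noisy channel outputs
    learned = fit_slope(xs, ys)
    exact = 1.0 / (1.0 + noise_std ** 2)  # slope of E[X | Y = y] for this channel
    return learned, exact
```

With enough samples the learned slope should approach the exact one, which shows how a data-driven regression could stand in for the conditional expectation when the channel statistics are not known.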