ترغب بنشر مسار تعليمي؟ اضغط هنا

Deep Reinforcement Learning for Minimizing Age-of-Information in UAV-assisted Networks

294   0   0.0 ( 0 )
 نشر من قبل Mohamed A. Abd-Elmagid
 تاريخ النشر 2019
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Unmanned aerial vehicles (UAVs) are expected to be a key component of the next-generation wireless systems. Due to their deployment flexibility, UAVs are being considered as an efficient solution for collecting information data from ground nodes and transmitting it wirelessly to the network. In this paper, a UAV-assisted wireless network is studied, in which energy-constrained ground nodes are deployed to observe different physical processes. In this network, a UAV that has a time constraint for its operation due to its limited battery, moves towards the ground nodes to receive status update packets about their observed processes. The flight trajectory of the UAV and scheduling of status update packets are jointly optimized with the objective of achieving the minimum weighted sum for the age-of-information (AoI) values of different processes at the UAV, referred to as weighted sum-AoI. The problem is modeled as a finite-horizon Markov decision process (MDP) with finite state and action spaces. Since the state space is extremely large, a deep reinforcement learning (RL) algorithm is proposed to obtain the optimal policy that minimizes the weighted sum-AoI, referred to as the age-optimal policy. Several simulation scenarios are considered to showcase the convergence of the proposed deep RL algorithm. Moreover, the results also demonstrate that the proposed deep RL approach can significantly improve the achievable sum-AoI per process compared to the baseline policies, such as the distance-based and random walk policies. The impact of various system design parameters on the optimal achievable sum-AoI per process is also shown through extensive simulations.

قيم البحث

اقرأ أيضاً

197 - Mengjie Yi , Xijun Wang , Juan Liu 2020
Due to the flexibility and low operational cost, dispatching unmanned aerial vehicles (UAVs) to collect information from distributed sensors is expected to be a promising solution in Internet of Things (IoT), especially for time-critical applications . How to maintain the information freshness is a challenging issue. In this paper, we investigate the fresh data collection problem in UAV-assisted IoT networks. Particularly, the UAV flies towards the sensors to collect status update packets within a given duration while maintaining a non-negative residual energy. We formulate a Markov Decision Process (MDP) to find the optimal flight trajectory of the UAV and transmission scheduling of the sensors that minimizes the weighted sum of the age of information (AoI). A UAV-assisted data collection algorithm based on deep reinforcement learning (DRL) is further proposed to overcome the curse of dimensionality. Extensive simulation results demonstrate that the proposed DRL-based algorithm can significantly reduce the weighted sum of the AoI compared to other baseline algorithms.
In this paper, an unmanned aerial vehicle (UAV)-assisted wireless network is considered in which a battery-constrained UAV is assumed to move towards energy-constrained ground nodes to receive status updates about their observed processes. The UAVs f light trajectory and scheduling of status updates are jointly optimized with the objective of minimizing the normalized weighted sum of Age of Information (NWAoI) values for different physical processes at the UAV. The problem is first formulated as a mixed-integer program. Then, for a given scheduling policy, a convex optimization-based solution is proposed to derive the UAVs optimal flight trajectory and time instants on updates. However, finding the optimal scheduling policy is challenging due to the combinatorial nature of the formulated problem. Therefore, to complement the proposed convex optimization-based solution, a finite-horizon Markov decision process (MDP) is used to find the optimal scheduling policy. Since the state space of the MDP is extremely large, a novel neural combinatorial-based deep reinforcement learning (NCRL) algorithm using deep Q-network (DQN) is proposed to obtain the optimal policy. However, for large-scale scenarios with numerous nodes, the DQN architecture cannot efficiently learn the optimal scheduling policy anymore. Motivated by this, a long short-term memory (LSTM)-based autoencoder is proposed to map the state space to a fixed-size vector representation in such large-scale scenarios. A lower bound on the minimum NWAoI is analytically derived which provides system design guidelines on the appropriate choice of importance weights for different nodes. The numerical results also demonstrate that the proposed NCRL approach can significantly improve the achievable NWAoI per process compared to the baseline policies, such as weight-based and discretized state DQN policies.
Timeliness is an emerging requirement for many Internet of Things (IoT) applications. In IoT networks, where a large-number of nodes are distributed, severe interference may incur during the transmission phase which causes age of information (AoI) de gradation. It is therefore important to study the performance limit of AoI as well as how to achieve such limit. In this paper, we aim to optimize the AoI in random access Poisson networks. By taking into account the spatio-temporal interactions amongst the transmitters, an expression of the peak AoI is derived, based on explicit expressions of the optimal peak AoI and the corresponding optimal system parameters including the packet arrival rate and the channel access probability are further derived. It is shown that with a given packet arrival rate (resp. a given channel access probability), the optimal channel access probability (resp. the optimal packet arrival rate), is equal to one under a small node deployment density, and decrease monotonically as the spatial deployment density increases due to the severe interference caused by spatio-temproal coupling between transmitters. When joint tuning of the packet arrival rate and channel access probability is performed, the optimal channel access probability is always set to be one. Moreover, with the sole tuning of the channel access probability, it is found that the optimal peak AoI performance can be improved with a smaller packet arrival rate only when the node deployment density is high, which is contrast to the case of the sole tuning of the packet arrival rate, where a higher channel access probability always leads to better optimal peak AoI regardless of the node deployment density. In all the cases of optimal tuning of system parameters, the optimal peak AoI linearly grows with the node deployment density as opposed to an exponential growth with fixed system parameters.
495 - Zhiyuan Jiang 2020
In this paper, we adopt the fluid limits to analyze Age of Information (AoI) in a wireless multiaccess network with many users. We consider the case wherein users have heterogeneous i.i.d. channel conditions and the statuses are generate-at-will. Con vergence of the AoI occupancy measure to the fluid limit, represented by a Partial Derivative Equation (PDE), is proved within an approximation error inversely proportional to the number of users. Global convergence to the equilibrium of the PDE, i.e., stationary AoI distribution, is also proved. Based on this framework, it is shown that an existing AoI lower bound in the literature is in fact asymptotically tight, and a simple threshold policy, with the thresholds explicitly derived, achieves the optimum asymptotically. The proposed threshold-based policy is also much easier to decentralize than the widely-known index-based policies which require comparing user indices. To showcase the usability of the framework, we also use it to analyze the average non-linear AoI functions (with power and logarithm forms) in wireless networks. Again, explicit optimal threshold-based policies are derived, and average age functions proven. Simulation results show that even when the number of users is limited, e.g., $10$, the proposed policy and analysis are still effective.
441 - Xiongwei Wu , Xiuhua Li , Jun Li 2020
In most Internet of Things (IoT) networks, edge nodes are commonly used as to relays to cache sensing data generated by IoT sensors as well as provide communication services for data consumers. However, a critical issue of IoT sensing is that data ar e usually transient, which necessitates temporal updates of caching content items while frequent cache updates could lead to considerable energy cost and challenge the lifetime of IoT sensors. To address this issue, we adopt the Age of Information (AoI) to quantify data freshness and propose an online cache update scheme to obtain an effective tradeoff between the average AoI and energy cost. Specifically, we first develop a characterization of transmission energy consumption at IoT sensors by incorporating a successful transmission condition. Then, we model cache updating as a Markov decision process to minimize average weighted cost with judicious definitions of state, action, and reward. Since user preference towards content items is usually unknown and often temporally evolving, we therefore develop a deep reinforcement learning (DRL) algorithm to enable intelligent cache updates. Through trial-and-error explorations, an effective caching policy can be learned without requiring exact knowledge of content popularity. Simulation results demonstrate the superiority of the proposed framework.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا