ترغب بنشر مسار تعليمي؟ اضغط هنا

Deep Reinforcement Learning for IoT Networks: Age of Information and Energy Cost Tradeoff

442   0   0.0 ( 0 )
 نشر من قبل Xiongwei Wu
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In most Internet of Things (IoT) networks, edge nodes are commonly used as to relays to cache sensing data generated by IoT sensors as well as provide communication services for data consumers. However, a critical issue of IoT sensing is that data are usually transient, which necessitates temporal updates of caching content items while frequent cache updates could lead to considerable energy cost and challenge the lifetime of IoT sensors. To address this issue, we adopt the Age of Information (AoI) to quantify data freshness and propose an online cache update scheme to obtain an effective tradeoff between the average AoI and energy cost. Specifically, we first develop a characterization of transmission energy consumption at IoT sensors by incorporating a successful transmission condition. Then, we model cache updating as a Markov decision process to minimize average weighted cost with judicious definitions of state, action, and reward. Since user preference towards content items is usually unknown and often temporally evolving, we therefore develop a deep reinforcement learning (DRL) algorithm to enable intelligent cache updates. Through trial-and-error explorations, an effective caching policy can be learned without requiring exact knowledge of content popularity. Simulation results demonstrate the superiority of the proposed framework.



قيم البحث

اقرأ أيضاً

Unmanned aerial vehicles (UAVs) are expected to be a key component of the next-generation wireless systems. Due to their deployment flexibility, UAVs are being considered as an efficient solution for collecting information data from ground nodes and transmitting it wirelessly to the network. In this paper, a UAV-assisted wireless network is studied, in which energy-constrained ground nodes are deployed to observe different physical processes. In this network, a UAV that has a time constraint for its operation due to its limited battery, moves towards the ground nodes to receive status update packets about their observed processes. The flight trajectory of the UAV and scheduling of status update packets are jointly optimized with the objective of achieving the minimum weighted sum for the age-of-information (AoI) values of different processes at the UAV, referred to as weighted sum-AoI. The problem is modeled as a finite-horizon Markov decision process (MDP) with finite state and action spaces. Since the state space is extremely large, a deep reinforcement learning (RL) algorithm is proposed to obtain the optimal policy that minimizes the weighted sum-AoI, referred to as the age-optimal policy. Several simulation scenarios are considered to showcase the convergence of the proposed deep RL algorithm. Moreover, the results also demonstrate that the proposed deep RL approach can significantly improve the achievable sum-AoI per process compared to the baseline policies, such as the distance-based and random walk policies. The impact of various system design parameters on the optimal achievable sum-AoI per process is also shown through extensive simulations.
In this paper, the problem of minimizing the weighted sum of age of information (AoI) and total energy consumption of Internet of Things (IoT) devices is studied. In the considered model, each IoT device monitors a physical process that follows nonli near dynamics. As the dynamics of the physical process vary over time, each device must find an optimal sampling frequency to sample the real-time dynamics of the physical system and send sampled information to a base station (BS). Due to limited wireless resources, the BS can only select a subset of devices to transmit their sampled information. Thus, edge devices must cooperatively sample their monitored dynamics based on the local observations and the BS must collect the sampled information from the devices immediately, hence avoiding the additional time and energy used for sampling and information transmission. To this end, it is necessary to jointly optimize the sampling policy of each device and the device selection scheme of the BS so as to accurately monitor the dynamics of the physical process using minimum energy. This problem is formulated as an optimization problem whose goal is to minimize the weighted sum of AoI cost and energy consumption. To solve this problem, we propose a novel distributed reinforcement learning (RL) approach for the sampling policy optimization. The proposed algorithm enables edge devices to cooperatively find the global optimal sampling policy using their own local observations. Given the sampling policy, the device selection scheme can be optimized thus minimizing the weighted sum of AoI and energy consumption of all devices. Simulations with real data of PM 2.5 pollution show that the proposed algorithm can reduce the sum of AoI by up to 17.8% and 33.9% and the total energy consumption by up to 13.2% and 35.1%, compared to a conventional deep Q network method and a uniform sampling policy.
In this paper, an unmanned aerial vehicle (UAV)-assisted wireless network is considered in which a battery-constrained UAV is assumed to move towards energy-constrained ground nodes to receive status updates about their observed processes. The UAVs f light trajectory and scheduling of status updates are jointly optimized with the objective of minimizing the normalized weighted sum of Age of Information (NWAoI) values for different physical processes at the UAV. The problem is first formulated as a mixed-integer program. Then, for a given scheduling policy, a convex optimization-based solution is proposed to derive the UAVs optimal flight trajectory and time instants on updates. However, finding the optimal scheduling policy is challenging due to the combinatorial nature of the formulated problem. Therefore, to complement the proposed convex optimization-based solution, a finite-horizon Markov decision process (MDP) is used to find the optimal scheduling policy. Since the state space of the MDP is extremely large, a novel neural combinatorial-based deep reinforcement learning (NCRL) algorithm using deep Q-network (DQN) is proposed to obtain the optimal policy. However, for large-scale scenarios with numerous nodes, the DQN architecture cannot efficiently learn the optimal scheduling policy anymore. Motivated by this, a long short-term memory (LSTM)-based autoencoder is proposed to map the state space to a fixed-size vector representation in such large-scale scenarios. A lower bound on the minimum NWAoI is analytically derived which provides system design guidelines on the appropriate choice of importance weights for different nodes. The numerical results also demonstrate that the proposed NCRL approach can significantly improve the achievable NWAoI per process compared to the baseline policies, such as weight-based and discretized state DQN policies.
224 - Jie Gong , Xiang Chen , Xiao Ma 2018
Age-of-information is a novel performance metric in communication systems to indicate the freshness of the latest received data, which has wide applications in monitoring and control scenarios. Another important performance metric in these applicatio ns is energy consumption, since monitors or sensors are usually energy constrained. In this paper, we study the energy-age tradeoff in a status update system where data transmission from a source to a receiver may encounter failure due to channel error. As the status sensing process consumes energy, when a transmission failure happens, the source may either retransmit the existing data to save energy for sensing, or sense and transmit a new update to minimize age-of-information. A threshold-based retransmission policy is considered where each update is allowed to be transmitted no more than M times. Closed-form average age-of-information and energy consumption is derived and expressed as a function of channel failure probability and maximum number of retransmissions M. Numerical simulations validate our analytical results, and illustrate the tradeoff between average age-of-information and energy consumption.
197 - Mengjie Yi , Xijun Wang , Juan Liu 2020
Due to the flexibility and low operational cost, dispatching unmanned aerial vehicles (UAVs) to collect information from distributed sensors is expected to be a promising solution in Internet of Things (IoT), especially for time-critical applications . How to maintain the information freshness is a challenging issue. In this paper, we investigate the fresh data collection problem in UAV-assisted IoT networks. Particularly, the UAV flies towards the sensors to collect status update packets within a given duration while maintaining a non-negative residual energy. We formulate a Markov Decision Process (MDP) to find the optimal flight trajectory of the UAV and transmission scheduling of the sensors that minimizes the weighted sum of the age of information (AoI). A UAV-assisted data collection algorithm based on deep reinforcement learning (DRL) is further proposed to overcome the curse of dimensionality. Extensive simulation results demonstrate that the proposed DRL-based algorithm can significantly reduce the weighted sum of the AoI compared to other baseline algorithms.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا