
Trajectory Design for UAV-Based Internet-of-Things Data Collection: A Deep Reinforcement Learning Approach

Published by Zhen Gao
Publication date: 2021
Language: English





In this paper, we investigate an unmanned aerial vehicle (UAV)-assisted Internet-of-Things (IoT) system in a sophisticated three-dimensional (3D) environment, where the UAV's trajectory is optimized to efficiently collect data from multiple IoT ground nodes. Unlike existing approaches, which consider only a simplified two-dimensional scenario and assume perfect channel state information (CSI), this paper considers a practical 3D urban environment with imperfect CSI, where the UAV's trajectory is designed to minimize the data collection completion time subject to practical throughput and flight movement constraints. Specifically, inspired by state-of-the-art deep reinforcement learning approaches, we leverage the twin-delayed deep deterministic policy gradient (TD3) to design the UAV's trajectory and present a TD3-based trajectory design for completion time minimization (TD3-TDCTM) algorithm. In particular, we introduce an additional piece of information, i.e., the merged pheromone, to represent the state information of the UAV and the environment and to serve as a reference for the reward, which facilitates the algorithm design. By taking the service statuses of the IoT nodes, the UAV's position, and the merged pheromone as input, the proposed algorithm can continuously and adaptively learn how to adjust the UAV's movement strategy. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can achieve a near-optimal navigation strategy. Our simulation results show the superiority of the proposed TD3-TDCTM algorithm over three conventional non-learning-based baseline methods.
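For readers unfamiliar with TD3, the following is a minimal sketch of the critic target that generic TD3 uses (target policy smoothing plus the minimum of two target critics). It is not the authors' TD3-TDCTM implementation: the function name `td3_target`, the callables `target_actor`, `target_q1`, `target_q2`, and all default hyperparameters are illustrative assumptions, and the pheromone-based reward from the abstract is not modeled here.

```python
import random


def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def td3_target(reward, next_state, done, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Generic TD3 critic target for a single transition (hypothetical helper).

    target_actor(state) -> list of floats (action)
    target_q1/target_q2(state, action) -> float (Q-value estimate)
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action,
    # then clip the result to the valid action range.
    next_action = [
        clip(a + clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip),
             action_low, action_high)
        for a in target_actor(next_state)
    ]
    # Clipped double Q-learning: take the smaller of the two critic estimates
    # to reduce overestimation bias.
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    # No bootstrapping from terminal states.
    return reward + gamma * (0.0 if done else 1.0) * q_min
```

In a full TD3 loop, both critics would be regressed toward this target, while the actor and the target networks are updated less frequently than the critics (the "delayed" part of the name).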


Read also

Mengjie Yi, Xijun Wang, Juan Liu, 2020
Owing to their flexibility and low operational cost, dispatching unmanned aerial vehicles (UAVs) to collect information from distributed sensors is expected to be a promising solution in the Internet of Things (IoT), especially for time-critical applications. How to maintain information freshness is a challenging issue. In this paper, we investigate the fresh data collection problem in UAV-assisted IoT networks. In particular, the UAV flies towards the sensors to collect status update packets within a given duration while maintaining a non-negative residual energy. We formulate a Markov Decision Process (MDP) to find the optimal flight trajectory of the UAV and transmission scheduling of the sensors that minimizes the weighted sum of the age of information (AoI). A UAV-assisted data collection algorithm based on deep reinforcement learning (DRL) is further proposed to overcome the curse of dimensionality. Extensive simulation results demonstrate that the proposed DRL-based algorithm can significantly reduce the weighted sum of the AoI compared to other baseline algorithms.
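To make the weighted-sum AoI objective concrete, here is a small sketch under a common slotted-time convention. The abstract does not state the exact AoI dynamics, so the reset-to-1 rule and the helper names `step_aoi` and `weighted_aoi` are assumptions for illustration only.

```python
def step_aoi(aoi, collected):
    """Advance every sensor's age of information by one time slot.

    Assumed convention: a sensor whose update is collected in this slot
    has its AoI reset to 1; all other sensors age by one slot.
    """
    return [1 if got else age + 1 for age, got in zip(aoi, collected)]


def weighted_aoi(aoi, weights):
    """Weighted sum of per-sensor AoI, the quantity to be minimized."""
    return sum(w * a for w, a in zip(weights, aoi))
```

For example, with ages `[3, 5, 1]` and only the second sensor served, `step_aoi` yields `[4, 1, 2]`; an agent minimizing `weighted_aoi` is pushed to serve heavily weighted or stale sensors first.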
Yao Tang, Man Hon Cheung, 2019
Unmanned aerial vehicles (UAVs) can enhance the performance of cellular networks, due to their high mobility and efficient deployment. In this paper, we present a first study on how user mobility affects the UAVs' trajectories in a multiple-UAV assisted wireless communication system. Specifically, we consider UAVs deployed as aerial base stations to serve ground users who move between different regions. We maximize the throughput of ground users in the downlink communication by optimizing the UAVs' trajectories, while taking into account the impact of user mobility, propulsion energy consumption, and the UAVs' mutual interference. We formulate the problem as a route selection problem in an acyclic directed graph. Each vertex represents a task associated with a reward on the average user throughput in a region-time point, while each edge is associated with a cost on the propulsion energy consumption during flying and hovering. For the centralized trajectory design, we first propose the shortest path scheme that determines the optimal trajectory for the single UAV case. We also propose the centralized route selection (CRS) scheme to systematically compute the optimal trajectories for the more general multiple-UAV case. Due to the NP-hardness of the centralized problem, we consider the distributed trajectory design, in which each UAV selects its trajectory autonomously, and propose the distributed route selection (DRS) scheme, which will converge to a pure strategy Nash equilibrium within a finite number of iterations.
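The single-UAV route selection described above can be illustrated with dynamic programming over a topological order of the graph. This is a sketch, not the paper's scheme: it assumes the objective is total vertex reward minus total edge cost, and the names `best_route`, `edges`, and `reward` are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def best_route(edges, reward, source, sink):
    """Find the route from source to sink maximizing reward minus cost.

    edges:  {(u, v): cost} for each directed edge of an acyclic graph
    reward: {node: reward}; missing nodes are treated as reward 0
    Returns (net_value, path) or None if sink is unreachable.
    """
    # Build a node -> predecessors map, as TopologicalSorter expects.
    preds = {}
    for (u, v) in edges:
        preds.setdefault(v, set()).add(u)
        preds.setdefault(u, set())

    best = {source: reward.get(source, 0.0)}
    parent = {}
    # Visit nodes so that every predecessor is finalized before its successors.
    for v in TopologicalSorter(preds).static_order():
        for u in preds[v]:
            if u in best:
                cand = best[u] - edges[(u, v)] + reward.get(v, 0.0)
                if v not in best or cand > best[v]:
                    best[v] = cand
                    parent[v] = u

    if sink not in best:
        return None
    # Walk parent pointers back from the sink to recover the route.
    path = [sink]
    while path[-1] != source:
        path.append(parent[path[-1]])
    return best[sink], path[::-1]
```

Because the graph is acyclic, maximizing net value is equivalent to a shortest-path problem with edge weights equal to cost minus the reward of the destination vertex, which is why a single pass in topological order suffices.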
Yuwei Huang, Xiaopeng Mo, Jie Xu, 2019
This paper considers an unmanned aerial vehicle-enabled uplink non-orthogonal multiple-access system, where multiple mobile users on the ground send independent messages to an unmanned aerial vehicle in the sky via non-orthogonal multiple-access transmission. Our objective is to design the unmanned aerial vehicle's dynamic maneuver to maximize the sum-rate throughput of all mobile ground users over a finite time horizon.
Unmanned aerial vehicles (UAVs) are now beginning to be deployed for enhancing the network performance and coverage in wireless communication. However, due to the limitation of their on-board power and flight time, it is challenging to obtain an optimal resource allocation scheme for the UAV-assisted Internet of Things (IoT). In this paper, we design a new UAV-assisted IoT system relying on the shortest flight path of the UAVs while maximising the amount of data collected from IoT devices. Then, a deep reinforcement learning-based technique is conceived for finding the optimal trajectory and throughput in a specific coverage area. After training, the UAV has the ability to autonomously collect all the data from user nodes at a significant total sum-rate improvement while minimising the associated resources used. Numerical results are provided to highlight how our techniques strike a balance between the throughput attained, trajectory, and the time spent. More explicitly, we characterise the attainable performance in terms of the UAV trajectory, the expected reward and the total sum-rate.
Q. Liu, L. Shi, L. Sun, 2020
In this letter, we study an unmanned aerial vehicle (UAV)-mounted mobile edge computing network, where the UAV executes computational tasks offloaded from mobile terminal users (TUs) and the motion of each TU follows a Gauss-Markov random model. To ensure the quality-of-service (QoS) of each TU, the UAV with limited energy dynamically plans its trajectory according to the locations of mobile TUs. To this end, we formulate the problem as a Markov decision process, wherein the UAV trajectory and UAV-TU association are modeled as the parameters to be optimized. To maximize the system reward and meet the QoS constraint, we develop a QoS-based action selection policy in the proposed algorithm based on double deep Q-network. Simulations show that the proposed algorithm converges more quickly and achieves a higher sum throughput than conventional algorithms.
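As background on the double deep Q-network mentioned above, the sketch below computes the standard double DQN learning target for one transition: the online network selects the next action and the target network evaluates it. The function name and arguments are illustrative assumptions, and the paper's QoS-based action selection policy is not reproduced here.

```python
def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Standard double DQN target for a single transition (hypothetical helper).

    q_online_next: Q-values of the online network for each action in s'
    q_target_next: Q-values of the target network for each action in s'
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # Action selection uses the online network...
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    # ...while action evaluation uses the target network.
    return reward + gamma * q_target_next[best_action]
```

Decoupling selection from evaluation in this way is what distinguishes double DQN from plain DQN, which would use the maximum of `q_target_next` directly and tends to overestimate action values.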