No Arabic abstract
Unmanned aerial vehicles (UAVs) are now beginning to be deployed for enhancing the network performance and coverage in wireless communication. However, due to the limitation of their on-board power and flight time, it is challenging to obtain an optimal resource allocation scheme for the UAV-assisted Internet of Things (IoT). In this paper, we design a new UAV-assisted IoT systems relying on the shortest flight path of the UAVs while maximising the amount of data collected from IoT devices. Then, a deep reinforcement learning-based technique is conceived for finding the optimal trajectory and throughput in a specific coverage area. After training, the UAV has the ability to autonomously collect all the data from user nodes at a significant total sum-rate improvement while minimising the associated resources used. Numerical results are provided to highlight how our techniques strike a balance between the throughput attained, trajectory, and the time spent. More explicitly, we characterise the attainable performance in terms of the UAV trajectory, the expected reward and the total sum-rate.
In this paper, we investigate an unmanned aerial vehicle (UAV)-assisted Internet-of-Things (IoT) system in a sophisticated three-dimensional (3D) environment, where the UAVs trajectory is optimized to efficiently collect data from multiple IoT ground nodes. Unlike existing approaches focusing only on a simplified two-dimensional scenario and the availability of perfect channel state information (CSI), this paper considers a practical 3D urban environment with imperfect CSI, where the UAVs trajectory is designed to minimize data collection completion time subject to practical throughput and flight movement constraints. Specifically, inspired from the state-of-the-art deep reinforcement learning approaches, we leverage the twin-delayed deep deterministic policy gradient (TD3) to design the UAVs trajectory and present a TD3-based trajectory design for completion time minimization (TD3-TDCTM) algorithm. In particular, we set an additional information, i.e., the merged pheromone, to represent the state information of UAV and environment as a reference of reward which facilitates the algorithm design. By taking the service statuses of IoT nodes, the UAVs position, and the merged pheromone as input, the proposed algorithm can continuously and adaptively learn how to adjust the UAVs movement strategy. By interacting with the external environment in the corresponding Markov decision process, the proposed algorithm can achieve a near-optimal navigation strategy. Our simulation results show the superiority of the proposed TD3-TDCTM algorithm over three conventional non-learning based baseline methods.
Due to the flexibility and low operational cost, dispatching unmanned aerial vehicles (UAVs) to collect information from distributed sensors is expected to be a promising solution in Internet of Things (IoT), especially for time-critical applications. How to maintain the information freshness is a challenging issue. In this paper, we investigate the fresh data collection problem in UAV-assisted IoT networks. Particularly, the UAV flies towards the sensors to collect status update packets within a given duration while maintaining a non-negative residual energy. We formulate a Markov Decision Process (MDP) to find the optimal flight trajectory of the UAV and transmission scheduling of the sensors that minimizes the weighted sum of the age of information (AoI). A UAV-assisted data collection algorithm based on deep reinforcement learning (DRL) is further proposed to overcome the curse of dimensionality. Extensive simulation results demonstrate that the proposed DRL-based algorithm can significantly reduce the weighted sum of the AoI compared to other baseline algorithms.
A challenge for rescue teams when fighting against wildfire in remote areas is the lack of information, such as the size and images of fire areas. As such, live streaming from Unmanned Aerial Vehicles (UAVs), capturing videos of dynamic fire areas, is crucial for firefighter commanders in any location to monitor the fire situation with quick response. The 5G network is a promising wireless technology to support such scenarios. In this paper, we consider a UAV-to-UAV (U2U) communication scenario, where a UAV at a high altitude acts as a mobile base station (UAV-BS) to stream videos from other flying UAV-users (UAV-UEs) through the uplink. Due to the mobility of the UAV-BS and UAV-UEs, it is important to determine the optimal movements and transmission powers for UAV-BSs and UAV-UEs in real-time, so as to maximize the data rate of video transmission with smoothness and low latency, while mitigating the interference according to the dynamics in fire areas and wireless channel conditions. In this paper, we co-design the video resolution, the movement, and the power control of UAV-BS and UAV-UEs to maximize the Quality of Experience (QoE) of real-time video streaming. To learn the Deep Q-Network (DQN) and Actor-Critic (AC) to maximize the QoE of video transmission from all UAV-UEs to a single UAVBS. Simulation results show the effectiveness of our proposed algorithm in terms of the QoE, delay and video smoothness as compared to the Greedy algorithm.
This work considers unmanned aerial vehicle (UAV) networks for collecting data covertly from ground users. The full-duplex UAV intends to gather critical information from a scheduled user (SU) through wireless communication and generate artificial noise (AN) with random transmit power in order to ensure a negligible probability of the SUs transmission being detected by the unscheduled users (USUs). To enhance the system performance, we jointly design the UAVs trajectory and its maximum AN transmit power together with the user scheduling strategy subject to practical constraints, e.g., a covertness constraint, which is explicitly determined by analyzing each USUs detection performance, and a binary constraint induced by user scheduling. The formulated design problem is a mixed-integer non-convex optimization problem, which is challenging to solve directly, but tackled by our developed penalty successive convex approximation (P-SCA) scheme. An efficient UAV trajectory initialization is also presented based on the Successive Hover-and-Fly (SHAF) trajectory, which also serves as a benchmark scheme. Our examination shows the developed P-SCA scheme significantly outperforms the benchmark scheme in terms of achieving a higher max-min average transmission rate from all the SUs to the UAV.
Real-time applications of energy management strategies (EMSs) in hybrid electric vehicles (HEVs) are the harshest requirements for researchers and engineers. Inspired by the excellent problem-solving capabilities of deep reinforcement learning (DRL), this paper proposes a real-time EMS via incorporating the DRL method and transfer learning (TL). The related EMSs are derived from and evaluated on the real-world collected driving cycle dataset from Transportation Secure Data Center (TSDC). The concrete DRL algorithm is proximal policy optimization (PPO) belonging to the policy gradient (PG) techniques. For specification, many source driving cycles are utilized for training the parameters of deep network based on PPO. The learned parameters are transformed into the target driving cycles under the TL framework. The EMSs related to the target driving cycles are estimated and compared in different training conditions. Simulation results indicate that the presented transfer DRL-based EMS could effectively reduce time consumption and guarantee control performance.