Traversing a tilted narrow gap has previously been an intractable task for reinforcement learning, mainly due to two challenges. First, searching for feasible trajectories is not trivial because the goal behind the gap is difficult to reach. Second, the error tolerance after Sim2Real transfer is low because the flight speed is high relative to the gap's narrow dimensions. The problem is aggravated by the difficulty of collecting real-world data, given the risk of collision damage. In this paper, we propose an end-to-end reinforcement learning framework that solves this task by addressing both problems. To find dynamically feasible flight trajectories, we use curriculum learning to guide the agent towards the sparse reward behind the obstacle. To tackle the Sim2Real problem, we propose a Sim2Real framework that transfers control commands to a real quadrotor without using any real flight data. To the best of our knowledge, this is the first work to accomplish the gap-traversal task purely with deep reinforcement learning.
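The abstract only describes the curriculum at a high level; as an illustration, here is a minimal sketch of how a curriculum might blend a dense shaping term with the sparse traversal reward. All names, thresholds, and coefficients (curriculum_reward, advance_stage, the 0.25 decay, the 10.0 bonus) are hypothetical and not taken from the paper.

```python
def curriculum_reward(stage: int, dist_to_gap: float, traversed: bool) -> float:
    """Blend a dense shaping term with the sparse traversal reward.

    stage: curriculum index (0 = easiest, higher = closer to the target task)
    dist_to_gap: distance from the quadrotor to the gap center [m]
    traversed: True once the drone has passed through the gap
    """
    shaping_weight = max(0.0, 1.0 - 0.25 * stage)   # shaping fades as stage grows
    dense = -shaping_weight * dist_to_gap           # guides exploration early on
    sparse = 10.0 if traversed else 0.0             # goal behind the obstacle
    return dense + sparse


def advance_stage(stage: int, success_rate: float, threshold: float = 0.8) -> int:
    """Move to the next curriculum stage once the policy is reliable enough."""
    return stage + 1 if success_rate >= threshold else stage
```

In practice the stage could just as well control the gap width or tilt angle; the point is only that the auxiliary signal fades as training approaches the sparse-reward target task.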
We demonstrate the possibility of learning drone swarm controllers that are zero-shot transferable to real quadrotors via large-scale multi-agent end-to-end reinforcement learning. We train policies parameterized by neural networks that are capable of…
This paper presents a deep reinforcement learning method, Double Deep Q-Networks (DDQN), to design an end-to-end vision-based adaptive cruise control (ACC) system. A simulation environment of a highway scene was set up in Unity, a game engine…
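For reference, Double Deep Q-Networks decouple action selection (done by the online network) from action evaluation (done by the target network) when forming the bootstrap target. Below is a minimal, generic PyTorch sketch of that target computation; the function and variable names are placeholders and do not come from the ACC paper above.

```python
import torch

def ddqn_target(reward, next_obs, done, online_net, target_net, gamma=0.99):
    """Double DQN target: select the next action with the online network,
    evaluate it with the target network."""
    with torch.no_grad():
        next_actions = online_net(next_obs).argmax(dim=1, keepdim=True)   # selection
        next_q = target_net(next_obs).gather(1, next_actions).squeeze(1)  # evaluation
        return reward + gamma * (1.0 - done.float()) * next_q
```

Using one network to pick the action and another to score it reduces the overestimation bias of standard DQN bootstrapping.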
Deep reinforcement learning (RL) is a powerful framework for training decision-making models in complex dynamical environments. However, RL can be slow because it learns through repeated interaction with a simulation of the environment. Accelerating RL requires…
The combination of deep neural network models and reinforcement learning algorithms can make it possible to learn policies for robotic behaviors that directly read in raw sensory inputs, such as camera images, effectively subsuming both estimation and…
End-to-end approaches to autonomous driving commonly rely on expert demonstrations. Although humans are good drivers, they are not good coaches for end-to-end algorithms that demand dense on-policy supervision. On the contrary, automated experts that…