
Perceive, Attend, and Drive: Learning Spatial Attention for Safe Self-Driving

Added by Mengye Ren
Publication date: 2020
Language: English

In this paper, we propose an end-to-end self-driving network featuring a sparse attention module that learns to automatically attend to important regions of the input. The attention module specifically targets motion planning, whereas prior work applied attention only to perception tasks. Learning an attention mask directly targeted at motion planning significantly improves planner safety by focusing computation on the relevant regions. Furthermore, visualizing the attention improves the interpretability of end-to-end self-driving.
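
A minimal sketch of the core idea: a learned sparse spatial attention mask over bird's-eye-view (BEV) features, applied before the planning head. The module name, the top-k sparsification, and the tensor shapes are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn


class SparseSpatialAttention(nn.Module):
    def __init__(self, channels: int, keep_ratio: float = 0.25):
        super().__init__()
        # A 1x1 conv scores each BEV cell for planning relevance.
        self.score = nn.Conv2d(channels, 1, kernel_size=1)
        self.keep_ratio = keep_ratio

    def forward(self, feats: torch.Tensor):
        # feats: (B, C, H, W) BEV feature map from the perception backbone.
        scores = self.score(feats)                    # (B, 1, H, W)
        b, _, h, w = scores.shape
        flat = scores.view(b, -1)
        k = max(1, int(self.keep_ratio * h * w))
        # Keep only the top-k scoring cells; the rest are masked out so that
        # downstream planning computation focuses on those regions.
        thresh = flat.topk(k, dim=1).values[:, -1:]   # k-th largest score per sample
        mask = (flat >= thresh).float().view(b, 1, h, w)
        return feats * mask, mask                     # masked features + mask for visualization
```

The masked features would feed the motion-planning head, while the mask itself can be visualized to inspect where the planner is attending.
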



Related Research

Autonomous driving in multi-agent, dynamic traffic scenarios is challenging: the behaviors of other road agents are uncertain and hard to model explicitly, and the ego vehicle must negotiate with them to drive both safely and efficiently in settings such as giving way, merging, and taking turns. Traditional planning methods are largely rule-based and scale poorly in these complex dynamic scenarios, often leading to reactive or even overly conservative behaviors, and they require tedious human effort to keep working. Recently, deep learning-based methods have shown promising results with better generalization and less hand-engineering effort. However, they are either implemented with supervised imitation learning (IL), which suffers from dataset bias and distribution mismatch, or trained with deep reinforcement learning (DRL) but focused on one specific traffic scenario. In this work, we propose DQ-GAT for scalable and proactive autonomous driving, where graph attention-based networks implicitly model interactions and asynchronous deep Q-learning trains the network end-to-end in an unsupervised manner. Extensive experiments in a high-fidelity driving simulator show that our method better trades off safety and efficiency in both seen and unseen scenarios, achieving higher goal success rates than the baselines (up to 4.7×) with comparable task completion time. Demonstration videos are available at https://caipeide.github.io/dq-gat/.
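
A hedged sketch of the idea behind combining graph attention with Q-learning: surrounding agents are graph nodes, an attention layer aggregates their features to model interactions implicitly, and a Q-head scores discrete driving actions for the ego node. The layer sizes, single-head attention, and action set are illustrative assumptions, not the DQ-GAT architecture itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AgentGATQNet(nn.Module):
    def __init__(self, feat_dim: int = 16, hidden: int = 64, n_actions: int = 5):
        super().__init__()
        self.proj = nn.Linear(feat_dim, hidden)
        self.attn = nn.Linear(2 * hidden, 1)    # scores ego-neighbor pairs
        self.q_head = nn.Linear(2 * hidden, n_actions)

    def forward(self, ego: torch.Tensor, others: torch.Tensor) -> torch.Tensor:
        # ego: (B, feat_dim); others: (B, N, feat_dim) surrounding agents.
        h_ego = self.proj(ego)                               # (B, H)
        h_oth = self.proj(others)                            # (B, N, H)
        pair = torch.cat([h_ego.unsqueeze(1).expand_as(h_oth), h_oth], dim=-1)
        alpha = F.softmax(self.attn(pair), dim=1)            # attention over neighbors
        ctx = (alpha * h_oth).sum(dim=1)                     # interaction-aware context
        return self.q_head(torch.cat([h_ego, ctx], dim=-1))  # Q-values per action
```

The Q-values would then be trained with a standard (here, asynchronous) deep Q-learning target rather than imitation labels.
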
Yanjun Pan, Qin Lin, Het Shah (2020)
The Constrained Iterative Linear Quadratic Regulator (CILQR), a variant of ILQR, has recently been proposed for motion planning of autonomous vehicles to handle constraints such as obstacle avoidance and reference tracking. However, previous work considers either deterministic trajectories or persistent prediction for dynamic target obstacles. Another drawback is a lack of generality: it requires manual weight tuning for different scenarios. In this paper, two significant improvements are made. First, a two-stage uncertainty-aware prediction is proposed. The short-term prediction, with a safety guarantee based on reachability analysis, handles extreme maneuvers by target vehicles. The long-term prediction, leveraging an adaptive least-squares filter, preserves the long-term optimality of the planned trajectory, since using reachability alone for long-term prediction is too pessimistic and makes the planner over-conservative. Second, to cover a wider range of scenarios and avoid tedious case-by-case parameter tuning, this paper designs a scenario-based analytical function that takes the states of the ego vehicle and the target vehicle as input and outputs the weights of the cost function. This allows the ego vehicle to execute multiple behaviors (such as lane-keeping and overtaking) under a single planner. We demonstrate the safety, effectiveness, and real-time performance of the proposed planner in simulations.
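
An illustrative sketch of what a scenario-based weight function might look like: from the ego and target vehicle states it produces cost weights, so a single planner can shift between lane-keeping-like and overtaking-like behavior without per-scenario tuning. The functional form, constants, and weight names are assumptions for illustration, not the paper's actual function.

```python
import math


def scenario_weights(gap: float, rel_speed: float,
                     w_track_base: float = 1.0, w_safe_base: float = 5.0):
    """gap: longitudinal distance to the target vehicle [m]; rel_speed: closing speed [m/s]."""
    # Time-to-collision-like quantity; a small value means the scenario is risky.
    ttc = gap / max(rel_speed, 1e-3) if rel_speed > 0 else float("inf")
    risk = math.exp(-ttc / 3.0)                    # 0 (safe) ... 1 (imminent)
    w_track = w_track_base * (1.0 - 0.8 * risk)    # relax reference tracking when risky
    w_safe = w_safe_base * (1.0 + 4.0 * risk)      # emphasize the obstacle-avoidance cost
    return {"tracking": w_track, "safety": w_safe}
```
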
We learn an interactive vision-based driving policy from pre-recorded driving logs via a model-based approach. A forward model of the world supervises a driving policy that predicts the outcome of any potential driving trajectory. To support learning from pre-recorded logs, we assume that the world is on rails, meaning neither the agent nor its actions influence the environment. This assumption greatly simplifies the learning problem, factorizing the dynamics into a nonreactive world model and a low-dimensional and compact forward model of the ego-vehicle. Our approach computes action-values for each training trajectory using a tabular dynamic-programming evaluation of the Bellman equations; these action-values in turn supervise the final vision-based driving policy. Despite the world-on-rails assumption, the final driving policy acts well in a dynamic and reactive world. At the time of writing, our method ranks first on the CARLA leaderboard, attaining a 25% higher driving score while using 40 times less data. Our method is also an order of magnitude more sample-efficient than state-of-the-art model-free reinforcement learning techniques on navigational tasks in the ProcGen benchmark.
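
A minimal sketch of the "world on rails" evaluation step described above: with a non-reactive world, action-values along a logged trajectory can be computed by a backward tabular Bellman backup over a discretized ego state. The state/action grids, reward table, and transition table below are placeholders, not the paper's exact setup.

```python
import numpy as np

n_states, n_actions, horizon = 100, 6, 10
gamma = 0.95

# Hypothetical tables: reward[t, s, a] and a deterministic ego transition next_state[t, s, a].
reward = np.random.rand(horizon, n_states, n_actions)
next_state = np.random.randint(0, n_states, size=(horizon, n_states, n_actions))

V = np.zeros((horizon + 1, n_states))
Q = np.zeros((horizon, n_states, n_actions))
for t in reversed(range(horizon)):
    # Because neither the agent nor its actions affect the recorded world,
    # only the low-dimensional ego state needs to be rolled forward.
    Q[t] = reward[t] + gamma * V[t + 1][next_state[t]]
    V[t] = Q[t].max(axis=-1)
# The resulting Q-values supervise the vision-based policy on the corresponding observations.
```
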
In this paper, we propose a novel end-to-end learnable network that performs joint perception, prediction, and motion planning for self-driving vehicles and produces interpretable intermediate representations. Unlike existing neural motion planners, our motion planning costs are consistent with our perception and prediction estimates. This is achieved by a novel differentiable semantic occupancy representation that is explicitly used as a cost by the motion planning process. Our network is learned end-to-end from human demonstrations. Experiments on a large-scale manual-driving dataset and in closed-loop simulation show that the proposed model significantly outperforms state-of-the-art planners in imitating human behavior while producing much safer trajectories.
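
A hedged sketch of how a predicted occupancy map can act as a differentiable planning cost: each candidate trajectory is scored by the occupancy probability it drives through plus a progress term. The grid normalization, bilinear lookup, and cost weighting are illustrative assumptions rather than the paper's cost definition.

```python
import torch
import torch.nn.functional as F


def trajectory_cost(occupancy: torch.Tensor, trajs: torch.Tensor,
                    w_occ: float = 10.0, w_prog: float = 1.0) -> torch.Tensor:
    # occupancy: (1, 1, H, W) predicted occupancy probabilities in BEV.
    # trajs: (K, T, 2) candidate waypoints, already normalized to [-1, 1] grid coordinates.
    k, t, _ = trajs.shape
    grid = trajs.view(1, k, t, 2)
    occ = F.grid_sample(occupancy, grid, align_corners=False)  # (1, 1, K, T) bilinear lookup
    collision = occ.squeeze(0).squeeze(0).sum(dim=-1)          # (K,) accumulated occupancy
    progress = trajs[:, -1, 1] - trajs[:, 0, 1]                # crude forward-progress term
    return w_occ * collision - w_prog * progress               # lower cost is better
```

Because the lookup is differentiable, gradients from the planning cost can flow back into the occupancy prediction, which is what keeps the costs consistent with perception and prediction.
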
Humans learn to imitate by observing others, but robot imitation learning generally requires expert demonstrations in the first-person view (FPV), and collecting such FPV videos for every robot can be very expensive. Third-person imitation learning (TPIL) learns action policies by observing other agents in a third-person view (TPV), similar to what humans do, which ultimately allows human and robot demonstration videos in TPV from many different data sources to be used for policy learning. In this paper, we present a TPIL approach for robot tasks with egomotion. Although many ground and aerial robot tasks involve actions with camera egomotion, research on TPIL for such tasks has been limited. Here, FPV and TPV observations are visually very different: FPV shows egomotion, while the agent's appearance is observable only in TPV. To enable better state learning for TPIL, we propose a disentangled representation learning method that uses a dual auto-encoder structure together with a representation permutation loss and a time-contrastive loss to ensure that the state and viewpoint representations are well disentangled. Our experiments show the effectiveness of our approach.
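
A hedged sketch of the two auxiliary losses mentioned above, under the assumption that each view's encoder splits its code into a "state" part and a "viewpoint" part. The exact loss forms, margins, and distance metrics are illustrative, not the paper's definitions.

```python
import torch
import torch.nn.functional as F


def permutation_loss(state_fpv, state_tpv, view_fpv, view_tpv,
                     decoder_fpv, decoder_tpv, frame_fpv, frame_tpv):
    # Swapping state codes across views should still reconstruct each frame,
    # which pushes viewpoint information out of the state code.
    rec_fpv = decoder_fpv(torch.cat([state_tpv, view_fpv], dim=-1))
    rec_tpv = decoder_tpv(torch.cat([state_fpv, view_tpv], dim=-1))
    return F.mse_loss(rec_fpv, frame_fpv) + F.mse_loss(rec_tpv, frame_tpv)


def time_contrastive_loss(anchor, positive, negative, margin: float = 1.0):
    # State codes from nearby timesteps are pulled together, distant ones pushed apart.
    d_pos = (anchor - positive).pow(2).sum(-1)
    d_neg = (anchor - negative).pow(2).sum(-1)
    return F.relu(d_pos - d_neg + margin).mean()
```
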
