
Meta Reinforcement Learning-Based Lane Change Strategy for Autonomous Vehicles

Published by: Fei Ye
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





Recent advances in supervised learning and reinforcement learning have provided new opportunities to apply related methodologies to automated driving. However, there are still challenges to achieving automated driving maneuvers in dynamically changing environments. Supervised learning algorithms such as imitation learning can generalize to new environments by training on a large amount of labeled data; however, it is often impractical or cost-prohibitive to obtain sufficient data for each new environment. Although reinforcement learning methods can mitigate this data-dependency issue by training the agent in a trial-and-error way, they still need to re-train policies from scratch when adapting to new environments. In this paper, we therefore propose a meta reinforcement learning (MRL) method to improve the agent's generalization capability for automated lane-changing maneuvers in different traffic environments, which are formulated as different traffic congestion levels. Specifically, we train the model at light to moderate traffic densities and test it under a new, heavy traffic-density condition. We use both collision rate and success rate to quantify the safety and effectiveness of the proposed model. A benchmark model is developed based on a pretraining method, using the same network structure and training tasks as our proposed model for a fair comparison. The simulation results show that the proposed method achieves an overall success rate up to 20% higher than the benchmark model when generalized to the new environment of heavy traffic density, and reduces the collision rate by up to 18% compared with the benchmark model. Finally, the proposed model shows more stable and efficient generalization when adapting to the new environment, achieving a 100% success rate and 0% collision rate with only a few steps of gradient updates.
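The abstract does not name a specific meta-learning algorithm, but adaptation via "a few steps of gradient updates" is characteristic of MAML-style methods. The sketch below shows such an inner/outer loop under that assumption; `sample_tasks`, `rollout`, and `pg_loss` are hypothetical placeholders for traffic-density task sampling, environment rollouts with given weights, and a policy-gradient loss, none of which come from the paper.

```python
import torch

# Hypothetical MAML-style meta-update for a lane-change policy.
# All helper names are illustrative placeholders, not the paper's code.
def maml_meta_step(policy, meta_opt, sample_tasks, rollout, pg_loss,
                   inner_lr=0.01, n_tasks=4, inner_steps=1):
    meta_loss = 0.0
    for task in sample_tasks(n_tasks):            # light/moderate densities
        fast = [p.clone() for p in policy.parameters()]
        for _ in range(inner_steps):              # inner-loop adaptation
            loss = pg_loss(rollout(task, fast))
            grads = torch.autograd.grad(loss, fast, create_graph=True)
            fast = [w - inner_lr * g for w, g in zip(fast, grads)]
        meta_loss = meta_loss + pg_loss(rollout(task, fast))  # post-adaptation
    meta_opt.zero_grad()
    meta_loss.backward()                          # second-order meta-gradient
    meta_opt.step()
```

At deployment in an unseen heavy-traffic task, only the inner loop would run: a few gradient steps specialize the meta-trained initialization, which matches the few-step adaptation the abstract reports.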




Read also

Fei Ye, Xuxin Cheng, Pin Wang (2020)
Lane-change maneuvers are commonly executed by drivers to follow a certain routing plan, overtake a slower vehicle, adapt to a merging lane ahead, etc. However, improper lane-change behaviors can be a major cause of traffic flow disruptions and even crashes. While many rule-based methods have been proposed to solve lane-change problems for autonomous driving, they tend to exhibit limited performance due to the uncertainty and complexity of the driving environment. Machine learning-based methods offer an alternative approach, as deep reinforcement learning (DRL) has shown promising success in many application domains, including robotic manipulation, navigation, and playing video games. However, applying DRL to autonomous driving still faces many practical challenges in terms of slow learning rates, sample inefficiency, and safety concerns. In this study, we propose an automated lane-change strategy using proximal policy optimization-based deep reinforcement learning, which shows great advantages in learning efficiency while still maintaining stable performance. The trained agent is able to learn a smooth, safe, and efficient driving policy for making lane-change decisions (i.e., when and how) in challenging situations such as dense traffic scenarios. The effectiveness of the proposed policy is validated using metrics of task success rate and collision rate. The simulation results demonstrate that lane-change maneuvers can be learned and executed in a safe, smooth, and efficient manner.
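For reference, the learning objective named above, PPO's clipped surrogate loss, is compact enough to state directly. This is the standard formulation, not code from the paper; tensor shapes and the 0.2 clip range are common illustrative defaults.

```python
import torch

# Standard PPO clipped-surrogate loss over a batch of transitions.
def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    ratio = torch.exp(logp_new - logp_old)          # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()    # negated for gradient descent
```

Clipping the probability ratio keeps each update close to the data-collecting policy, which is the source of the training stability the abstract highlights.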
Although deep reinforcement learning (deep RL) methods have many strengths that are favorable for autonomous driving, real deep RL applications in autonomous driving have been slowed down by the modeling gap between the source (training) domain and the target (deployment) domain. Unlike current policy transfer approaches, which are generally limited to using uninterpretable neural network representations as the transferred features, we propose to transfer concrete kinematic quantities in autonomous driving. The proposed robust-control-based (RC) generic transfer architecture, which we call RL-RC, incorporates a transferable hierarchical RL trajectory planner and a robust tracking controller based on a disturbance observer (DOB). The deep RL policies trained with a known nominal dynamics model are transferred directly to the target domain, and DOB-based robust tracking control is applied to tackle the modeling gap, including vehicle dynamics errors and external disturbances such as side forces. We provide simulations validating the capability of the proposed method to achieve zero-shot transfer across multiple driving scenarios such as lane keeping, lane changing, and obstacle avoidance.
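The DOB idea can be summarized conceptually: treat the gap between the measured state update and the nominal model's prediction as an input-equivalent disturbance, then low-pass filter the estimate. The sketch below is a generic first-order version under that reading; `nominal_step` and all constants are illustrative assumptions, not the paper's design.

```python
# Conceptual first-order disturbance observer (DOB) update.
# nominal_step(x, u, dt) is a hypothetical one-step nominal dynamics model.
def dob_update(d_hat, x_prev, x_now, u_prev, nominal_step, alpha=0.9, dt=0.1):
    # Residual between measured and nominally predicted state change,
    # interpreted as a lumped disturbance in state-derivative units.
    residual = (x_now - nominal_step(x_prev, u_prev, dt)) / dt
    return alpha * d_hat + (1.0 - alpha) * residual  # first-order Q-filter
```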
Seongjin Choi (2021)
Originally, the decision and control of a vehicle's lane change rested with the human driver. In previous studies, the lane-changing decision-making of human drivers was mainly modeled as maximizing the individual's benefit. However, the lane-changing behavior of these human drivers can sometimes have a bad influence on the overall traffic flow. As technology for autonomous vehicles develops, lane-changing actions as well as lane-changing decision-making fall within the control category of autonomous vehicles. However, since many of the current lane-changing decision algorithms of autonomous vehicles are based on the human driver model, it is hard to know the potential traffic impact of such lane changes. Therefore, in this study, we focus on lane-change decision-making that considers the traffic flow, and accordingly we study a lane-change control system that accounts for the whole traffic flow. The lane-change control system predicts the future traffic situation through the cell transmission model, one of the most popular macroscopic traffic simulation models, and determines the lane-change probability of each lane that minimizes the total time delay through a genetic algorithm. The control system then conveys the change probability to the vehicle. In the macroscopic simulation results, the proposed control system reduced the overall travel time delay. The proposed system is also applied to a microscopic traffic simulation, the oversaturated freeway traffic flow algorithm (OFFA), to evaluate its potential performance when applied to an actual traffic system. In the flow-density relation, the maximum traffic flow is shown to be increased, and the points in the congested region are also greatly reduced. Overall, the travel time of individual vehicles was reduced.
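For readers unfamiliar with the prediction model, a minimal single-lane cell transmission model update looks roughly like the sketch below; parameter values are placeholders, and the paper's multi-lane formulation and genetic-algorithm search over lane-change probabilities are omitted.

```python
import numpy as np

# One Godunov-style step of a single-lane cell transmission model (CTM).
def ctm_step(n, v=0.9, w=0.3, Q=20.0, N=100.0, inflow=15.0):
    """n: vehicles per cell; v, w: free-flow / backward-wave ratios per step;
    Q: max flow per step; N: jam capacity per cell. All values illustrative."""
    demand = np.minimum(v * n, Q)           # what each cell can send downstream
    supply = np.minimum(w * (N - n), Q)     # what each cell can receive
    y = np.minimum(demand[:-1], supply[1:]) # realized inter-cell flows
    n_next = n.copy()
    n_next[:-1] -= y
    n_next[1:] += y
    n_next[0] += min(inflow, supply[0])     # boundary inflow
    n_next[-1] -= demand[-1]                # free outflow at the far end
    return n_next
```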
Autonomous Vehicles (AVs) are required to operate safely and efficiently in dynamic environments. To this end, AVs equipped with Joint Radar-Communications (JRC) functions can enhance driving safety by utilizing both radar detection and data communication functions. However, optimizing the performance of an AV system with two different functions under the uncertainty and dynamics of the surrounding environment is very challenging. In this work, we first propose an intelligent optimization framework based on the Markov Decision Process (MDP) to help the AV make optimal decisions in selecting JRC operation functions under the dynamics and uncertainty of the surrounding environment. We then develop an effective learning algorithm leveraging recent advances in deep reinforcement learning techniques to find the optimal policy for the AV without requiring any prior information about the surrounding environment. Furthermore, to make our proposed framework more scalable, we develop a Transfer Learning (TL) mechanism that enables the AV to leverage valuable experiences to accelerate the training process when it moves to a new environment. Extensive simulations show that the proposed transferable deep reinforcement learning framework reduces the AV's obstacle miss-detection probability by up to 67% compared to other conventional deep reinforcement learning approaches.
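The MDP at the core of the framework is simple to state: at each step the AV chooses between its radar and communication functions, with a reward that trades off miss-detection risk against data throughput. The toy tabular Q-learning update below is a stand-in for the paper's deep RL agent; the state space, reward, and all names are invented for illustration.

```python
import numpy as np

# Toy stand-in for the deep RL agent: tabular Q-learning over a tiny
# set of hypothetical environment contexts and the two JRC actions.
N_STATES = 4                      # hypothetical discretized contexts
ACTIONS = ("radar", "comm")       # select sensing or communication
Q = np.zeros((N_STATES, len(ACTIONS)))

def q_update(s, a, r, s_next, alpha=0.1, gamma=0.95):
    """One Bellman backup; r encodes the detection/throughput trade-off."""
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
```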
While conventional reinforcement learning focuses on designing agents that can perform one task, meta-learning aims instead to solve the problem of designing agents that can generalize to different tasks (e.g., environments, obstacles, and goals) that were not considered during the design or training of these agents. In this spirit, in this paper we consider the problem of training a provably safe Neural Network (NN) controller for uncertain nonlinear dynamical systems that can generalize to new tasks not present in the training data while preserving strong safety guarantees. Our approach is to learn a set of NN controllers during the training phase. When the task becomes available at runtime, our framework carefully selects a subset of these NN controllers and composes them to form the final NN controller. Critical to our approach is the ability to compute a finite-state abstraction of the nonlinear dynamical system. This abstract model captures the behavior of the closed-loop system under all possible NN weights, and it is used to train the NNs and compose them when the task becomes available. We provide theoretical guarantees that govern the correctness of the resulting NN. We evaluated our approach on the problem of controlling a wheeled robot in cluttered environments that were not present in the training data.
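Structurally, the runtime composition amounts to dispatching on the abstract state; a hypothetical sketch of that pattern follows, with all names illustrative and none of the paper's selection logic or safety machinery included.

```python
# Hypothetical composition step: map each state of the finite-state
# abstraction to one pre-trained NN controller and dispatch at runtime.
def compose(controllers, abstraction, assignment):
    """controllers: list of trained policies; abstraction: x -> abstract
    state; assignment: abstract state -> controller index, chosen once
    the task becomes known."""
    def composed(x):
        return controllers[assignment[abstraction(x)]](x)
    return composed
```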
