
Centralized Cooperation for Connected and Automated Vehicles at Intersections by Proximal Policy Optimization

Published by Yang Guan
Publication date: 2019
Research field: Informatics Engineering
Language: English





Connected vehicles will change the modes of future transportation management and organization, especially at intersections without traffic lights. Centralized coordination methods globally coordinate vehicles approaching the intersection from all directions by considering their states altogether. However, they need substantial computational resources, since they rely on a centralized controller to optimize the trajectories of all approaching vehicles in real time. In this paper, we propose a centralized coordination scheme for automated vehicles at an intersection without traffic signals, using reinforcement learning (RL) to address the low computational efficiency of current centralized coordination methods. We first propose an RL training algorithm, model accelerated proximal policy optimization (MA-PPO), which incorporates a prior model into the proximal policy optimization (PPO) algorithm to accelerate the learning process in terms of sample efficiency. Then we present the design of the state, action, and reward to formulate centralized coordination as an RL problem. Finally, we train a coordination policy in a simulation setting and compare its computing time and traffic efficiency with those of a coordination scheme based on model predictive control (MPC). Results show that our method spends only 1/400 of the computing time of MPC and increases the efficiency of the intersection by 4.5 times.
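For readers unfamiliar with PPO, the following is a minimal sketch of the standard PPO clipped surrogate objective that the abstract builds on. It is not the paper's MA-PPO, whose prior-model component is not described here. The function name `ppo_clip_objective`, its argument names, and the default `clip_eps=0.2` are illustrative assumptions, not taken from the paper.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of standard PPO (to be maximized).

    All names and the default clip_eps are illustrative assumptions;
    this is not the MA-PPO variant proposed in the paper.
    """
    if not (len(new_log_probs) == len(old_log_probs) == len(advantages)):
        raise ValueError("inputs must have equal length")
    if not advantages:
        raise ValueError("inputs must be non-empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio between the updated policy and the old policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

When the new and old policies agree, the ratio is 1 and the objective reduces to the mean advantage. Clipping only removes the incentive to move the ratio further outside the interval in the direction that would increase the objective.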




Read also

Connected and automated vehicles have shown great potential in improving traffic mobility and reducing emissions, especially at unsignalized intersections. Previous research has shown that vehicle passing order is the key factor influencing intersection traffic mobility. In this paper, we propose a graph-based cooperation method to formalize the conflict-free scheduling problem at an unsignalized intersection. Based on graphical analysis, the vehicles' trajectory conflict relationship is modeled as a conflict directed graph and a coexisting undirected graph. Then, two graph-based methods are proposed to find the vehicle passing order. The first is an improved depth-first spanning tree algorithm, which aims to find the locally optimal passing order vehicle by vehicle. The second, a novel method, is a minimum clique cover algorithm, which identifies the globally optimal solution. Finally, a distributed control framework and communication topology are presented to realize the conflict-free cooperation of vehicles. Extensive numerical simulations are conducted for various numbers of vehicles and traffic volumes, and the simulation results demonstrate the effectiveness of the proposed algorithms.
The non-signalized intersection is a typical and common scenario for connected and automated vehicles (CAVs). Balancing safety and efficiency there remains difficult for researchers. To improve the original Responsibility Sensitive Safety (RSS) driving strategy at non-signalized intersections, we propose a new strategy in this paper, based on right-of-way assignment (RWA). The performance of the RSS strategy, the cooperative driving strategy, and the RWA-based strategy is tested and compared. Testing results indicate that our strategy yields better traffic efficiency than the RSS strategy, but is not as satisfactory as the cooperative driving strategy because of its limited communication range and lack of long-term planning. However, our new strategy requires much lower communication costs among vehicles.
At the current level of evolution of Soccer 3D, motion control is a key factor in a team's performance. Recent works take advantage of model-free approaches based on machine learning to exploit robot dynamics in order to obtain faster locomotion skills, achieving running policies and, therefore, opening a new research direction in the Soccer 3D environment. In this work, we present a methodology based on deep reinforcement learning that learns running skills without any prior knowledge, using a neural network whose inputs are related to the robot's dynamics. Our results outperform the previous state-of-the-art sprint velocity reported in the Soccer 3D literature by a significant margin. The method also demonstrates improved sample efficiency, learning how to run in just a few hours. We report our results by analyzing the training procedure and evaluating the policies in terms of speed, reliability, and human similarity. Finally, we present the key factors that led us to improve on previous results and share some ideas for future work.
Proving-ground, or on-track, testing has been an essential part of the testing and validation process for connected and autonomous vehicles (CAVs). Several world-class CAV proving grounds, such as Mcity at the University of Michigan and The Castle of Waymo, have already been built, and many more are currently under construction. In this paper, we propose the first optimization approach to CAV proving ground design and refer to any such CAV-centric design problem as Xcity, to emphasize the enormous investment, the multi-dimensional spatial considerations, and the immense construction effort emerging globally. Inspired by recent progress on traffic encounter clustering, we further define road assets as fundamental building blocks and formulate the whole design process as nonlinear optimization problems. We show that such a framework can be used to adaptively generate CAV proving ground designs with optimized capability and flexibility, and can further be extended to evaluate an existing Xcity design.
We propose a fully distributed control system architecture, amenable to in-vehicle implementation, that aims to safely coordinate connected and automated vehicles (CAVs) at road intersections. For control purposes, we build upon a fully distributed model predictive control approach, in which the agents solve a nonconvex optimal control problem (OCP) locally and synchronously, and exchange their optimized trajectories via vehicle-to-vehicle (V2V) communication. To enable a fast solution of the nonconvex OCPs, we apply the penalty convex-concave procedure, which aims to solve a convexified version of the original OCP. For experimental evaluation, we complement the predictive controller with a localization layer, which is in charge of self-localization and the estimation of joint collision points with other agents. Moreover, we devise a proprietary communication protocol to exchange trajectories with other agents. Experimental tests reveal the efficacy of the proposed control system architecture.