Reactive and Safe Road User Simulations using Neural Barrier Certificates

Published by: Yue Meng

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Yue Meng, Zengyi Qin, Chuchu Fan

Abstract

Reactive and safe agent modeling is important for today's traffic simulator designs and safe planning applications. In this work, we propose a reactive agent model that ensures safety without compromising the original purposes, by learning only high-level decisions from expert data together with a low-level decentralized controller guided by jointly learned decentralized barrier certificates. Empirical results show that our learned road user simulation models achieve a significant improvement in safety compared to state-of-the-art imitation learning and pure control-based methods, while remaining similar to human agents, with smaller errors relative to the expert data. Moreover, our learned reactive agents are shown to generalize better to unseen traffic conditions and to react better to other road users, and can therefore help in understanding challenging planning problems pragmatically.
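
The split between a high-level decision and a barrier-guided low-level controller can be illustrated with a small sketch. Everything here is an assumption made for illustration: the paper learns both the controller and the barrier certificate with neural networks, whereas this sketch uses a hand-written quadratic barrier around one circular obstacle, single-integrator dynamics `x_dot = u`, a go-to-goal reference as a stand-in for the high-level decision, and a closed-form projection in place of a learned controller. The names `barrier`, `safe_control`, and `simulate` are hypothetical.

```python
import numpy as np

def barrier(x, obstacle, radius):
    """Hand-written barrier: positive outside the obstacle disc, negative inside."""
    d = x - obstacle
    return float(d @ d - radius**2)

def barrier_grad(x, obstacle):
    """Gradient of the barrier with respect to the agent position x."""
    return 2.0 * (x - obstacle)

def safe_control(x, u_ref, obstacle, radius, alpha=1.0):
    """Minimally modify u_ref so that grad_h . u + alpha * h >= 0.

    Assumes single-integrator dynamics x_dot = u, so the constraint is linear
    in u and the closest feasible control has a closed form.
    """
    h = barrier(x, obstacle, radius)
    a = barrier_grad(x, obstacle)
    violation = -(a @ u_ref + alpha * h)
    if violation <= 0.0 or not np.any(a):
        return u_ref  # the reference control already satisfies the condition
    return u_ref + (violation / (a @ a)) * a

def simulate(x0, goal, obstacle, radius, steps=400, dt=0.02, filtered=True):
    """Euler rollout of a go-to-goal reference, with or without the filter."""
    x = np.array(x0, dtype=float)
    min_h = barrier(x, obstacle, radius)
    for _ in range(steps):
        u_ref = goal - x  # stand-in for the learned high-level decision
        u = safe_control(x, u_ref, obstacle, radius) if filtered else u_ref
        x = x + dt * u
        min_h = min(min_h, barrier(x, obstacle, radius))
    return x, min_h
```

Calling `simulate` with and without `filtered=True` on a start and goal placed on opposite sides of the obstacle shows the effect: the unfiltered reference cuts through the disc (the returned minimum barrier value is negative), while the filtered rollout keeps the barrier nonnegative. For this particular quadratic barrier and Euler stepping, that holds as long as `alpha * dt <= 1`.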

Zengyi Qin, Kaiqing Zhang, Yuxiao Chen, 2021

We study the multi-agent safe control problem where agents should avoid collisions with static obstacles and with each other while reaching their goals. Our core idea is to learn the multi-agent control policy jointly with the control barrier functions that serve as safety certificates. We propose a novel joint-learning framework that can be implemented in a decentralized fashion, with generalization guarantees for certain function classes. Such a decentralized framework can adapt to an arbitrarily large number of agents. Building upon this framework, we further improve the scalability by incorporating neural network architectures that are invariant to the quantity and permutation of neighboring agents. In addition, we propose a new spontaneous policy refinement method to further enforce the certificate condition during testing. We provide extensive experiments to demonstrate that our method significantly outperforms other leading multi-agent control approaches in terms of maintaining safety and completing original tasks. Our approach also shows exceptional generalization capability in that the control policy can be trained with 8 agents in one scenario and then used in other scenarios with up to 1024 agents in complex multi-agent environments and dynamics.
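
The property of being "invariant to the quantity and permutation of neighboring agents" can be made concrete with a toy example. This is only a sketch under stated assumptions: the abstract describes learned neural architectures, while here a hand-written pairwise margin with min-pooling stands in for the learned barrier. The function names, `safe_dist`, and `sensing_radius` are hypothetical.

```python
import numpy as np

def agent_barrier(own_pos, neighbor_pos, safe_dist=1.0):
    """Decentralized barrier value for one agent, computed from its neighbors only.

    Uses the hand-written pairwise margin ||p_i - p_j||^2 - safe_dist^2 and
    min-pooling, so the result does not depend on neighbor order or count.
    """
    neighbor_pos = np.asarray(neighbor_pos, dtype=float).reshape(-1, 2)
    if neighbor_pos.shape[0] == 0:
        return np.inf  # no neighbors observed: nothing to collide with
    diff = neighbor_pos - np.asarray(own_pos, dtype=float)
    margins = np.sum(diff**2, axis=1) - safe_dist**2
    return float(margins.min())

def all_agents_safe(positions, safe_dist=1.0, sensing_radius=3.0):
    """Evaluate every agent's own barrier using only neighbors within range."""
    positions = np.asarray(positions, dtype=float)
    values = []
    for i, p in enumerate(positions):
        others = np.delete(positions, i, axis=0)
        near = others[np.linalg.norm(others - p, axis=1) <= sensing_radius]
        values.append(agent_barrier(p, near, safe_dist))
    return all(v >= 0.0 for v in values), values
```

Because each agent only pools over whatever neighbors fall inside its sensing radius, the same function applies unchanged to 2 agents or to many, which is the structural idea behind training on a few agents and deploying on more.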

Multiagent Systems, Artificial Intelligence, Systems and Control

Learning Safe Neural Network Controllers with Barrier Certificates

Hengjun Zhao, Xia Zeng, Taolue Chen, 2020

We provide a novel approach to synthesize controllers for nonlinear continuous dynamical systems with control against safety properties. The controllers are based on neural networks (NNs). To certify the safety property we utilize barrier functions, which are represented by NNs as well. We train the controller-NN and barrier-NN simultaneously, achieving a verification-in-the-loop synthesis. We provide a prototype tool, nncontroller, with a number of case studies. The experimental results confirm the feasibility and efficacy of our approach.
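
Training a controller and a barrier function simultaneously usually means minimizing a single loss that penalizes violations of the barrier conditions on sampled states. The sketch below only evaluates such a loss for a one-dimensional toy system and is not the nncontroller implementation. The system `x_dot = x + u`, the linear feedback `u = -k * x`, the quadratic barrier `B(x) = x^2 - c`, the sets, the hinge form, and the sign convention (`B <= 0` on the initial set, `B > 0` on the unsafe set, negative Lie derivative on the zero level set) are all assumptions chosen for illustration.

```python
import numpy as np

def barrier(x, c=1.0):
    """Candidate barrier B(x) = x^2 - c (stand-in for a barrier network)."""
    return x**2 - c

def barrier_dx(x):
    """Derivative of the candidate barrier with respect to x."""
    return 2.0 * x

def closed_loop(x, k):
    """x_dot = x + u with linear feedback u = -k * x (stand-in for a controller network)."""
    return x - k * x

def certificate_loss(k, c=1.0, margin=1e-2, n=201):
    """Sum of hinge penalties for the three barrier conditions on sampled states."""
    init = np.linspace(-0.5, 0.5, n)                      # initial set
    unsafe = np.concatenate([np.linspace(-3.0, -2.0, n),  # unsafe set
                             np.linspace(2.0, 3.0, n)])
    boundary = np.array([-np.sqrt(c), np.sqrt(c)])        # zero level set of B
    loss_init = np.maximum(barrier(init, c), 0.0).mean()
    loss_unsafe = np.maximum(margin - barrier(unsafe, c), 0.0).mean()
    lie = barrier_dx(boundary) * closed_loop(boundary, k)  # dB/dt on the boundary
    loss_lie = np.maximum(lie + margin, 0.0).mean()
    return float(loss_init + loss_unsafe + loss_lie)
```

A stabilizing gain such as `k = 2.0` drives every penalty to zero, while `k = 0.0` (no control) leaves a positive Lie-derivative penalty. A zero sampled loss is not a proof on its own, which is presumably why the abstract pairs training with verification in the loop.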

Systems and Control, Artificial Intelligence, Machine Learning

Learning Barrier Certificates: Towards Safe Reinforcement Learning with Zero Training-time Violations

Yuping Luo, Tengyu Ma, 2021

Training-time safety violations have been a major concern when we deploy reinforcement learning algorithms in the real world. This paper explores the possibility of safe RL algorithms with zero training-time safety violations in the challenging setting where we are only given a safe but trivial-reward initial policy, without any prior knowledge of the dynamics model or additional offline data. We propose an algorithm, Co-trained Barrier Certificate for Safe RL (CRABS), which iteratively learns barrier certificates, dynamics models, and policies. The barrier certificates, learned via adversarial training, ensure the policy's safety assuming a calibrated learned dynamics model. We also add a regularization term to encourage larger certified regions to enable better exploration. Empirical simulations show that zero safety violations are already challenging for a suite of simple environments with only 2-4 dimensional state space, especially if high-reward policies have to visit regions near the safety boundary. Prior methods require hundreds of violations to achieve decent rewards on these tasks, whereas our proposed algorithms incur zero violations.

Machine Learning, Artificial Intelligence, Robotics

Safe Nonlinear Control Using Robust Neural Lyapunov-Barrier Functions

Charles Dawson, Zengyi Qin, Sicun Gao, 2021

Safety and stability are common requirements for robotic control systems; however, designing safe, stable controllers remains difficult for nonlinear and uncertain models. We develop a model-based learning approach to synthesize robust feedback controllers with safety and stability guarantees. We take inspiration from robust convex optimization and Lyapunov theory to define robust control Lyapunov barrier functions that generalize despite model uncertainty. We demonstrate our approach in simulation on problems including car trajectory tracking, nonlinear control with obstacle avoidance, satellite rendezvous with safety constraints, and flight control with a learned ground effect model. Simulation results show that our approach yields controllers that match or exceed the capabilities of robust MPC while reducing computational costs by an order of magnitude.

Systems and Control, Robotics

Barrier Certificates Revisited

Liyun Dai, Ting Gan, Bican Xia, 2013

A barrier certificate can separate the state space of a considered hybrid system (HS) into safe and unsafe parts according to the safety property to be verified. Therefore this notion has been widely used in the verification of HSs. A stronger condition on barrier certificates means that only less expressive barrier certificates can be synthesized. On the other hand, synthesizing more expressive barrier certificates often means high complexity. In [9], Kong et al. considered how to relax the condition of barrier certificates while still keeping their convexity, so that one can synthesize more expressive barrier certificates efficiently using semi-definite programming (SDP). In this paper, we first discuss how to relax the condition of barrier certificates in a general way, while still keeping their convexity. In particular, one can then flexibly use different weaker conditions to efficiently synthesize different kinds of more expressive barrier certificates using SDP. These barriers give more opportunities to verify the considered system. We also show how to combine two functions to form a combined barrier certificate that proves a safety property under consideration, even though neither of them can be used as a barrier certificate separately under any relaxed condition. Another contribution of this paper is that we discuss how to discover certificates from the general relaxed condition by SDP. In particular, we focus on how to avoid, by symbolic checking, the unsoundness caused by numerical errors in SDP.
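
For orientation, the kind of condition being relaxed can be written out for a single continuous mode. The notation below is an assumption and is not taken from the abstract: dynamics $\dot{x} = f(x)$, state set $X$, initial set $X_0$, unsafe set $X_u$, and a candidate function $B$. A commonly used convex form of the barrier certificate conditions is:

```latex
\begin{aligned}
  B(x) &\le 0 && \forall x \in X_0, \\
  B(x) &> 0  && \forall x \in X_u, \\
  \frac{\partial B}{\partial x}(x)\, f(x) &\le 0 && \forall x \in X .
\end{aligned}
```

One well-known weaker replacement for the third line, given here as an illustration of a relaxation that stays convex in $B$ for a fixed constant $\lambda \in \mathbb{R}$ and not necessarily as the condition proposed in this paper, is the exponential-type condition:

```latex
\frac{\partial B}{\partial x}(x)\, f(x) - \lambda\, B(x) \le 0 \qquad \forall x \in X .
```

It still implies safety: along any trajectory, $\frac{d}{dt} B(x(t)) \le \lambda B(x(t))$ together with $B(x(0)) \le 0$ gives $B(x(t)) \le B(x(0))\, e^{\lambda t} \le 0$, so the trajectory can never reach a state where $B > 0$.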

Systems and Control