
Case Study: Verifying the Safety of an Autonomous Racing Car with a Neural Network Controller

Published by Radoslav Ivanov
Publication date: 2019
Research language: English





This paper describes a verification case study on an autonomous racing car with a neural network (NN) controller. Although several verification approaches have been proposed over the last year, they have only been evaluated on low-dimensional systems or systems with constrained environments. To explore the limits of existing approaches, we present a challenging benchmark in which the NN takes raw LiDAR measurements as input and outputs steering for the car. We train a dozen NNs using two reinforcement learning algorithms and show that the state of the art in verification can handle systems with around 40 LiDAR rays, well short of a typical LiDAR scan with 1081 rays. Furthermore, we perform real experiments to investigate the benefits and limitations of verification with respect to the sim2real gap, i.e., the difference between a system's modeled and real performance. We identify cases, similar to the modeled environment, in which verification is strongly correlated with safe behavior. Finally, we illustrate LiDAR fault patterns that can be used to develop robust and safe reinforcement learning algorithms.




Read also

In this paper, we present a safe deep reinforcement learning system for automated driving. The proposed framework leverages the merits of both rule-based and learning-based approaches for safety assurance. Our safety system consists of two modules, namely handcrafted safety and dynamically-learned safety. The handcrafted safety module is a heuristic safety rule based on common driving practice that ensures a minimum relative gap to a traffic vehicle. On the other hand, the dynamically-learned safety module is a data-driven safety rule that learns safety patterns from driving data. Specifically, the dynamically-learned safety module incorporates a model lookahead beyond the immediate reward of reinforcement learning to predict safety longer into the future. If one of the future states leads to a near-miss or collision, then a negative reward is assigned to the reward function to avoid collision and accelerate the learning process. We demonstrate the capability of the proposed framework in a simulation environment with varying traffic density. Our results show the superior capabilities of the policy enhanced with the dynamically-learned safety module.
Complementarity problems, a class of mathematical optimization problems with orthogonality constraints, are widely used in many robotics tasks, such as locomotion and manipulation, due to their ability to model non-smooth phenomena (e.g., contact dynamics). In this paper, we propose a method to analyze the stability of complementarity systems with neural network controllers. First, we introduce a method to represent neural networks with rectified linear unit (ReLU) activations as the solution to a linear complementarity problem. Then, we show that systems with ReLU network controllers have an equivalent linear complementarity system (LCS) description. Using the LCS representation, we turn the stability verification problem into a linear matrix inequality (LMI) feasibility problem. We demonstrate the approach on several examples, including multi-contact problems and friction models with non-unique solutions.
We have recently proposed two pile loading controllers that learn from human demonstrations: a neural network (NNet) [1] and a random forest (RF) controller [2]. In the field experiments, the RF controller obtained clearly better success rates. In this work, the previous findings are drastically revised by testing summertime-trained controllers in winter conditions. The winter experiments revealed a need for additional sensors, more training data, and a controller that can take advantage of these. Therefore, we propose a revised neural controller (NNetV2), which has a more expressive structure and uses a neural attention mechanism to focus on important parts of the sensor and control signals. Using the same data and sensors to train and test the three controllers, NNetV2 achieves better robustness against drastically changing conditions and a superior success rate. To the best of our knowledge, this is the first work to test a learning-based controller for a heavy-duty machine in drastically varying outdoor conditions and to deliver a high success rate in winter after being trained in summer.
Autonomous car racing is a challenging task in the robotic control area. Traditional modular methods require accurate mapping, localization and planning, which makes them computationally inefficient and sensitive to environmental changes. Recently, deep-learning-based end-to-end systems have shown promising results for autonomous driving/racing. However, they are commonly implemented by supervised imitation learning (IL), which suffers from the distribution mismatch problem, or by reinforcement learning (RL), which requires a huge amount of risky interaction data. In this work, we present a general deep imitative reinforcement learning approach (DIRL), which successfully achieves agile autonomous racing using visual inputs. The driving knowledge is acquired from both IL and model-based RL, where the agent can learn from human teachers as well as perform self-improvement by safely interacting with an offline world model. We validate our algorithm both in a high-fidelity driving simulation and on a real-world 1/20-scale RC-car with limited onboard computation. The evaluation results demonstrate that our method outperforms previous IL and RL methods in terms of sample efficiency and task performance. Demonstration videos are available at https://caipeide.github.io/autorace-dirl/
Zhe Xu, Yichen Zhang (2021)
In this paper, we present a provably correct controller synthesis approach for switched stochastic control systems with metric temporal logic (MTL) specifications with provable probabilistic guarantees. We first present the stochastic control bisimulation function for switched stochastic control systems, which bounds the trajectory divergence between the switched stochastic control system and its nominal deterministic control system in a probabilistic fashion. We then develop a method to compute optimal control inputs by solving an optimization problem for the nominal trajectory of the deterministic control system with robustness against initial state variations and stochastic uncertainties. We implement our robust stochastic controller synthesis approach on both a four-bus power system and a nine-bus power system under generation loss disturbances, with MTL specifications expressing requirements for the grid frequency deviations, wind turbine generator rotor speed variations and the power flow constraints at different power lines.
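One of the related abstracts above states that ReLU activations can be represented as the solution to a linear complementarity problem. A minimal sketch of the standard elementwise identity behind such a representation follows. It is an illustration only, not necessarily the formulation used in that paper: for an input x, the value lam = max(0, x) is the solution of lam >= 0, lam - x >= 0, lam * (lam - x) = 0. The function names `relu` and `is_lcp_solution` are hypothetical helpers introduced here.

```python
def relu(xs):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, x) for x in xs]


def is_lcp_solution(lam, xs, tol=1e-9):
    """Check the complementarity conditions for the LCP with M = I, q = -x:
    lam >= 0, w = lam - x >= 0, and lam . w = 0 (assumed illustrative form)."""
    w = [l - x for l, x in zip(lam, xs)]
    nonneg = all(l >= -tol for l in lam) and all(v >= -tol for v in w)
    complementary = abs(sum(l * v for l, v in zip(lam, w))) <= tol
    return nonneg and complementary


xs = [-2.0, 0.0, 3.5]
print(is_lcp_solution(relu(xs), xs))         # ReLU output satisfies the conditions
print(is_lcp_solution([1.0, 0.0, 3.5], xs))  # a perturbed vector does not
```

The first check passes because each component is either zero (where x is non-positive) or equal to x (where x is positive), so one factor of every product lam * (lam - x) vanishes.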