The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems


Published by: Spencer M. Richards

Publication date: 2018

Research field: Informatics Engineering

Research language: English

Authors: Spencer M. Richards - Felix Berkenkamp - Andreas Krause

Systems and Control, Machine Learning, Robotics

Abstract

Learning algorithms have shown considerable prowess in simulation by allowing robots to adapt to uncertain environments and improve their performance. However, such algorithms are rarely used in practice on safety-critical systems, since the learned policy typically does not yield any safety guarantees. That is, the required exploration may cause physical harm to the robot or its environment. In this paper, we present a method to learn accurate safety certificates for nonlinear, closed-loop dynamical systems. Specifically, we construct a neural network Lyapunov function and a training algorithm that adapts it to the shape of the largest safe region in the state space. The algorithm relies only on knowledge of inputs and outputs of the dynamics, rather than on any specific model structure. We demonstrate our method by learning the safe region of attraction for a simulated inverted pendulum. Furthermore, we discuss how our method can be used in safe learning algorithms together with statistical models of dynamical systems.
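
The idea of fitting a Lyapunov level set to the largest safe region can be illustrated with a small sampled check. The sketch below is not the paper's neural network method: it assumes a fixed quadratic candidate `V` and a hypothetical one-dimensional system (`step`), and only shows how input-output evaluations of the dynamics yield a level-set estimate of the region of attraction.

```python
def largest_safe_level(V, step, samples):
    """Largest sampled level c such that every sampled x with 0 < V(x) <= c
    satisfies the decrease condition V(step(x)) < V(x)."""
    safe_level = 0.0
    for x in sorted(samples, key=V):
        v = V(x)
        if v == 0.0:
            continue  # the equilibrium is exempt from the decrease condition
        if V(step(x)) >= v:
            break  # first violation: no larger level set can be certified
        safe_level = v
    return safe_level


def step(x, dt=0.01):
    # Hypothetical toy dynamics dx/dt = -x + x**3 (Euler-discretised);
    # the origin is stable for |x| < 1 and unstable beyond.
    return x + dt * (-x + x ** 3)


def V(x):
    # Fixed quadratic Lyapunov candidate (an assumption, not a learned network)
    return x * x


samples = [i * 0.05 for i in range(-40, 41)]
c = largest_safe_level(V, step, samples)
```

Only calls to `step` are needed, which mirrors the abstract's point that the dynamics are accessed through inputs and outputs rather than through a specific model structure. The estimate is only as fine as the sample grid.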

114 - Shaoru Chen, Mahyar Fazlyab, Manfred Morari 2020

We propose a learning-based method for Lyapunov stability analysis of piecewise affine dynamical systems in feedback with piecewise affine neural network controllers. The proposed method consists of an iterative interaction between a learner and a verifier, where in each iteration, the learner uses a collection of samples of the closed-loop system to propose a Lyapunov function candidate as the solution to a convex program. The learner then queries the verifier, which solves a mixed-integer program to either validate the proposed Lyapunov function candidate or reject it with a counterexample, i.e., a state where the stability condition fails. This counterexample is then added to the sample set of the learner to refine the set of Lyapunov function candidates. We design the learner and the verifier based on the analytic center cutting-plane method, in which the verifier acts as the cutting-plane oracle to refine the set of Lyapunov function candidates. We show that when the set of Lyapunov functions is full-dimensional in the parameter space, the overall procedure finds a Lyapunov function in a finite number of iterations. We demonstrate the utility of the proposed method in searching for quadratic and piecewise quadratic Lyapunov functions.
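
The learner-verifier interaction described above can be sketched as a counterexample-guided loop. This is a simplified illustration under stated assumptions, not the authors' algorithm: the convex program is replaced by enumeration over a hypothetical finite candidate list, the mixed-integer verifier by a grid search, and the closed loop `step` is a made-up one-dimensional piecewise affine map.

```python
def step(x):
    # Hypothetical piecewise affine closed loop (not from the paper)
    return -2.0 * x if x >= 0 else -0.25 * x


def V(x, a):
    # Piecewise quadratic candidate with a single free parameter a
    return a * x * x if x >= 0 else x * x


def decreases(x, a):
    # Stability condition at state x: V must strictly decrease along the dynamics
    return V(step(x), a) < V(x, a)


def learner(samples, candidates):
    # Propose the first candidate consistent with all samples seen so far
    for a in candidates:
        if all(decreases(x, a) for x in samples):
            return a
    return None


def verifier(a, grid):
    # Return a state where the condition fails, or None if the grid passes
    for x in grid:
        if x != 0.0 and not decreases(x, a):
            return x
    return None


def learn_lyapunov(candidates, grid, samples):
    samples = list(samples)
    while True:
        a = learner(samples, candidates)
        if a is None:
            return None, samples  # no candidate fits the samples
        counterexample = verifier(a, grid)
        if counterexample is None:
            return a, samples  # candidate validated on the grid
        samples.append(counterexample)  # refine the learner's sample set


grid = [i / 10 for i in range(-20, 21)]
a, samples = learn_lyapunov([1.0, 2.0, 8.0, 32.0], grid, [-1.0])
```

In this toy case the first proposal `a = 1.0` is rejected with the counterexample `0.1`, after which the learner moves to `a = 8.0`, which the grid check accepts. Each counterexample rules out the candidate that produced it, so the loop terminates for a finite candidate list.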

Optimization and Control

Data Generation Method for Learning a Low-dimensional Safe Region in Safe Reinforcement Learning

94 - Zhehua Zhou, Ozgur S. Oguz, Yi Ren 2021

Safe reinforcement learning aims to learn a control policy while ensuring that neither the system nor the environment gets damaged during the learning process. For implementing safe reinforcement learning on highly nonlinear and high-dimensional dynamical systems, one possible approach is to find a low-dimensional safe region via data-driven feature extraction methods, which provides safety estimates to the learning algorithm. As the reliability of the learned safety estimates is data-dependent, we investigate in this work how different training data will affect the safe reinforcement learning approach. By balancing between the learning performance and the risk of being unsafe, a data generation method that combines two sampling methods is proposed to generate representative training data. The performance of the method is demonstrated with a three-link inverted pendulum example.

Systems and Control, Machine Learning, Robotics

Bayesian Learning-Based Adaptive Control for Safety Critical Systems

70 - David D. Fan, Jennifer Nguyen, Rohan Thakker 2019

Deep learning has enjoyed much recent success, and applying state-of-the-art model learning methods to controls is an exciting prospect. However, there is a strong reluctance to use these methods on safety-critical systems, which have constraints on safety, stability, and real-time performance. We propose a framework which satisfies these constraints while allowing the use of deep neural networks for learning model uncertainties. Central to our method is the use of Bayesian model learning, which provides an avenue for maintaining appropriate degrees of caution in the face of the unknown. In the proposed approach, we develop an adaptive control framework leveraging the theory of stochastic CLFs (Control Lyapunov Functions) and stochastic CBFs (Control Barrier Functions) along with tractable Bayesian model learning via Gaussian Processes or Bayesian neural networks. Under reasonable assumptions, we guarantee stability and safety while adapting to unknown dynamics with probability 1. We demonstrate this architecture for high-speed terrestrial mobility targeting potential applications in safety-critical high-speed Mars rover missions.
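
To make the role of a control barrier function concrete, here is a minimal sketch of a safety filter with an uncertainty margin. It is a deterministic simplification, not the stochastic CLF/CBF framework of the paper: the scalar system `x' = u + d`, the barrier `h(x) = x_max - x`, and all parameter values are assumptions, and the learned model is reduced to a disturbance mean and standard deviation.

```python
def cbf_filter(x, u_nominal, x_max=1.0, alpha=2.0, d_mean=0.0, d_std=0.0, k=2.0):
    """Clip u so that h(x) = x_max - x satisfies dh/dt >= -alpha * h
    for x' = u + d, using d_mean + k * d_std as a cautious disturbance bound."""
    u_bound = alpha * (x_max - x) - (d_mean + k * d_std)
    return min(u_nominal, u_bound)


def simulate(steps=500, dt=0.01):
    x = 0.0
    for _ in range(steps):
        # The nominal controller pushes hard toward the unsafe side
        u = cbf_filter(x, u_nominal=5.0, d_mean=0.2, d_std=0.1)
        # Hypothetical true disturbance 0.3, inside the assumed bound 0.4
        x += dt * (u + 0.3)
    return x


x_final = simulate()
```

Because the true disturbance stays below the assumed bound, the state settles just under `x_max` instead of crossing it. A larger `d_std` makes the filter more conservative, which is the qualitative effect of keeping caution proportional to model uncertainty.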

Systems and Control, Machine Learning, Robotics

Recurrent Neural Network Controllers Synthesis with Stability Guarantees for Partially Observed Systems

188 - Fangda Gu, He Yin, Laurent El Ghaoui 2021

Neural network controllers have become popular in control tasks thanks to their flexibility and expressivity. Stability is a crucial property for safety-critical dynamical systems, while stabilization of partially observed systems, in many cases, requires controllers to retain and process long-term memories of the past. We consider the important class of recurrent neural networks (RNN) as dynamic controllers for nonlinear uncertain partially-observed systems, and derive convex stability conditions based on integral quadratic constraints, S-lemma and sequential convexification. To ensure stability during the learning and control process, we propose a projected policy gradient method that iteratively enforces the stability conditions in the reparametrized space taking advantage of mild additional information on system dynamics. Numerical experiments show that our method learns stabilizing controllers while using fewer samples and achieving higher final performance compared with policy gradient.
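
The projection step can be illustrated on a far simpler problem than the RNN setting of the paper. The sketch below assumes a scalar plant `x+ = a*x + b*u` with static feedback `u = -k*x`, a hypothetical effort cost `J(k) = k**2` with an exact gradient (no policy gradient estimation), and a stability set given by `|a - b*k| <= rho`.

```python
def project_gain(k, a, b, rho):
    """Clip k to the interval where |a - b*k| <= rho (assumes b > 0)."""
    k_low, k_high = (a - rho) / b, (a + rho) / b
    return min(max(k, k_low), k_high)


def projected_descent(k, a=1.5, b=1.0, rho=0.9, lr=0.1, iters=50):
    for _ in range(iters):
        grad = 2.0 * k  # gradient of the hypothetical cost J(k) = k**2
        # Gradient step followed by projection back onto the stabilizing set
        k = project_gain(k - lr * grad, a, b, rho)
    return k


k_final = projected_descent(k=1.5)
```

Without the projection the gain would shrink toward zero, where the open-loop pole `a = 1.5` makes the system unstable. With it, every iterate keeps the closed-loop pole inside the prescribed radius, so the controller is stabilizing throughout learning and not only at convergence.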

Systems and Control, Artificial Intelligence, Robotics

SASL: Saliency-Adaptive Sparsity Learning for Neural Network Acceleration

247 - Jun Shi, Jianfeng Xu, Kazuyuki Tasaka 2020

Accelerating the inference speed of CNNs is critical to their deployment in real-world applications. Among all the pruning approaches, those implementing a sparsity learning framework have been shown to be effective, as they learn and prune the models in an end-to-end data-driven manner. However, these works impose the same sparsity regularization on all filters indiscriminately, which can hardly result in an optimal structure-sparse network. In this paper, we propose a Saliency-Adaptive Sparsity Learning (SASL) approach for further optimization. A novel and effective estimation of each filter, i.e., saliency, is designed, which is measured from two aspects: the importance for the prediction performance and the consumed computational resources. During sparsity learning, the regularization strength is adjusted according to the saliency, so our optimized format can better preserve the prediction performance while zeroing out more computation-heavy filters. The calculation for saliency introduces minimum overhead to the training process, which means our SASL is very efficient. During the pruning phase, in order to optimize the proposed data-dependent criterion, a hard sample mining strategy is utilized, which shows higher effectiveness and efficiency. Extensive experiments demonstrate the superior performance of our method. Notably, on the ILSVRC-2012 dataset, our approach can reduce the FLOPs of ResNet-50 by 49.7% with negligible 0.39% top-1 and 0.05% top-5 accuracy degradation.
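
A per-filter regularization strength driven by saliency can be sketched as follows. The abstract does not give the formulas, so everything here is an assumption for illustration: saliency is taken as importance divided by compute cost, the penalty as `base_lambda * (1 - saliency / max_saliency)`, and the update as one soft-thresholding step on hypothetical per-filter scale factors.

```python
def soft_threshold(w, t):
    # Proximal operator of the L1 penalty t * |w|
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def adaptive_sparsity_step(scales, importance, cost, base_lambda=0.1):
    """Shrink per-filter scale factors with a penalty that is weaker for
    salient filters (hypothetical saliency = importance / compute cost)."""
    saliency = [imp / c for imp, c in zip(importance, cost)]
    s_max = max(saliency)
    lambdas = [base_lambda * (1.0 - s / s_max) for s in saliency]
    return [soft_threshold(w, lam) for w, lam in zip(scales, lambdas)]


# Three filters: the second is unimportant and expensive, so it is zeroed out
new_scales = adaptive_sparsity_step(
    scales=[0.5, 0.05, 0.05],
    importance=[1.0, 0.1, 1.0],
    cost=[1.0, 2.0, 1.0],
)
```

The second and third filters start with the same small scale, but only the low-saliency one is pruned. A uniform penalty would treat them identically, which is the limitation the abstract attributes to earlier sparsity learning methods.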

Computer Vision and Pattern Recognition, Machine Learning