ترغب بنشر مسار تعليمي؟ اضغط هنا

Verification in the Loop: Correct-by-Construction Control Learning with Reach-avoid Guarantees

188   0   0.0 ( 0 )
 نشر من قبل Yixuan Wang
 تاريخ النشر 2021
والبحث باللغة English




اسأل ChatGPT حول البحث

In the current control design of safety-critical autonomous systems, formal verification techniques are typically applied after the controller is designed to evaluate whether the required properties (e.g., safety) are satisfied. However, due to the increasing system complexity and the fundamental hardness of designing a controller with formal guarantees, such an open-loop process of design-then-verify often results in many iterations and fails to provide the necessary guarantees. In this paper, we propose a correct-by-construction control learning framework that integrates the verification into the control design process in a closed-loop manner, i.e., design-while-verify. Specifically, we leverage the verification results (computed reachable set of the system state) to construct feedback metrics for control learning, which measure how likely the current design of control parameters can meet the required reach-avoid property for safety and goal-reaching. We formulate an optimization problem based on such metrics for tuning the controller parameters, and develop an approximated gradient descent algorithm with a difference method to solve the optimization problem and learn the controller. The learned controller is formally guaranteed to meet the required reach-avoid property. By treating verifiability as a first-class objective and effectively leveraging the verification results during the control learning process, our approach can significantly improve the chance of finding a control design with formal property guarantees. This is demonstrated via a set of experiments on both linear and non-linear systems that use model-based or neural network based controllers.



قيم البحث

اقرأ أيضاً

55 - Zhe Xu 2020
Autonomous systems embedded with machine learning modules often rely on deep neural networks for classifying different objects of interest in the environment or different actions or strategies to take for the system. Due to the non-linearity and high -dimensionality of deep neural networks, the interpretability of the autonomous systems is compromised. Besides, the machine learning methods in autonomous systems are mostly data-intensive and lack commonsense knowledge and reasoning that are natural to humans. In this paper, we propose the framework of temporal logic classifier-in-the-loop systems. The temporal logic classifiers can output different actions to take for an autonomous system based on the environment, such that the behavior of the autonomous system can satisfy a given temporal logic specification. Our approach is robust and provably-correct, as we can prove that the behavior of the autonomous system can satisfy a given temporal logic specification in the presence of (bounded) disturbances.
The combination of machine learning with control offers many opportunities, in particular for robust control. However, due to strong safety and reliability requirements in many real-world applications, providing rigorous statistical and control-theor etic guarantees is of utmost importance, yet difficult to achieve for learning-based control schemes. We present a general framework for learning-enhanced robust control that allows for systematic integration of prior engineering knowledge, is fully compatible with modern robust control and still comes with rigorous and practically meaningful guarantees. Building on the established Linear Fractional Representation and Integral Quadratic Constraints framework, we integrate Gaussian Process Regression as a learning component and state-of-the-art robust controller synthesis. In a concrete robust control example, our approach is demonstrated to yield improved performance with more data, while guarantees are maintained throughout.
The probabilistic reachability problems of nondeterministic systems are studied. Based on the existing studies, the definition of probabilistic reachable sets is generalized by taking into account time-varying target set and obstacle. A numerical met hod is proposed to compute probabilistic reachable sets. First, a scalar function in the state space is constructed by backward recursion and grid interpolation, and then the probability reachable set is represented as a nonzero level set of this scalar function. In addition, based on the constructed scalar function, the optimal control policy can be designed. At the end of this paper, some examples are taken to illustrate the validity and accuracy of the proposed method.
We propose a new framework to solve online optimization and learning problems in unknown and uncertain dynamical environments. This framework enables us to simultaneously learn the uncertain dynamical environment while making online decisions in a qu antifiably robust manner. The main technical approach relies on the theory of distributional robust optimization that leverages adaptive probabilistic ambiguity sets. However, as defined, the ambiguity set usually leads to online intractable problems, and the first part of our work is directed to find reformulations in the form of online convex problems for two sub-classes of objective functions. To solve the resulting problems in the proposed framework, we further introduce an online version of the Nesterov accelerated-gradient algorithm. We determine how the proposed solution system achieves a probabilistic regret bound under certain conditions. Two applications illustrate the applicability of the proposed framework.
We propose Kernel Predictive Control (KPC), a learning-based predictive control strategy that enjoys deterministic guarantees of safety. Noise-corrupted samples of the unknown system dynamics are used to learn several models through the formalism of non-parametric kernel regression. By treating each prediction step individually, we dispense with the need of propagating sets through highly non-linear maps, a procedure that often involves multiple conservative approximation steps. Finite-sample error bounds are then used to enforce state-feasibility by employing an efficient robust formulation. We then present a relaxation strategy that exploits on-line data to weaken the optimization problem constraints while preserving safety. Two numerical examples are provided to illustrate the applicability of the proposed control method.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا