
Control Barriers in Bayesian Learning of System Dynamics

Published by Vikas Dhiman
Publication date: 2020
Language: English





This paper focuses on learning a model of system dynamics online while satisfying safety constraints. Our objective is to avoid offline system identification or hand-specified models and allow a system to safely and autonomously estimate and adapt its own model during operation. Given streaming observations of the system state, we use Bayesian learning to obtain a distribution over the system dynamics. Specifically, we propose a new matrix variate Gaussian process (MVGP) regression approach with an efficient covariance factorization to learn the drift and input gain terms of a nonlinear control-affine system. The MVGP distribution is then used to optimize the system behavior and ensure safety with high probability by specifying control Lyapunov function (CLF) and control barrier function (CBF) chance constraints. We show that a safe control policy can be synthesized for systems with arbitrary relative degree and probabilistic CLF-CBF constraints by solving a second-order cone program (SOCP). Finally, we extend our design to a self-triggering formulation that adaptively determines the time at which a new control input needs to be applied in order to guarantee safety.
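To illustrate why a chance constraint on a safety condition can take second-order cone form, here is a minimal sketch, not the paper's implementation. It assumes the safety condition c(u) is Gaussian with a mean that is affine in the control u and a standard deviation equal to the norm of a linear map of [1, u]. Under that assumption, requiring P(c(u) >= 0) >= p is equivalent to mean - z_p * std >= 0, where z_p is the standard normal quantile. The function names, the coefficient layout, and the use of a Gaussian quantile (rather than whatever bound the paper adopts) are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def chance_constraint_margin(u, mean_coeffs, cov_factor, p):
    """Return E[c(u)] - z_p * Std[c(u)] for a Gaussian c(u) that is affine in u.

    mean_coeffs: (a0, a1, ...) so that E[c(u)] = a0 + a1*u1 + ...   (hypothetical layout)
    cov_factor:  rows of a matrix L so that Var[c(u)] = ||L [1, u]||^2 (hypothetical layout)
    """
    z = [1.0] + list(u)  # augmented control vector [1, u]
    mean = sum(a * zi for a, zi in zip(mean_coeffs, z))
    std = math.sqrt(
        sum(sum(l * zi for l, zi in zip(row, z)) ** 2 for row in cov_factor)
    )
    quantile = NormalDist().inv_cdf(p)  # z_p for the standard normal
    return mean - quantile * std


def is_probably_safe(u, mean_coeffs, cov_factor, p=0.95):
    """True when the constraint mean - z_p * std >= 0 holds at control u."""
    return chance_constraint_margin(u, mean_coeffs, cov_factor, p) >= 0.0


# Illustrative numbers only: E[c(u)] = -1 + 2u, Std[c(u)] = sqrt(0.01 + 0.04 u^2)
MEAN_COEFFS = (-1.0, 2.0)
COV_FACTOR = [[0.1, 0.0], [0.0, 0.2]]
```

Because the margin is an affine term minus a scaled Euclidean norm of an affine term, the set of controls with a non-negative margin is a second-order cone constraint in u whenever z_p >= 0, which is what allows a convex solver to handle it.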




Read also

This paper proposes a sparse Bayesian treatment of deep neural networks (DNNs) for system identification. Although DNNs show impressive approximation ability in various fields, several challenges still exist for system identification problems. First, DNNs are known to be so complex that they can easily overfit the training data. Second, the selection of the input regressors for system identification is nontrivial. Third, uncertainty quantification of the model parameters and predictions is necessary. The proposed Bayesian approach offers a principled way to alleviate the above challenges by marginal likelihood/model evidence approximation and structured group sparsity-inducing priors construction. The identification algorithm is derived as an iterative regularized optimization procedure that can be solved as efficiently as training typical DNNs. Furthermore, a practical calculation approach based on the Monte-Carlo integration method is derived to quantify the uncertainty of the parameters and predictions. The effectiveness of the proposed Bayesian approach is demonstrated on several linear and nonlinear system identification benchmarks, achieving good and competitive simulation accuracy.
We apply the meta reinforcement learning framework to optimize an integrated and adaptive guidance and flight control system for an air-to-air missile, implementing the system as a deep neural network (the policy). The policy maps observations directly to commanded rates of change for the missile's control surface deflections, with the observations derived with minimal processing from the computationally stabilized line of sight unit vector measured by a strapdown seeker, estimated rotational velocity from rate gyros, and control surface deflection angles. The system induces intercept trajectories against a maneuvering target that satisfy control constraints on fin deflection angles, and path constraints on look angle and load. We test the optimized system in a six degrees-of-freedom simulator that includes a non-linear radome model and a strapdown seeker model. Through extensive simulation, we demonstrate that the system can adapt to a large flight envelope and off-nominal flight conditions that include perturbation of aerodynamic coefficient parameters and center of pressure locations. Moreover, we find that the system is robust to the parasitic attitude loop induced by radome refraction, imperfect seeker stabilization, and sensor scale factor errors. Finally, we compare our system's performance to two benchmarks: a proportional navigation guidance system benchmark in a simplified 3-DOF environment, which we take as an upper bound on performance attainable with separate guidance and flight control systems, and a longitudinal model of proportional navigation coupled with a three-loop autopilot. We find that our system moderately outperforms the former, and outperforms the latter by a large margin.
Many state estimation algorithms must be tuned: given the state space process and observation models, the process and observation noise parameters must be chosen. Conventional tuning approaches rely on heuristic hand-tuning or gradient-based optimization techniques to minimize a performance cost function. However, the relationship between tuned noise values and estimator performance is highly nonlinear and stochastic. Therefore, the tuning solutions can easily get trapped in local minima, which can lead to poor choices of noise parameters and suboptimal estimator performance. This paper describes how Bayesian Optimization (BO) can overcome these issues. BO poses optimization as a Bayesian search problem for a stochastic "black box" cost function, where the goal is to search the solution space to maximize the probability of improving the current best solution. As such, BO offers a principled approach to optimization-based estimator tuning in the presence of local minima and performance stochasticity. While extended Kalman filters (EKFs) are the main focus of this work, BO can be similarly used to tune other related state space filters. The method presented here uses performance metrics derived from normalized innovation squared (NIS) filter residuals obtained via sensor data, which renders knowledge of ground-truth states unnecessary. The robustness, accuracy, and reliability of BO-based tuning are illustrated on practical nonlinear state estimation problems in closed-loop aero-robotic control.
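As a rough sketch of the kind of NIS-based cost such a tuner could minimize, the code below runs a scalar Kalman filter on a random-walk model and averages the normalized innovation squared, innovation^2 / S, over the measurements. For a consistent filter with a one-dimensional measurement, the average NIS should be close to 1, so the distance from 1 serves as a tuning cost that needs no ground-truth states. The random-walk model, the function names, and the specific cost |mean NIS - 1| are assumptions for illustration and are not taken from the paper.

```python
def mean_nis(measurements, q, r, x0=0.0, p0=1.0):
    """Average normalized innovation squared of a scalar random-walk Kalman filter.

    q: assumed process noise variance, r: assumed measurement noise variance.
    """
    x, p = x0, p0
    total = 0.0
    for z in measurements:
        p_pred = p + q              # predict: state unchanged, variance grows by q
        innovation = z - x          # measurement residual
        s = p_pred + r              # innovation variance
        total += innovation ** 2 / s
        k = p_pred / s              # Kalman gain
        x = x + k * innovation      # update state estimate
        p = (1.0 - k) * p_pred      # update estimate variance
    return total / len(measurements)


def nis_cost(measurements, q, r):
    """Hypothetical tuning cost: distance of the mean NIS from its ideal value 1."""
    return abs(mean_nis(measurements, q, r) - 1.0)
```

A black-box optimizer would then search over (q, r) to drive `nis_cost` toward zero using recorded sensor data alone.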
This paper entails application of the energy shaping methodology to control a flexible, elastic Cosserat rod model. Recent interest in such continuum models stems from applications in soft robotics, and from the growing recognition of the role of mechanics and embodiment in biological control strategies: octopuses are often regarded as iconic examples of this interplay. Here, the dynamics of the Cosserat rod, modeling a single octopus arm, are treated as a Hamiltonian system and the internal muscle actuators are modeled as distributed forces and couples. The proposed energy shaping control design procedure involves two steps: (1) a potential energy is designed such that its minimizer is the desired equilibrium configuration; (2) an energy shaping control law is implemented to reach the desired equilibrium. By interpreting the controlled Hamiltonian as a Lyapunov function, asymptotic stability of the equilibrium configuration is deduced. The energy shaping control law is shown to require only the deformations of the equilibrium configuration. A forward-backward algorithm is proposed to compute these deformations in an online iterative manner. The overall control design methodology is implemented and demonstrated in a dynamic simulation environment. Results of several bio-inspired numerical experiments involving the control of octopus arms are reported.
We consider a discrete-time linear-quadratic Gaussian control problem in which we minimize a weighted sum of the directed information from the state of the system to the control input and the control cost. The optimal control and sensing policies can be synthesized jointly by solving a semidefinite programming problem. However, the existing solutions typically scale cubically with the horizon length. We leverage the structure in the problem to develop a distributed algorithm that decomposes the synthesis problem into a set of smaller problems, one for each time step. We prove that the algorithm runs in time linear in the horizon length. As an application of the algorithm, we consider a path-planning problem in a state space with obstacles in the presence of stochastic disturbances. The algorithm computes a locally optimal solution that jointly minimizes the perception and control cost while ensuring the safety of the path. The numerical examples show that the algorithm can scale to horizon lengths in the thousands and compute locally optimal solutions.