ترغب بنشر مسار تعليمي؟ اضغط هنا

With a growing interest in data-driven control techniques, Model Predictive Control (MPC) provides an opportunity to exploit the surplus of data reliably, particularly while taking safety and stability into account. In many real-world and industrial applications, it is typical to have an existing control strategy, for instance, execution from a human operator. The objective of this work is to improve upon this unknown, safe but suboptimal policy by learning a new controller that retains safety and stability. Learning how to be safe is achieved directly from data and from a knowledge of the system constraints. The proposed algorithm alternatively learns the terminal cost and updates the MPC parameters according to a stability metric. The terminal cost is constructed as a Lyapunov function neural network with the aim of recovering or extending the stable region of the initial demonstrator using a short prediction horizon. Theorems that characterize the stability and performance of the learned MPC in the bearing of model uncertainties and sub-optimality due to function approximation are presented. The efficacy of the proposed algorithm is demonstrated on non-linear continuous control tasks with soft constraints. The proposed approach can improve upon the initial demonstrator also in practice and achieve better stability than popular reinforcement learning baselines.
Control applications present hard operational constraints. A violation of these can result in unsafe behavior. This paper introduces Safe Interactive Model Based Learning (SiMBL), a framework to refine an existing controller and a system model while operating on the real environment. SiMBL is composed of the following trainable components: a Lyapunov function, which determines a safe set; a safe control policy; and a Bayesian RNN forward model. A min-max control framework, based on alternate minimisation and backpropagation through the forward model, is used for the offline computation of the controller and the safe set. Safety is formally verified a-posteriori with a probabilistic method that utilizes the Noise Contrastive Priors (NPC) idea to build a Bayesian RNN forward model with an additive state uncertainty estimate which is large outside the training data distribution. Iterative refinement of the model and the safe set is achieved thanks to a novel loss that conditions the uncertainty estimates of the new model to be close to the current one. The learned safe set and model can also be used for safe exploration, i.e., to collect data within the safe invariant set, for which a simple one-step MPC is proposed. The single components are tested on the simulation of an inverted pendulum with limited torque and stability region, showing that iteratively adding more data can improve the model, the controller and the size of the safe region.
The use of recurrent neural networks to represent the dynamics of unstable systems is difficult due to the need to properly initialize their internal states, which in most of the cases do not have any physical meaning, consequent to the non-smoothnes s of the optimization problem. For this reason, in this paper focus is placed on mechanical systems characterized by a number of degrees of freedom, each one represented by two states, namely position and velocity. For these systems, a new recurrent neural network is proposed: Tustin-Net. Inspired by second-order dynamics, the network hidden states can be straightforwardly estimated, as their differential relationships with the measured states are hardcoded in the forward pass. The proposed structure is used to model a double inverted pendulum and for model-based Reinforcement Learning, where an adaptive Model Predictive Controller scheme using the Unscented Kalman Filter is proposed to deal with parameter changes in the system.
This paper proposes the use of spectral element methods citep{canuto_spectral_1988} for fast and accurate training of Neural Ordinary Differential Equations (ODE-Nets; citealp{Chen2018NeuralOD}) for system identification. This is achieved by expressi ng their dynamics as a truncated series of Legendre polynomials. The series coefficients, as well as the network weights, are computed by minimizing the weighted sum of the loss function and the violation of the ODE-Net dynamics. The problem is solved by coordinate descent that alternately minimizes, with respect to the coefficients and the weights, two unconstrained sub-problems using standard backpropagation and gradient methods. The resulting optimization scheme is fully time-parallel and results in a low memory footprint. Experimental comparison to standard methods, such as backpropagation through explicit solvers and the adjoint technique citep{Chen2018NeuralOD}, on training surrogate models of small and medium-scale dynamical systems shows that it is at least one order of magnitude faster at reaching a comparable value of the loss function. The corresponding testing MSE is one order of magnitude smaller as well, suggesting generalization capabilities increase.
This paper introduces Non-Autonomous Input-Output Stable Network(NAIS-Net), a very deep architecture where each stacked processing block is derived from a time-invariant non-autonomous dynamical system. Non-autonomy is implemented by skip connections from the block input to each of the unrolled processing stages and allows stability to be enforced so that blocks can be unrolled adaptively to a pattern-dependent processing depth. NAIS-Net induces non-trivial, Lipschitz input-output maps, even for an infinite unroll length. We prove that the network is globally asymptotically stable so that for every initial condition there is exactly one input-dependent equilibrium assuming $tanh$ units, and incrementally stable for ReL units. An efficient implementation that enforces the stability under derived conditions for both fully-connected and convolutional layers is also presented. Experimental results show how NAIS-Net exhibits stability in practice, yielding a significant reduction in generalization gap compared to ResNets.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا