Learning to Control an Unstable System with One Minute of Data: Leveraging Gaussian Process Differentiation in Predictive Control

158 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ivan Dario Jimenez Rodriguez

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ivan D. Jimenez Rodriguez - Ugo Rosolia - Aaron D. Ames

علم الروبوتات التعلم الآلي أنظمة وتحكم

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We present a straightforward and efficient way to control unstable robotic systems using an estimated dynamics model. Specifically, we show how to exploit the differentiability of Gaussian Processes to create a state-dependent linearized approximation of the true continuous dynamics that can be integrated with model predictive control. Our approach is compatible with most Gaussian process approaches for system identification, and can learn an accurate model using modest amounts of training data. We validate our approach by learning the dynamics of an unstable system such as a segway with a 7-D state space and 2-D input space (using only one minute of data), and we show that the resulting controller is robust to unmodelled dynamics and disturbances, while state-of-the-art control methods based on nominal models can fail under small perturbations. Code is open sourced at https://github.com/learning-and-control/core .

قيم البحث

277 - Wenjun Liu , Chang Liu , Guang Chen 2021

This paper proposes a novel framework for addressing the challenge of autonomous overtaking and obstacle avoidance, which incorporates the overtaking path planning into Gaussian Process-based model predictive control (GPMPC). Compared with the conven tional control strategies, this approach has two main advantages. Firstly, combining Gaussian Process (GP) regression with a nominal model allows for learning from model mismatch and unmodeled dynamics, which enhances a simple model and delivers significantly better results. Due to the approximation for propagating uncertainties, we can furthermore satisfy the constraints and thereby safety of the vehicle is ensured. Secondly, we convert the geometric relationship between the ego vehicle and other obstacle vehicles into the constraints. Without relying on a higherlevel path planner, this approach substantially reduces the computational burden. In addition, we transform the state constraints under the model predictive control (MPC) framework into a soft constraint and incorporate it as relaxed barrier function into the cost function, which makes the optimizer more efficient. Simulation results reveal the usefulness of the proposed approach.

علم الروبوتات أنظمة وتحكم أنظمة وتحكم

Optimal Cost Design for Model Predictive Control

72 - Avik Jain , Lawrence Chan , Daniel S. Brown 2021

Many robotics domains use some form of nonconvex model predictive control (MPC) for planning, which sets a reduced time horizon, performs trajectory optimization, and replans at every step. The actual task typically requires a much longer horizon tha n is computationally tractable, and is specified via a cost function that cumulates over that full horizon. For instance, an autonomous car may have a cost function that makes a desired trade-off between efficiency, safety, and obeying traffic laws. In this work, we challenge the common assumption that the cost we optimize using MPC should be the same as the ground truth cost for the task (plus a terminal cost). MPC solvers can suffer from short planning horizons, local optima, incorrect dynamics models, and, importantly, fail to account for future replanning ability. Thus, we propose that in many tasks it could be beneficial to purposefully choose a different cost function for MPC to optimize: one that results in the MPC rollout having low ground truth cost, rather than the MPC planned trajectory. We formalize this as an optimal cost design problem, and propose a zeroth-order optimization-based approach that enables us to design optimal costs for an MPC planning robot in continuous MDPs. We test our approach in an autonomous driving domain where we find costs different from the ground truth that implicitly compensate for replanning, short horizon, incorrect dynamics models, and local minima issues. As an example, the learned cost incentivizes MPC to delay its decision until later, implicitly accounting for the fact that it will get more information in the future and be able to make a better decision. Code and videos available at https://sites.google.com/berkeley.edu/ocd-mpc/.

علم الروبوتات التعلم الآلي أنظمة وتحكم

KNODE-MPC: A Knowledge-based Data-driven Predictive Control Framework for Aerial Robots

127 - Kong Yao Chee , Tom Z. Jiahao , M. Ani Hsieh 2021

In this work, we consider the problem of deriving and incorporating accurate dynamic models for model predictive control (MPC) with an application to quadrotor control. MPC relies on precise dynamic models to achieve the desired closed-loop performan ce. However, the presence of uncertainties in complex systems and the environments they operate in poses a challenge in obtaining sufficiently accurate representations of the system dynamics. In this work, we make use of a deep learning tool, knowledge-based neural ordinary differential equations (KNODE), to augment a model obtained from first principles. The resulting hybrid model encompasses both a nominal first-principle model and a neural network learnt from simulated or real-world experimental data. Using a quadrotor, we benchmark our hybrid model against a state-of-the-art Gaussian Process (GP) model and show that the hybrid model provides more accurate predictions of the quadrotor dynamics and is able to generalize beyond the training data. To improve closed-loop performance, the hybrid model is integrated into a novel MPC framework, known as KNODE-MPC. Results show that the integrated framework achieves 73% improvement in simulations and more than 14% in physical experiments, in terms of trajectory tracking performance.

علم الروبوتات التعلم الآلي أنظمة وتحكم

Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning

88 - Lukas Brunke , Melissa Greeff , Adam W. Hall 2021

The last half-decade has seen a steep rise in the number of contributions on safe learning methods for real-world robotic deployments from both the control and reinforcement learning communities. This article provides a concise but holistic review of the recent advances made in using machine learning to achieve safe decision making under uncertainties, with a focus on unifying the language and frameworks used in control theory and reinforcement learning research. Our review includes: learning-based control approaches that safely improve performance by learning the uncertain dynamics, reinforcement learning approaches that encourage safety or robustness, and methods that can formally certify the safety of a learned control policy. As data- and learning-based robot control methods continue to gain traction, researchers must understand when and how to best leverage them in real-world scenarios where safety is imperative, such as when operating in close proximity to humans. We highlight some of the open challenges that will drive the field of robot learning in the coming years, and emphasize the need for realistic physics-based benchmarks to facilitate fair comparisons between control and reinforcement learning approaches.

علم الروبوتات التعلم الآلي أنظمة وتحكم

Actor-Critic Reinforcement Learning for Control with Stability Guarantee

167 - Minghao Han , Lixian Zhang , Jun Wang 2020

Reinforcement Learning (RL) and its integration with deep learning have achieved impressive performance in various robotic control tasks, ranging from motion planning and navigation to end-to-end visual manipulation. However, stability is not guarant eed in model-free RL by solely using data. From a control-theoretic perspective, stability is the most important property for any control system, since it is closely related to safety, robustness, and reliability of robotic systems. In this paper, we propose an actor-critic RL framework for control which can guarantee closed-loop stability by employing the classic Lyapunovs method in control theory. First of all, a data-based stability theorem is proposed for stochastic nonlinear systems modeled by Markov decision process. Then we show that the stability condition could be exploited as the critic in the actor-critic RL to learn a controller/policy. At last, the effectiveness of our approach is evaluated on several well-known 3-dimensional robot control tasks and a synthetic biology gene network tracking task in three different popular physics simulation platforms. As an empirical evaluation on the advantage of stability, we show that the learned policies can enable the systems to recover to the equilibrium or way-points when interfered by uncertainties such as system parametric variations and external disturbances to a certain extent.

علم الروبوتات التعلم الآلي أنظمة وتحكم