Tensor-Free Second-Order Differential Dynamic Programming

Published by John Nganga

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: John N. Nganga - Patrick M. Wensing

Robotics

Abstract

This paper presents a method to reduce the computational complexity of including second-order dynamics sensitivity information into the Differential Dynamic Programming (DDP) trajectory optimization algorithm. A tensor-free approach to DDP is developed where all the necessary derivatives are computed with the same complexity as in the iterative Linear Quadratic Regulator (iLQR). Compared to linearized models used in iLQR, DDP more accurately represents the dynamics locally, but it is not often used since the second-order derivatives of the dynamics are tensorial and expensive to compute. This work shows how to avoid the need for computing the derivative tensor by instead leveraging reverse-mode accumulation of derivative information to compute a key vector-tensor product directly. We benchmark this approach for trajectory optimization with multi-link manipulators and show that the benefits of DDP can often be included without sacrificing evaluation time, and can be done in fewer iterations than iLQR.
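The vector-tensor product mentioned in the abstract can be illustrated with a small JAX sketch. It is not the authors' implementation: the pendulum model, the function names, and the choice of forward-over-reverse differentiation are all assumptions made for illustration. The idea shown is that contracting a vector `lam` with the second-derivative tensor of the dynamics gives the same matrix as the Hessian of the scalar function `lam . f(x, u)`, so the full tensor never has to be formed.

```python
import jax
import jax.numpy as jnp


def dynamics(x, u, dt=0.01):
    """Euler-discretized pendulum; a hypothetical model, not taken from the paper."""
    theta, omega = x[0], x[1]
    alpha = -9.81 * jnp.sin(theta) + u[0]
    return jnp.array([theta + dt * omega, omega + dt * alpha])


def contracted_hessian(lam, x, u):
    """Compute lam . f_xx as the Hessian of the scalar lam^T f(x, u).

    jax.grad runs reverse mode on the scalar; jax.jacfwd differentiates
    that gradient once more. No (n, n, n) tensor is built.
    """
    scalar = lambda x_: jnp.dot(lam, dynamics(x_, u))
    return jax.jacfwd(jax.grad(scalar))(x)


def contracted_hessian_via_tensor(lam, x, u):
    """Reference version: build the full tensor f_xx, then contract it."""
    tensor = jax.jacfwd(jax.jacrev(dynamics, argnums=0), argnums=0)(x, u)
    return jnp.einsum("i,ijk->jk", lam, tensor)  # tensor[i, j, k] = d2 f_i / dx_j dx_k


x = jnp.array([0.3, -0.1])   # state: angle, angular velocity
u = jnp.array([0.5])         # control torque
lam = jnp.array([1.0, 2.0])  # stands in for the value-function gradient

H_direct = contracted_hessian(lam, x, u)
H_tensor = contracted_hessian_via_tensor(lam, x, u)
print(H_direct)
print(H_tensor)
```

Both routes return the same 2x2 matrix. For a state of dimension n, the first route works with vectors and one n-by-n result, while the second stores an n-by-n-by-n intermediate.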

281 - Traiko Dinev, Wolfgang Merkt, Vladimir Ivan, 2020

Optimal control is a popular approach to synthesize highly dynamic motion. Commonly, $L_2$ regularization is used on the control inputs in order to minimize energy used and to ensure smoothness of the control inputs. However, for some systems, such as satellites, the control needs to be applied in sparse bursts due to how the propulsion system operates. In this paper, we study approaches to induce sparsity in optimal control solutions -- namely via smooth $L_1$ and Huber regularization penalties. We apply these loss terms to state-of-the-art DDP-based solvers to create a family of sparsity-inducing optimal control methods. We analyze and compare the effect of the different losses on inducing sparsity, their numerical conditioning, their impact on convergence, and discuss hyperparameter settings. We demonstrate our method in simulation and hardware experiments on canonical dynamics systems, control of satellites, and the NASA Valkyrie humanoid robot. We provide an implementation of our method and all examples for reproducibility on GitHub.
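The control penalties named in this abstract can be sketched in JAX as follows. The exact forms used by the authors are not given here, so the definitions below are assumptions: a common smooth approximation of the $L_1$ norm and the standard Huber loss, with hypothetical parameter names `alpha` and `delta`.

```python
import jax.numpy as jnp


def l2_penalty(u):
    """Quadratic control cost: smooth everywhere, does not favor exact zeros."""
    return 0.5 * jnp.sum(u ** 2)


def smooth_l1_penalty(u, alpha=1e-2):
    """Assumed smooth L1 form: sqrt(u^2 + alpha^2) - alpha, close to |u| for |u| >> alpha."""
    return jnp.sum(jnp.sqrt(u ** 2 + alpha ** 2) - alpha)


def huber_penalty(u, delta=1.0):
    """Standard Huber loss: quadratic for |u| <= delta, linear beyond it."""
    a = jnp.abs(u)
    return jnp.sum(jnp.where(a <= delta, 0.5 * a ** 2, delta * (a - 0.5 * delta)))


u = jnp.array([0.5, 2.0])  # example control vector
print(l2_penalty(u), smooth_l1_penalty(u), huber_penalty(u))
```

Both alternatives grow linearly for large inputs, which is the property usually associated with sparse solutions, while staying differentiable at zero so that derivative-based solvers such as DDP can still be applied.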

Robotics

Differential Dynamic Programming for Multi-Phase Rigid Contact Dynamics

379 - Rohan Budhiraja, Justin Carpentier, Carlos Mastalli, 2019

A common strategy today to generate efficient locomotion movements is to split the problem into two consecutive steps: the first one generates the contact sequence together with the centroidal trajectory, while the second one computes the whole-body trajectory that follows the centroidal pattern. Yet the second step is generally handled by a simple program such as an inverse kinematics solver. In contrast, we propose to compute the whole-body trajectory by using a local optimal control solver, namely Differential Dynamic Programming (DDP). Our method produces more efficient motions, with lower forces and smaller impacts, by exploiting the Angular Momentum (AM). With this aim, we propose an original DDP formulation exploiting the Karush-Kuhn-Tucker constraint of the rigid contact model. We experimentally show the importance of this approach by executing large steps walking on the real HRP-2 robot, and by solving the problem of attitude control under the absence of external forces.

Robotics, Artificial Intelligence, Systems and Control

Trajectory optimization for contact-rich motions using implicit differential dynamic programming

221 - Iordanis Chatzinikolaidis, Zhibin Li, 2021

This paper presents a novel approach using sensitivity analysis for generalizing Differential Dynamic Programming (DDP) to systems characterized by implicit dynamics, such as those modelled via inverse dynamics and variational or implicit integrators. It leads to a more general formulation of DDP, enabling, for example, the use of the faster recursive Newton-Euler inverse dynamics. We leverage the implicit formulation for precise and exact contact modelling in DDP, where we focus on two contributions: (1) contact dynamics at the acceleration level, which enables high-order integration schemes; (2) a formulation using an invertible contact model in the forward pass and a closed-form solution in the backward pass to improve the numerical resolution of contacts. The performance of the proposed framework is validated (1) by comparing implicit versus explicit DDP for the swing-up of a double pendulum, and (2) by planning motions for two tasks using a single-leg model making multi-body contacts with the environment: standing up from the ground, where a priori contact enumeration is challenging, and maintaining balance under an external perturbation.

Robotics, Systems and Control

DDPNOpt: Differential Dynamic Programming Neural Optimizer

68 - Guan-Horng Liu, Tianrong Chen, Evangelos A. Theodorou, 2020

Interpretation of Deep Neural Networks (DNNs) training as an optimal control problem with nonlinear dynamical systems has received considerable attention recently, yet the algorithmic development remains relatively limited. In this work, we make an attempt along this line by reformulating the training procedure from the trajectory optimization perspective. We first show that most widely-used algorithms for training DNNs can be linked to Differential Dynamic Programming (DDP), a celebrated second-order method rooted in Approximate Dynamic Programming. In this vein, we propose a new class of optimizer, DDP Neural Optimizer (DDPNOpt), for training feedforward and convolution networks. DDPNOpt features layer-wise feedback policies which improve convergence and reduce sensitivity to hyper-parameters over existing methods. It outperforms other optimal-control inspired training methods in both convergence and complexity, and is competitive against state-of-the-art first and second order methods. We also observe DDPNOpt has a surprising benefit in preventing gradient vanishing. Our work opens up new avenues for principled algorithmic design built upon optimal control theory.

Machine Learning, Neural and Evolutionary Computing, Optimization and Control

Differential Dynamic Programming for time-delayed systems

57 - David D. Fan, Evangelos A. Theodorou, 2017

Trajectory optimization considers the problem of deciding how to control a dynamical system to move along a trajectory which minimizes some cost function. Differential Dynamic Programming (DDP) is an optimal control method which utilizes a second-order approximation of the problem to find the control. It is fast enough to allow real-time control and has been shown to work well for trajectory optimization in robotic systems. Here we extend classic DDP to systems with multiple time-delays in the state. Being able to find optimal trajectories for time-delayed systems with DDP opens up the possibility to use richer models for system identification and control, including recurrent neural networks with multiple timesteps in the state. We demonstrate the algorithm on a two-tank continuous stirred tank reactor. We also demonstrate the algorithm on a recurrent neural network trained to model an inverted pendulum with position information only.

Systems and Control
