Preference-Based Learning for User-Guided HZD Gait Generation on Bipedal Walking Robots

183 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Maegan Tucker

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Maegan Tucker - Noel Csomay-Shanklin - Wen-Loong Ma

علم الروبوتات التعلم الآلي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This paper presents a framework that leverages both control theory and machine learning to obtain stable and robust bipedal locomotion without the need for manual parameter tuning. Traditionally, gaits are generated through trajectory optimization methods and then realized experimentally -- a process that often requires extensive tuning due to differences between the models and hardware. In this work, the process of gait realization via hybrid zero dynamics (HZD) based optimization is formally combined with preference-based learning to systematically realize dynamically stable walking. Importantly, this learning approach does not require a carefully constructed reward function, but instead utilizes human pairwise preferences. The power of the proposed approach is demonstrated through two experiments on a planar biped AMBER-3M: the first with rigid point-feet, and the second with induced model uncertainty through the addition of springs where the added compliance was not accounted for in the gait generation or in the controller. In both experiments, the framework achieves stable, robust, efficient, and natural walking in fewer than 50 iterations with no reliance on a simulation environment. These results demonstrate a promising step in the unification of control theory and learning.

قيم البحث

108 - Noel Csomay-Shanklin , Maegan Tucker , Min Dai 2021

Experimental demonstration of complex robotic behaviors relies heavily on finding the correct controller gains. This painstaking process is often completed by a domain expert, requiring deep knowledge of the relationship between parameter values and the resulting behavior of the system. Even when such knowledge is possessed, it can take significant effort to navigate the nonintuitive landscape of possible parameter combinations. In this work, we explore the extent to which preference-based learning can be used to optimize controller gains online by repeatedly querying the user for their preferences. This general methodology is applied to two variants of control Lyapunov function based nonlinear controllers framed as quadratic programs, which have nice theoretic properties but are challenging to realize in practice. These controllers are successfully demonstrated both on the planar underactuated biped, AMBER, and on the 3D underactuated biped, Cassie. We experimentally evaluate the performance of the learned controllers and show that the proposed method is repeatably able to learn gains that yield stable and robust locomotion.

علم الروبوتات

Preference-Based Learning for Exoskeleton Gait Optimization

73 - Maegan Tucker , Ellen Novoseller , Claudia Kann 2019

This paper presents a personalized gait optimization framework for lower-body exoskeletons. Rather than optimizing numerical objectives such as the mechanical cost of transport, our approach directly learns from user preferences, e.g., for comfort. B uilding upon work in preference-based interactive learning, we present the CoSpar algorithm. CoSpar prompts the user to give pairwise preferences between trials and suggest improvements; as exoskeleton walking is a non-intuitive behavior, users can provide preferences more easily and reliably than numerical feedback. We show that CoSpar performs competitively in simulation and demonstrate a prototype implementation of CoSpar on a lower-body exoskeleton to optimize human walking trajectory features. In the experiments, CoSpar consistently found user-preferred parameters of the exoskeletons walking gait, which suggests that it is a promising starting point for adapting and personalizing exoskeletons (or other assistive devices) to individual users.

علم الروبوتات

Human Preference-Based Learning for High-dimensional Optimization of Exoskeleton Walking Gaits

157 - Maegan Tucker , Myra Cheng , Ellen Novoseller 2020

Optimizing lower-body exoskeleton walking gaits for user comfort requires understanding users preferences over a high-dimensional gait parameter space. However, existing preference-based learning methods have only explored low-dimensional domains due to computational limitations. To learn user preferences in high dimensions, this work presents LineCoSpar, a human-in-the-loop preference-based framework that enables optimization over many parameters by iteratively exploring one-dimensional subspaces. Additionally, this work identifies gait attributes that characterize broader preferences across users. In simulations and human trials, we empirically verify that LineCoSpar is a sample-efficient approach for high-dimensional preference optimization. Our analysis of the experimental data reveals a correspondence between human preferences and objective measures of dynamicity, while also highlighting differences in the utility functions underlying individual users gait preferences. This result has implications for exoskeleton gait synthesis, an active field with applications to clinical use and patient rehabilitation.

علم الروبوتات تفاعل الإنسان والحاسوب التعلم الآلي

Invariant Filtering for Bipedal Walking on Dynamic Rigid Surfaces with Orientation-based Measurement Model

249 - Yuan Gao , Yan Gu 2021

Real-world applications of bipedal robot walking require accurate, real-time state estimation. State estimation for locomotion over dynamic rigid surfaces (DRS), such as elevators, ships, public transport vehicles, and aircraft, remains under-explore d, although state estimator designs for stationary rigid surfaces have been extensively studied. Addressing DRS locomotion in state estimation is a challenging problem mainly due to the nonlinear, hybrid nature of walking dynamics, the nonstationary surface-foot contact points, and hardware imperfections (e.g., limited availability, noise, and drift of onboard sensors). Towards solving this problem, we introduce an Invariant Extended Kalman Filter (InEKF) whose process and measurement models explicitly consider the DRS movement and hybrid walking behaviors while respectively satisfying the group-affine condition and invariant form. Due to these attractive properties, the estimation error convergence of the filter is provably guaranteed for hybrid DRS locomotion. The measurement model of the filter also exploits the holonomic constraint associated with the support-foot and surface orientations, under which the robots yaw angle in the world becomes observable in the presence of general DRS movement. Experimental results of bipedal walking on a rocking treadmill demonstrate the proposed filter ensures the rapid error convergence and observable base yaw angle.

علم الروبوتات

Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots

118 - Zhongyu Li , Xuxin Cheng , Xue Bin Peng 2021

Developing robust walking controllers for bipedal robots is a challenging endeavor. Traditional model-based locomotion controllers require simplifying assumptions and careful modelling; any small errors can result in unstable control. To address thes e challenges for bipedal locomotion, we present a model-free reinforcement learning framework for training robust locomotion policies in simulation, which can then be transferred to a real bipedal Cassie robot. To facilitate sim-to-real transfer, domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics. The learned policies enable Cassie to perform a set of diverse and dynamic behaviors, while also being more robust than traditional controllers and prior learning-based methods that use residual control. We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw.

علم الروبوتات الذكاء الاصطناعي التعلم الآلي