
Learning Quadruped Locomotion Policies with Reward Machines

Published by: David DeFazio
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





Legged robots have been shown to be effective in navigating unstructured environments. Although there has been much success in learning locomotion policies for quadruped robots, there is little research on how to incorporate human knowledge to facilitate this learning process. In this paper, we demonstrate that human knowledge in the form of LTL formulas can be applied to quadruped locomotion learning within a Reward Machine (RM) framework. Experimental results in simulation show that our RM-based approach enables easily defining diverse locomotion styles, and efficiently learning locomotion policies of the defined styles.
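To make the idea concrete, below is a minimal, hypothetical sketch of a Reward Machine: a finite-state machine whose transitions fire on high-level events (the kind of propositions an LTL-style gait specification would refer to) and emit reward. It is an illustrative Python toy, not the authors' implementation; the state names, events, and reward values are invented, here encoding a trot-like "alternate left/right contacts" style.

# Toy Reward Machine (RM): a finite-state machine whose transitions fire on
# high-level events and emit reward. States, events, and rewards below are
# hypothetical, encoding an "alternate left/right foot contacts" style.

class RewardMachine:
    def __init__(self, initial_state, transitions):
        # transitions: {(state, event): (next_state, reward)}
        self.state = initial_state
        self.transitions = transitions

    def step(self, event):
        """Advance the machine on an observed event and return the RM reward."""
        if (self.state, event) in self.transitions:
            self.state, reward = self.transitions[(self.state, event)]
            return reward
        return 0.0  # the event is irrelevant in the current RM state


rm = RewardMachine(
    initial_state="expect_left",
    transitions={
        ("expect_left", "left_contact"): ("expect_right", 1.0),
        ("expect_right", "right_contact"): ("expect_left", 1.0),
    },
)

print(rm.step("left_contact"))   # 1.0 -> now expecting a right contact
print(rm.step("left_contact"))   # 0.0 -> out of order, no reward
print(rm.step("right_contact"))  # 1.0

In RM-based learning, the machine state is typically tracked alongside the environment state so the policy can condition on it, and swapping in a different automaton yields a different locomotion style.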


Read also

Designing agile locomotion for quadruped robots often requires extensive expertise and tedious manual tuning. In this paper, we present a system to automate this process by leveraging deep reinforcement learning techniques. Our system can learn quadruped locomotion from scratch using simple reward signals. In addition, users can provide an open loop reference to guide the learning process when more control over the learned gait is needed. The control policies are learned in a physics simulator and then deployed on real robots. In robotics, policies trained in simulation often do not transfer to the real world. We narrow this reality gap by improving the physics simulator and learning robust policies. We improve the simulation using system identification, developing an accurate actuator model and simulating latency. We learn robust controllers by randomizing the physical environments, adding perturbations and designing a compact observation space. We evaluate our system on two agile locomotion gaits: trotting and galloping. After learning in simulation, a quadruped robot can successfully perform both gaits in the real world.
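As a rough illustration of the domain-randomization step described above, the sketch below resamples a few simulator parameters before each episode; the parameter names and ranges are hypothetical examples, not values from the paper.

import random

# Hypothetical per-episode physics randomization; parameter names and ranges
# are illustrative, not taken from the paper.

def sample_episode_physics(rng=random):
    return {
        "ground_friction": rng.uniform(0.5, 1.25),
        "body_mass_scale": rng.uniform(0.8, 1.2),
        "motor_strength_scale": rng.uniform(0.8, 1.2),
        "control_latency_s": rng.uniform(0.0, 0.04),  # simulated actuation latency
        "push_force_n": rng.uniform(0.0, 50.0),       # random external perturbation
    }

if __name__ == "__main__":
    for episode in range(3):
        # A training loop would reset the simulator with these values
        # before collecting the episode's rollout.
        print(episode, sample_episode_physics())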
Animals have remarkable abilities to adapt locomotion to different terrains and tasks. However, robots trained by means of reinforcement learning are typically able to solve only a single task and a transferred policy is usually inferior to that trained from scratch. In this work, we demonstrate that meta-reinforcement learning can be used to successfully train a robot capable to solve a wide range of locomotion tasks. The performance of the meta-trained robot is similar to that of a robot that is trained on a single task.
In this paper, with a view toward deployment of light-weight control frameworks for bipedal walking robots, we realize end-foot trajectories that are shaped by a single linear feedback policy. We learn this policy via a model-free and gradient-free learning algorithm, Augmented Random Search (ARS), on the two robot platforms Rabbit and Digit. Our contributions are two-fold: a) By using torso and support plane orientation as inputs, we achieve robust walking on slopes of up to 20 degrees in simulation. b) We demonstrate additional behaviors like walking backwards, stepping-in-place, and recovery from external pushes of up to 120 N. The end result is a robust and fast feedback control law for bipedal walking on terrains with varying slopes. Towards the end, we also provide preliminary results of hardware transfer to Digit.
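For context, the following is a simplified sketch of the basic ARS update for a linear policy (omitting the observation normalization and top-direction selection of the full algorithm). The rollout is replaced by a toy objective standing in for the walking reward, so the sizes and hyperparameters are illustrative rather than the authors' setup.

import numpy as np

# Simplified ARS-style update for a linear policy. The rollout is replaced by a
# toy objective (negative squared distance to a target weight vector).

def rollout_return(weights):
    target = np.ones_like(weights)          # stand-in for "reward-maximizing" weights
    return -np.sum((weights - target) ** 2)

def ars_update(weights, step_size=0.02, noise_std=0.03, num_directions=8, seed=0):
    rng = np.random.default_rng(seed)
    deltas = rng.standard_normal((num_directions, weights.size))
    update = np.zeros_like(weights)
    for delta in deltas:
        r_plus = rollout_return(weights + noise_std * delta)
        r_minus = rollout_return(weights - noise_std * delta)
        update += (r_plus - r_minus) * delta
    return weights + step_size / (num_directions * noise_std) * update

weights = np.zeros(4)                        # flattened linear policy parameters
for it in range(200):
    weights = ars_update(weights, seed=it)
print(weights)                               # approaches the toy optimum (all ones)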
Qiayuan Liao, 2020
This review introduces the quadruped robots MIT Cheetah, HyQ, ANYmal, and BigDog, covering their mechanical structure, actuation, and control.
Traditional approaches to quadruped control frequently employ simplified, hand-derived models. This significantly reduces the capability of the robot since its effective kinematic range is curtailed. In addition, kinodynamic constraints are often non-differentiable and difficult to implement in an optimisation approach. In this work, these challenges are addressed by framing quadruped control as optimisation in a structured latent space. A deep generative model captures a statistical representation of feasible joint configurations, whilst complex dynamic and terminal constraints are expressed via high-level, semantic indicators and represented by learned classifiers operating upon the latent space. As a consequence, complex constraints are rendered differentiable and evaluated an order of magnitude faster than analytical approaches. We validate the feasibility of locomotion trajectories optimised using our approach both in simulation and on a real-world ANYmal quadruped. Our results demonstrate that this approach is capable of generating smooth and realisable trajectories. To the best of our knowledge, this is the first time latent space control has been successfully applied to a complex, real robot platform.
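The sketch below illustrates the general pattern of optimising in a learned latent space with a classifier-style constraint, using hand-made stand-ins for the decoder and the constraint classifier; it is not the paper's trained model, and all names and numbers are hypothetical.

import numpy as np

# Toy latent-space optimisation: a hand-made linear "decoder" maps a 2-D latent
# vector to 12 joint angles, and a smooth "classifier" score stands in for a
# learned constraint indicator. We search over the latent vector by gradient
# descent instead of over raw joint angles.

rng = np.random.default_rng(0)
DECODER = 0.3 * rng.standard_normal((12, 2))   # latent z -> joint configuration
GOAL = 0.3 * rng.standard_normal(12)           # desired joint configuration

def decode(z):
    return DECODER @ z

def constraint_prob(z):
    # Smooth score in (0, 1), high while the latent stays in the "feasible" region.
    return 1.0 / (1.0 + np.exp(4.0 * (np.linalg.norm(z) - 1.5)))

def objective(z):
    task_cost = np.sum((decode(z) - GOAL) ** 2)
    return task_cost - 2.0 * np.log(constraint_prob(z) + 1e-9)

def numerical_grad(f, z, eps=1e-5):
    grad = np.zeros_like(z)
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz[i] = eps
        grad[i] = (f(z + dz) - f(z - dz)) / (2 * eps)
    return grad

z = np.zeros(2)
for _ in range(300):
    z = z - 0.05 * numerical_grad(objective, z)
print("optimised latent:", z, "constraint score:", round(constraint_prob(z), 3))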

