Adaptation of Quadruped Robot Locomotion with Meta-Learning

Published by: Arsen Kuzhamuratov

Publication date: 2021

Research field: Informatics Engineering

Research language: English

Authors: Arsen Kuzhamuratov, Dmitry Sorokin, Alexander Ulanov

Robotics, Machine Learning

Animals have remarkable abilities to adapt locomotion to different terrains and tasks. However, robots trained by reinforcement learning can typically solve only a single task, and a transferred policy is usually inferior to one trained from scratch. In this work, we demonstrate that meta-reinforcement learning can be used to successfully train a robot capable of solving a wide range of locomotion tasks. The performance of the meta-trained robot is similar to that of a robot trained on a single task.

David DeFazio, Shiqi Zhang, 2021

Legged robots have been shown to be effective in navigating unstructured environments. Although there has been much success in learning locomotion policies for quadruped robots, there is little research on how to incorporate human knowledge to facilitate this learning process. In this paper, we demonstrate that human knowledge in the form of LTL formulas can be applied to quadruped locomotion learning within a Reward Machine (RM) framework. Experimental results in simulation show that our RM-based approach enables easily defining diverse locomotion styles, and efficiently learning locomotion policies of the defined styles.

Robotics, Artificial Intelligence
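
The Reward Machine idea in the DeFazio and Zhang abstract above can be illustrated with a small finite-state machine that pays a reward when foot contacts occur in a desired order. This is a minimal sketch, not the authors' implementation: the class `RewardMachine`, the state names, the foot labels (`FL`, `FR`, `RL`, `RR`), and the alternating-diagonal "trot" rule are all assumptions chosen for illustration, and the machine is written by hand rather than derived from an LTL formula.

```python
from typing import Dict, FrozenSet, Tuple

Label = FrozenSet[str]  # set of feet currently in contact with the ground


class RewardMachine:
    """Finite-state machine that maps (state, label) to (next state, reward)."""

    def __init__(self, initial: str,
                 transitions: Dict[Tuple[str, Label], Tuple[str, float]]):
        self.initial = initial
        self.state = initial
        self.transitions = transitions

    def reset(self) -> None:
        self.state = self.initial

    def step(self, label: Label) -> float:
        # Unlisted (state, label) pairs keep the current state and pay nothing.
        next_state, reward = self.transitions.get(
            (self.state, label), (self.state, 0.0))
        self.state = next_state
        return reward


# Hypothetical trot rule: the two diagonal foot pairs must touch down in turn.
PAIR_A: Label = frozenset({"FL", "RR"})
PAIR_B: Label = frozenset({"FR", "RL"})


def make_trot_machine() -> RewardMachine:
    return RewardMachine("expect_A", {
        ("expect_A", PAIR_A): ("expect_B", 1.0),
        ("expect_B", PAIR_B): ("expect_A", 1.0),
    })


rm = make_trot_machine()
contacts = [PAIR_A, PAIR_A, PAIR_B, PAIR_A, frozenset({"FL"}), PAIR_B]
rewards = [rm.step(c) for c in contacts]
print(rewards)
```

In a learning setup of this kind, the machine's state would typically be appended to the policy's observation so that the reward stays Markovian; a different locomotion style would then only require a different transition table.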

Evolved embodied phase coordination enables robust quadruped robot locomotion

Jørgen Nordmoen, Tønnes F. Nygaard, Kai Olav Ellefsen and Kyrre Glette, 2019

Overcoming robotics challenges in the real world requires resilient control systems capable of handling a multitude of environments and unforeseen events. Evolutionary optimization using simulations is a promising way to automatically design such control systems; however, if the disparity between simulation and the real world becomes too large, the optimization process may result in dysfunctional real-world behaviors. In this paper, we address this challenge by considering embodied phase coordination in the evolutionary optimization of a quadruped robot controller based on central pattern generators. With this method, leg phases, and indirectly also inter-leg coordination, are influenced by sensor feedback. By comparing two very similar control systems we gain insight into how the sensory feedback approach affects the evolved parameters of the control system, and how the performance differs in simulation, in transferal to the real world, and to different real-world environments. We show that evolution enables the design of a control system with embodied phase coordination which is more complex than previously seen approaches, and that this system is capable of controlling a real-world multi-jointed quadruped robot. The approach reduces the performance discrepancy between simulation and the real world, and displays robustness towards new environments.

Robotics

Learning Fast Adaptation with Meta Strategy Optimization

Wenhao Yu, Jie Tan, Yunfei Bai, 2019

The ability to walk in new scenarios is a key milestone on the path toward real-world applications of legged robots. In this work, we introduce Meta Strategy Optimization, a meta-learning algorithm for training policies with latent variable inputs that can quickly adapt to new scenarios with a handful of trials in the target environment. The key idea behind MSO is to expose the same adaptation process, Strategy Optimization (SO), to both the training and testing phases. This allows MSO to effectively learn locomotion skills as well as a latent space that is suitable for fast adaptation. We evaluate our method on a real quadruped robot and demonstrate successful adaptation in various scenarios, including sim-to-real transfer, walking with a weakened motor, or climbing up a slope. Furthermore, we quantitatively analyze the generalization capability of the trained policy in simulated environments. Both real and simulated experiments show that our method outperforms previous methods in adaptation to novel tasks.

Robotics, Machine Learning
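
The adaptation step described in the Meta Strategy Optimization abstract above, searching over a latent input of a fixed policy using a handful of trials, can be sketched as follows. This is only an illustration under stated assumptions: `episode_return` is a toy stand-in for a real rollout, the latent is one-dimensional, and a simple hill-climbing random search is substituted for whatever optimizer the authors actually use. All names are hypothetical.

```python
import random
from typing import Tuple


def episode_return(latent: float, env_param: float) -> float:
    """Toy stand-in for one rollout of a frozen policy conditioned on `latent`.

    The return is highest when the latent matches the hidden environment
    parameter, which the adaptation loop never observes directly.
    """
    return -(latent - env_param) ** 2


def strategy_optimization(env_param: float, trials: int = 20,
                          sigma: float = 0.5, seed: int = 0) -> Tuple[float, float]:
    """Search the latent space with a fixed budget of `trials` rollouts."""
    rng = random.Random(seed)
    best_z = 0.0
    best_ret = episode_return(best_z, env_param)  # first trial
    for _ in range(trials - 1):
        # Perturb the best latent found so far and keep it only if it helps.
        candidate = best_z + rng.gauss(0.0, sigma)
        ret = episode_return(candidate, env_param)
        if ret > best_ret:
            best_z, best_ret = candidate, ret
    return best_z, best_ret


best_z, best_ret = strategy_optimization(env_param=1.5)
print(f"adapted latent: {best_z:.3f}, return: {best_ret:.4f}")
```

The policy weights are never touched here; only the latent input changes. According to the abstract, the same search procedure is also run during training, so that the latent space is shaped to be easy to search at test time.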

First Steps: Latent-Space Control with Semantic Constraints for Quadruped Locomotion

Alexander L. Mitchell, Martin Engelcke, Oiwi Parker Jones, 2020

Traditional approaches to quadruped control frequently employ simplified, hand-derived models. This significantly reduces the capability of the robot since its effective kinematic range is curtailed. In addition, kinodynamic constraints are often non-differentiable and difficult to implement in an optimisation approach. In this work, these challenges are addressed by framing quadruped control as optimisation in a structured latent space. A deep generative model captures a statistical representation of feasible joint configurations, whilst complex dynamic and terminal constraints are expressed via high-level, semantic indicators and represented by learned classifiers operating upon the latent space. As a consequence, complex constraints are rendered differentiable and evaluated an order of magnitude faster than analytical approaches. We validate the feasibility of locomotion trajectories optimised using our approach both in simulation and on a real-world ANYmal quadruped. Our results demonstrate that this approach is capable of generating smooth and realisable trajectories. To the best of our knowledge, this is the first time latent space control has been successfully applied to a complex, real robot platform.

Robotics, Machine Learning

Sim-to-Real: Learning Agile Locomotion For Quadruped Robots

Jie Tan, Tingnan Zhang, Erwin Coumans, 2018

Designing agile locomotion for quadruped robots often requires extensive expertise and tedious manual tuning. In this paper, we present a system to automate this process by leveraging deep reinforcement learning techniques. Our system can learn quadruped locomotion from scratch using simple reward signals. In addition, users can provide an open loop reference to guide the learning process when more control over the learned gait is needed. The control policies are learned in a physics simulator and then deployed on real robots. In robotics, policies trained in simulation often do not transfer to the real world. We narrow this reality gap by improving the physics simulator and learning robust policies. We improve the simulation using system identification, developing an accurate actuator model and simulating latency. We learn robust controllers by randomizing the physical environments, adding perturbations and designing a compact observation space. We evaluate our system on two agile locomotion gaits: trotting and galloping. After learning in simulation, a quadruped robot can successfully perform both gaits in the real world.

Robotics, Artificial Intelligence
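
The "randomizing the physical environments, adding perturbations" step in the Sim-to-Real abstract above can be sketched as drawing fresh simulator parameters at the start of every training episode. This is a minimal illustration, not the authors' setup: the parameter names, the numeric ranges in `RANGES`, and the push-force limit are hypothetical values picked for the example.

```python
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SimParams:
    mass_scale: float  # multiplier on the nominal body mass
    friction: float    # foot-ground friction coefficient
    latency_s: float   # simulated control latency in seconds


# Hypothetical ranges, chosen for illustration only.
RANGES = {
    "mass_scale": (0.8, 1.2),
    "friction": (0.5, 1.25),
    "latency_s": (0.0, 0.04),
}


def sample_params(rng: random.Random) -> SimParams:
    """Draw one set of physical parameters uniformly from RANGES."""
    return SimParams(**{name: rng.uniform(lo, hi)
                        for name, (lo, hi) in RANGES.items()})


def sample_push(rng: random.Random, max_force: float = 50.0) -> Tuple[float, float]:
    """Draw a random horizontal perturbation force (fx, fy) in newtons."""
    return (rng.uniform(-max_force, max_force),
            rng.uniform(-max_force, max_force))


rng = random.Random(0)
episodes = [sample_params(rng) for _ in range(5)]
for params in episodes:
    print(params)
```

A policy trained across many such draws cannot rely on any single value of mass, friction, or latency, which is the intuition behind using randomization to narrow the reality gap.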