Generalizing Robot Imitation Learning with Invariant Hidden Semi-Markov Models

292 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ajay Tanwani

تاريخ النشر 2018

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ajay Kumar Tanwani - Jonathan Lee - Brijen Thananjeyan

علم الروبوتات

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Generalizing manipulation skills to new situations requires extracting invariant patterns from demonstrations. For example, the robot needs to understand the demonstrations at a higher level while being invariant to the appearance of the objects, geometric aspects of objects such as its position, size, orientation and viewpoint of the observer in the demonstrations. In this paper, we propose an algorithm that learns a joint probability density function of the demonstrations with invariant formulations of hidden semi-Markov models to extract invariant segments (also termed as sub-goals or options), and smoothly follow the generated sequence of states with a linear quadratic tracking controller. The algorithm takes as input the demonstrations with respect to different coordinate systems describing virtual landmarks or objects of interest with a task-parameterized formulation, and adapt the segments according to the environmental changes in a systematic manner. We present variants of this algorithm in latent space with low-rank covariance decompositions, semi-tied covariances, and non-parametric online estimation of model parameters under small variance asymptotics; yielding considerably low sample and model complexity for acquiring new manipulation skills. The algorithm allows a Baxter robot to learn a pick-and-place task while avoiding a movable obstacle based on only 4 kinesthetic demonstrations.

قيم البحث

83 - Zhao-Heng Yin , Lingfeng Sun , Hengbo Ma 2021

Animals are able to imitate each others behavior, despite their difference in biomechanics. In contrast, imitating the other similar robots is a much more challenging task in robotics. This problem is called cross domain imitation learning~(CDIL). In this paper, we consider CDIL on a class of similar robots. We tackle this problem by introducing an imitation learning algorithm based on invariant representation. We propose to learn invariant state and action representations, which aligns the behavior of multiple robots so that CDIL becomes possible. Compared with previous invariant representation learning methods for similar purpose, our method does not require human-labeled pairwise data for training. Instead, we use cycle-consistency and domain confusion to align the representation and increase its robustness. We test the algorithm on multiple robots in simulator and show that unseen new robot instances can be trained with existing expert demonstrations successfully. Qualitative results also demonstrate that the proposed method is able to learn similar representations for different robots with similar behaviors, which is essential for successful CDIL.

علم الروبوتات التعلم الآلي

Reinforcement Learning for Robot Navigation with Adaptive ExecutionDuration (AED) in a Semi-Markov Model

206 - Yuan Chen , Ruosong Ye , Ziyang Tao 2021

Deep reinforcement learning (DRL) algorithms have proven effective in robot navigation, especially in unknown environments, through directly mapping perception inputs into robot control commands. Most existing methods adopt uniform execution duration with robots taking commands at fixed intervals. As such, the length of execution duration becomes a crucial parameter to the navigation algorithm. In particular, if the duration is too short, then the navigation policy would be executed at a high frequency, with increased training difficulty and high computational cost. Meanwhile, if the duration is too long, then the policy becomes unable to handle complex situations, like those with crowded obstacles. It is thus tricky to find the sweet duration range; some duration values may render a DRL model to fail to find a navigation path. In this paper, we propose to employ adaptive execution duration to overcome this problem. Specifically, we formulate the navigation task as a Semi-Markov Decision Process (SMDP) problem to handle adaptive execution duration. We also improve the distributed proximal policy optimization (DPPO) algorithm and provide its theoretical guarantee for the specified SMDP problem. We evaluate our approach both in the simulator and on an actual robot. The results show that our approach outperforms the other DRL-based method (with fixed execution duration) by 10.3% in terms of the navigation success rate.

علم الروبوتات الذكاء الاصطناعي

Invariant Feature Mappings for Generalizing Affordance Understanding Using Regularized Metric Learning

56 - Martin Hjelm , Carl Henrik Ek , Renaud Detry 2019

This paper presents an approach for learning invariant features for object affordance understanding. One of the major problems for a robotic agent acquiring a deeper understanding of affordances is finding sensory-grounded semantics. Being able to un derstand what in the representation of an object makes the object afford an action opens up for more efficient manipulation, interchange of objects that visually might not be similar, transfer learning, and robot to human communication. Our approach uses a metric learning algorithm that learns a feature transform that encourages objects that affords the same action to be close in the feature space. We regularize the learning, such that we penalize irrelevant features, allowing the agent to link what in the sensory input caused the object to afford the action. From this, we show how the agent can abstract the affordance and reason about the similarity between different affordances.

علم الروبوتات

Scale invariant robot behavior with fractals

115 - Sam Kriegman , Amir Mohammadi Nasab , Douglas Blackiston 2021

Robots deployed at orders of magnitude different size scales, and that retain the same desired behavior at any of those scales, would greatly expand the environments in which the robots could operate. However it is currently not known whether such ro bots exist, and, if they do, how to design them. Since self similar structures in nature often exhibit self similar behavior at different scales, we hypothesize that there may exist robot designs that have the same property. Here we demonstrate that this is indeed the case for some, but not all, modular soft robots: there are robot designs that exhibit a desired behavior at a small size scale, and if copies of that robot are attached together to realize the same design at higher scales, those larger robots exhibit similar behavior. We show how to find such designs in simulation using an evolutionary algorithm. Further, when fractal attachment is not assumed and attachment geometries must thus be evolved along with the design of the base robot unit, scale invariant behavior is not achieved, demonstrating that structural self similarity, when combined with appropriate designs, is a useful path to realizing scale invariant robot behavior. We validate our findings by demonstrating successful transferal of self similar structure and behavior to pneumatically-controlled soft robots. Finally, we show that biobots can spontaneously exhibit self similar attachment geometries, thereby suggesting that self similar behavior via self similar structure may be realizable across a wide range of robot platforms in future.

علم الروبوتات الذكاء الاصطناعي

Hidden Semi-Markov Models for Single-Molecule Conformational Dynamics

162 - A. Kovalev , N. Zarrabi , F. Werz 2009

The conformational kinetics of enzymes can be reliably revealed when they are governed by Markovian dynamics. Hidden Markov Models (HMMs) are appropriate especially in the case of conformational states that are hardly distinguishable. However, the ev olution of the conformational states of proteins mostly shows non-Markovian behavior, recognizable by non-monoexponential state dwell time histograms. The application of a Hidden Markov Model technique to a cyclic system demonstrating semi-Markovian dynamics is presented in this paper and the required extension of the model design is discussed. As standard ranking criteria of models cannot deal with these systems properly, a new approach is proposed considering the shape of the dwell time histograms. We observed the rotational kinetics of a single F1-ATPase alpha3beta3gamma sub-complex over six orders of magnitude of different ATP to ADP and Pi concentration ratios, and established a general model describing the kinetics for the entire range of concentrations. The HMM extension described here is applicable in general to the accurate analysis of protein dynamics.

الأساليب الكمية الجزيئات الحيوية

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

المعهد الوطني الجزائري للبحث الزراعي

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Generalizing Robot Imitation Learning with Invariant Hidden Semi-Markov Models

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً