
Contact Mode Guided Motion Planning for Quasidynamic Dexterous Manipulation in 3D

Posted by: Xianyi Cheng
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





This paper presents Contact Mode Guided Manipulation Planning (CMGMP) for general 3D quasistatic and quasidynamic rigid body motion planning in dexterous manipulation. The CMGMP algorithm generates hybrid motion plans including both continuous state transitions and discrete contact mode switches, without the need for pre-specified contact sequences or pre-designed motion primitives. The key idea is to use automatically enumerated contact modes to guide the tree expansions during the search. Contact modes automatically synthesize manipulation primitives, while the sampling-based planning framework sequences those primitives into a coherent plan. We test our algorithm on many simulated 3D manipulation tasks, and validate our models by executing the plans open-loop on a real robot-manipulator system.
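
The abstract's core mechanism is to let enumerated contact modes guide tree expansion inside a sampling-based planner. The following Python sketch illustrates that idea as an RRT-style loop; it is only an illustration under stated assumptions, and the caller-supplied callables (sample_fn, modes_fn, step_fn, dist_fn) stand in for the paper's contact-mode enumeration, quasistatic/quasidynamic motion solver, and distance metric, which are not specified here.

```python
def cmgmp_plan(start, goal, sample_fn, modes_fn, step_fn, dist_fn,
               max_iters=1000, goal_tol=1e-2):
    """Contact-mode-guided tree expansion, sketched as an RRT-style loop.

    sample_fn() -> random state, modes_fn(x) -> contact modes at x,
    step_fn(x, x_target, mode) -> new state or None (motion under one mode),
    dist_fn(a, b) -> scalar distance. All four are caller-supplied stand-ins.
    """
    nodes, parents = [start], [None]
    for _ in range(max_iters):
        x_rand = sample_fn()
        # Nearest tree node to the random sample.
        i_near = min(range(len(nodes)), key=lambda i: dist_fn(nodes[i], x_rand))
        x_near = nodes[i_near]
        # Each contact mode at x_near proposes a different local motion,
        # so mode enumeration replaces hand-designed motion primitives.
        for mode in modes_fn(x_near):
            x_new = step_fn(x_near, x_rand, mode)
            if x_new is None:
                continue  # this mode yields no feasible motion toward x_rand
            nodes.append(x_new)
            parents.append(i_near)
            if dist_fn(x_new, goal) < goal_tol:
                # Backtrack from the goal node to recover the hybrid plan.
                path, i = [], len(nodes) - 1
                while i is not None:
                    path.append(nodes[i])
                    i = parents[i]
                return list(reversed(path))
    return None  # no plan found within the iteration budget
```

This skeleton only captures the mode-guided branching structure; the paper's planner additionally reasons about quasistatic/quasidynamic feasibility within each mode to produce the hybrid plan of continuous transitions and discrete mode switches.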


Read also

The discontinuities and multi-modality introduced by contacts make manipulation planning challenging. Many previous works avoid this problem by pre-designing a set of high-level motion primitives like grasping and pushing. However, such motion primitives are often not adequate to describe dexterous manipulation motions. In this work, we propose a method for dexterous manipulation planning at a more primitive level. The key idea is to use contact modes to guide the search in a sampling-based planning framework. Our method can automatically generate contact transitions and motion trajectories under the quasistatic assumption. In the experiments, this method generates motions that are usually pre-designed as motion primitives, as well as dexterous motions that are more task-specific.
Dexterous manipulation has been a long-standing challenge in robotics. Recently, modern model-free RL has demonstrated impressive results on a number of problems. However, complex domains like dexterous manipulation remain a challenge for RL due to the poor sample complexity. To address this, current approaches employ expert demonstrations in the form of state-action pairs, which are difficult to obtain for real-world settings such as learning from videos. In this work, we move toward a more realistic setting and explore state-only imitation learning. To tackle this setting, we train an inverse dynamics model and use it to predict actions for state-only demonstrations. The inverse dynamics model and the policy are trained jointly. Our method performs on par with state-action approaches and considerably outperforms RL alone. By not relying on expert actions, we are able to learn from demonstrations with different dynamics, morphologies, and objects.
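
As a rough illustration of the state-only setting described above, the sketch below jointly fits an inverse dynamics model on the agent's own transitions (where actions are known) and clones a policy on demonstration states labeled with the inferred actions. The network sizes, dimensions, and plain behavior-cloning loss are placeholder assumptions; the paper's method combines this idea with RL rather than pure cloning.

```python
import torch
import torch.nn as nn

state_dim, action_dim = 10, 4  # placeholder dimensions

# Inverse dynamics model: (s_t, s_{t+1}) -> a_t, plus a simple policy s_t -> a_t.
inv_dyn = nn.Sequential(nn.Linear(2 * state_dim, 64), nn.ReLU(),
                        nn.Linear(64, action_dim))
policy = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                       nn.Linear(64, action_dim))
opt = torch.optim.Adam(list(inv_dyn.parameters()) + list(policy.parameters()), lr=1e-3)

def train_step(s, s_next, a, demo_s, demo_s_next):
    """One joint update. s, s_next, a come from the agent's own rollouts
    (actions known); demo_s, demo_s_next are state-only demonstrations."""
    # 1) Fit the inverse dynamics model on the agent's transitions.
    a_pred = inv_dyn(torch.cat([s, s_next], dim=-1))
    inv_loss = ((a_pred - a) ** 2).mean()
    # 2) Label demonstration transitions with inferred actions, then clone them.
    with torch.no_grad():
        demo_a = inv_dyn(torch.cat([demo_s, demo_s_next], dim=-1))
    bc_loss = ((policy(demo_s) - demo_a) ** 2).mean()
    loss = inv_loss + bc_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```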
Sampling-based motion planners rely on incremental densification to discover progressively shorter paths. After computing a feasible path $\xi$ between start $x_s$ and goal $x_t$, the Informed Set (IS) prunes the configuration space $\mathcal{C}$ by conservatively eliminating points that cannot yield shorter paths. Densification via sampling from this Informed Set retains asymptotic optimality of sampling from the entire configuration space. For path length $c(\xi)$ and Euclidean heuristic $h$, $IS = \{ x \mid x \in \mathcal{C},\ h(x_s, x) + h(x, x_t) \leq c(\xi) \}$. Relying on the heuristic can render the IS especially conservative in high dimensions or complex environments. Furthermore, the IS only shrinks when shorter paths are discovered. Thus, the computational effort from each iteration of densification and planning is wasted if it fails to yield a shorter path, despite improving the cost-to-come for vertices in the search tree. Our key insight is that even in such a failure, shorter paths to vertices in the search tree (rather than just the goal) can immediately improve the planner's sampling strategy. Guided Incremental Local Densification (GuILD) leverages this information to sample from Local Subsets of the IS. We show that GuILD significantly outperforms uniform sampling of the Informed Set in simulated $\mathbb{R}^2$, $SE(2)$ environments and manipulation tasks in $\mathbb{R}^7$.
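
The Informed Set membership test above is simple enough to state directly in code. The snippet below is a minimal sketch of that test with a Euclidean heuristic, plus a naive rejection sampler over an axis-aligned bounding box; this is only to make the formula concrete, not how GuILD (or practical informed samplers, which typically sample the prolate hyperspheroid directly) is implemented.

```python
import numpy as np

def in_informed_set(x, x_s, x_t, c_xi):
    """Keep x only if h(x_s, x) + h(x, x_t) <= c(xi) for Euclidean h,
    i.e. x could lie on a path shorter than the current best path xi."""
    h = lambda a, b: np.linalg.norm(np.asarray(a, float) - np.asarray(b, float))
    return h(x_s, x) + h(x, x_t) <= c_xi

def sample_informed_set(x_s, x_t, c_xi, lo, hi, rng=None):
    """Naive rejection sampling from the Informed Set inside the box [lo, hi]."""
    rng = rng or np.random.default_rng()
    while True:
        x = rng.uniform(lo, hi)
        if in_informed_set(x, x_s, x_t, c_xi):
            return x
```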
This report describes our approach for Phase 3 of the Real Robot Challenge. To solve cuboid manipulation tasks of varying difficulty, we decompose each task into the following primitives: moving the fingers to the cuboid to grasp it, turning it on the table to minimize orientation error, and re-positioning it to the goal position. We use model-based trajectory optimization and control to plan and execute these primitives. These grasping, turning, and re-positioning primitives are sequenced with a state-machine that determines which primitive to execute given the current object state and goal. Our method shows robust performance over multiple runs with randomized initial and goal positions. With this approach, our team placed second in the challenge, under the anonymous name sombertortoise on the leaderboard. Example runs of our method solving each of the four levels can be seen in this video (https://www.youtube.com/watch?v=I65Kwu9PGmg&list=PLt9QxrtaftrHGXcp4Oh8-s_OnQnBnLtei&index=1).
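
The report's decomposition lends itself to a small state machine. The sketch below is a hypothetical illustration of that dispatch logic, reducing poses to a position plus a yaw angle; the primitive names, thresholds, and pose representation are assumptions for illustration, not the team's implementation.

```python
import numpy as np

def choose_primitive(obj_pos, obj_yaw, goal_pos, goal_yaw, grasped,
                     pos_tol=0.02, ang_tol=0.2):
    """Pick the next primitive from the current object state and the goal."""
    pos_err = np.linalg.norm(np.asarray(obj_pos, float) - np.asarray(goal_pos, float))
    yaw_err = abs((obj_yaw - goal_yaw + np.pi) % (2 * np.pi) - np.pi)  # wrap to [-pi, pi]
    if not grasped:
        return "grasp"        # move the fingers to the cuboid and grasp it
    if yaw_err > ang_tol:
        return "turn"         # rotate the cuboid on the table to reduce orientation error
    if pos_err > pos_tol:
        return "reposition"   # carry the cuboid to the goal position
    return "done"
```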
Learning dexterous manipulation in high-dimensional state-action spaces is an important open challenge with exploration presenting a major bottleneck. Although in many cases the learning process could be guided by demonstrations or other suboptimal experts, current RL algorithms for continuous action spaces often fail to effectively utilize combinations of highly off-policy expert data and on-policy exploration data. As a solution, we introduce Relative Entropy Q-Learning (REQ), a simple policy iteration algorithm that combines ideas from successful offline and conventional RL algorithms. It represents the optimal policy via importance sampling from a learned prior and is well-suited to take advantage of mixed data distributions. We demonstrate experimentally that REQ outperforms several strong baselines on robotic manipulation tasks for which suboptimal experts are available. We show how suboptimal experts can be constructed effectively by composing simple waypoint tracking controllers, and we also show how learned primitives can be combined with waypoint controllers to obtain reference behaviors to bootstrap a complex manipulation task on a simulated bimanual robot with human-like hands. Finally, we show that REQ is also effective for general off-policy RL, offline RL, and RL from demonstrations. Videos and further materials are available at sites.google.com/view/rlfse.
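
The phrase "represents the optimal policy via importance sampling from a learned prior" can be made concrete with a generic sketch: draw candidate actions from the prior and reweight them by exponentiated Q-values. This shows the general family of updates REQ belongs to, not REQ's exact algorithm; prior_sample and q_fn are assumed caller-supplied, and the temperature is a placeholder.

```python
import torch

def improved_action(prior_sample, q_fn, state, n_samples=64, temperature=1.0):
    """Draw candidates from a learned prior and resample by softmax(Q / temperature).

    prior_sample(state, n) -> (n, act_dim) candidate actions,
    q_fn(states, actions) -> (n,) Q-values; `state` is a 1-D tensor.
    """
    actions = prior_sample(state, n_samples)
    states = state.unsqueeze(0).expand(n_samples, -1)
    weights = torch.softmax(q_fn(states, actions) / temperature, dim=0)
    idx = torch.multinomial(weights, 1).item()  # resample one candidate by weight
    return actions[idx]
```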