Learning 6-DOF Grasping Interaction via Deep Geometry-aware 3D Representations


Published by Xinchen Yan

Publication date: 2017

Research field: Informatics Engineering

Language: English

Authors: Xinchen Yan, Jasmine Hsu, Mohi Khansari

Robotics, Artificial Intelligence, Computer Vision and Pattern Recognition


This paper focuses on the problem of learning 6-DOF grasping with a parallel jaw gripper in simulation. We propose the notion of a geometry-aware representation in grasping based on the assumption that knowledge of 3D geometry is at the heart of interaction. Our key idea is constraining and regularizing grasping interaction learning through 3D geometry prediction. Specifically, we formulate the learning of a deep geometry-aware grasping model in two steps: First, we learn to build a mental geometry-aware representation by reconstructing the scene (i.e., a 3D occupancy grid) from RGBD input via generative 3D shape modeling. Second, we learn to predict the grasping outcome with its internal geometry-aware representation. The learned outcome prediction model is used to sequentially propose grasping solutions via analysis-by-synthesis optimization. Our contributions are fourfold: (1) To the best of our knowledge, we are presenting for the first time a method to learn a 6-DOF grasping net from RGBD input; (2) We build a grasping dataset from demonstrations in virtual reality with rich sensory and interaction annotations. This dataset includes 101 everyday objects spread across 7 categories; additionally, we propose a data augmentation strategy for effective learning; (3) We demonstrate that the learned geometry-aware representation leads to about 10 percent relative performance improvement over the baseline CNN on grasping objects from our dataset; (4) We further demonstrate that the model generalizes to novel viewpoints and object instances.
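The abstract does not say which optimizer drives the analysis-by-synthesis step. As a minimal sketch, assuming a cross-entropy-style sampler and a hypothetical `toy_score` function standing in for the learned outcome predictor, sequential grasp proposal might look roughly like this (all names and parameter values are illustrative, not taken from the paper):

```python
import random


def propose_grasp(score_fn, dim=6, iters=10, samples=64, elite=8, seed=0):
    """Cross-entropy-style search for a pose vector that maximizes score_fn.

    Assumption: a grasp is a flat 6-vector (position + orientation) and
    score_fn plays the role of the learned outcome predictor.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [1.0] * dim
    for _ in range(iters):
        # Synthesis: sample candidate grasps around the current estimate.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)]
            for _ in range(samples)
        ]
        # Analysis: rank candidates by predicted outcome, keep the best few.
        candidates.sort(key=score_fn, reverse=True)
        top = candidates[:elite]
        # Refit the sampling distribution to the elite set.
        mean = [sum(c[i] for c in top) / elite for i in range(dim)]
        std = [
            max(1e-3, (sum((c[i] - mean[i]) ** 2 for c in top) / elite) ** 0.5)
            for i in range(dim)
        ]
    return mean


def toy_score(pose):
    """Hypothetical stand-in for the predictor: peaks at a fixed target pose."""
    target = [0.5, -0.2, 0.1, 0.0, 0.3, -0.4]
    return -sum((p - t) ** 2 for p, t in zip(pose, target))


best = propose_grasp(toy_score)
print(best, toy_score(best))
```

In a real pipeline, `score_fn` would wrap the trained network's success prediction for a given observation, and the pose parameterization would follow whatever the model was trained on.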


95 - Zhenyu Jiang, Yifeng Zhu, Maxwell Svetlik 2021

Grasp detection in clutter requires the robot to reason about the 3D scene from incomplete and noisy perception. In this work, we draw on the insight that 3D reconstruction and grasp learning are two intimately connected tasks, both of which require a fine-grained understanding of local geometry details. We thus propose to utilize the synergies between grasp affordance and 3D reconstruction through multi-task learning of a shared representation. Our model takes advantage of deep implicit functions, a continuous and memory-efficient representation, to enable differentiable training of both tasks. We train the model on self-supervised grasp trials data in simulation. Evaluation is conducted on a clutter removal task, where the robot clears cluttered objects by grasping them one at a time. The experimental results in simulation and on the real robot have demonstrated that the use of implicit neural representations and joint learning of grasp affordance and 3D reconstruction have led to state-of-the-art grasping results. Our method outperforms baselines by over 10% in terms of grasp success rate. Additional results and videos can be found at https://sites.google.com/view/rpl-giga2021

Robotics, Artificial Intelligence, Computer Vision and Pattern Recognition

Deep 6-DoF Tracking of Unknown Objects for Reactive Grasping

132 - Marc Tuscher, Julian Horz, Danny Driess 2021

Robotic manipulation of unknown objects is an important field of research. Practical applications occur in many real-world settings where robots need to interact with an unknown environment. We tackle the problem of reactive grasping by proposing a method for unknown object tracking, grasp point sampling and dynamic trajectory planning. Our object tracking method combines Siamese Networks with an Iterative Closest Point approach for point cloud registration into a method for 6-DoF unknown object tracking. The method does not require further training and is robust to noise and occlusion. We propose a robotic manipulation system, which is able to grasp a wide variety of formerly unseen objects and is robust against object perturbations and inferior grasping points.
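To make the Iterative Closest Point idea concrete, here is a deliberately simplified sketch. It works in 2D rather than the full 6-DoF setting, uses brute-force nearest neighbours, and leaves out the Siamese-network component entirely. The function names and toy points are assumptions for illustration and do not come from the paper.

```python
import math


def fit_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy  # centered source point
        bx, by = qx - dx, qy - dy  # centered target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)


def transform(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(source, target, iters=10):
    """Alternate nearest-neighbour matching and rigid refitting."""
    current = list(source)
    for _ in range(iters):
        matches = [
            min(target, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for p in current
        ]
        theta, tx, ty = fit_rigid_2d(current, matches)
        current = transform(current, theta, tx, ty)
    return current


# Toy data: the "observed" cloud is the model slightly rotated and shifted.
model = [(0.0, 0.0), (2.0, 0.0), (4.0, 1.0), (1.0, 3.0), (3.0, 4.0)]
observed = transform(model, 0.05, 0.1, -0.1)
aligned = icp_2d(observed, model)
error = max(math.dist(a, m) for a, m in zip(aligned, model))
print(error)
```

A 3D version would replace the closed-form angle with an SVD-based rotation fit and would typically use a spatial index for the nearest-neighbour search.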

Robotics, Artificial Intelligence

Grasp2Vec: Learning Object Representations from Self-Supervised Grasping

115 - Eric Jang, Coline Devin, Vincent Vanhoucke 2018

Well-structured visual representations can make robot learning faster and can improve generalization. In this paper, we study how we can acquire effective object-centric representations for robotic manipulation tasks without human labeling by using autonomous robot interaction with the environment. Such representation learning methods can benefit from continuous refinement of the representation as the robot collects more experience, allowing them to scale effectively without human intervention. Our representation learning approach is based on object persistence: when a robot removes an object from a scene, the representation of that scene should change according to the features of the object that was removed. We formulate an arithmetic relationship between feature vectors from this observation, and use it to learn a representation of scenes and objects that can then be used to identify object instances, localize them in the scene, and perform goal-directed grasping tasks where the robot must retrieve commanded objects from a bin. The same grasping procedure can also be used to automatically collect training data for our method, by recording images of scenes, grasping and removing an object, and recording the outcome. Our experiments demonstrate that this self-supervised approach for tasked grasping substantially outperforms direct reinforcement learning from images and prior representation learning methods.
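The object-persistence relationship can be illustrated with a small sketch. It assumes the scene embedding behaves additively over the objects it contains, so that the embedding before a grasp minus the embedding after it approximates the embedding of the removed object. The vectors below are made up for illustration; in the actual method they would come from learned networks.

```python
import math


def add(a, b):
    return [x + y for x, y in zip(a, b)]


def subtract(a, b):
    return [x - y for x, y in zip(a, b)]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_removed(scene_before, scene_after, object_embeddings):
    """Return the object whose embedding best matches the scene difference."""
    diff = subtract(scene_before, scene_after)
    return max(
        object_embeddings,
        key=lambda name: cosine(diff, object_embeddings[name]),
    )


# Hypothetical object embeddings (illustrative values only).
objects = {
    "mug": [1.0, 0.0, 0.2],
    "ball": [0.0, 1.0, 0.1],
    "block": [0.3, 0.2, 1.0],
}

# Toy scene embeddings built as sums of the objects present.
before = add(add(objects["mug"], objects["ball"]), objects["block"])
after = add(objects["mug"], objects["block"])  # the ball was grasped

removed = identify_removed(before, after, objects)
print(removed)
```

The same difference vector could serve as a query for goal-directed grasping: compare it against embeddings of candidate objects and pick the closest match.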

Robotics, Artificial Intelligence, Computer Vision and Pattern Recognition

Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning

151 - Andy Zeng, Shuran Song, Stefan Welker 2018

Skilled robotic manipulation benefits from complex synergies between non-prehensile (e.g. pushing) and prehensile (e.g. grasping) actions: pushing can help rearrange cluttered objects to make space for arms and fingers; likewise, grasping can help displace objects to make pushing movements more precise and collision-free. In this work, we demonstrate that it is possible to discover and learn these synergies from scratch through model-free deep reinforcement learning. Our method involves training two fully convolutional networks that map from visual observations to actions: one infers the utility of pushes for a dense pixel-wise sampling of end effector orientations and locations, while the other does the same for grasping. Both networks are trained jointly in a Q-learning framework and are entirely self-supervised by trial and error, where rewards are provided from successful grasps. In this way, our policy learns pushing motions that enable future grasps, while learning grasps that can leverage past pushes. During picking experiments in both simulation and real-world scenarios, we find that our system quickly learns complex behaviors amid challenging cases of clutter, and achieves better grasping success rates and picking efficiencies than baseline alternatives after only a few hours of training. We further demonstrate that our method is capable of generalizing to novel objects. Qualitative results (videos), code, pre-trained models, and simulation environments are available at http://vpg.cs.princeton.edu
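A rough sketch of how dense pixel-wise Q-values could be turned into an action: each network is assumed to output one Q-map per end-effector rotation, and the policy takes the highest value across both primitives. The nested-list layout, the toy numbers, and the discount factor are assumptions for illustration, not values from the paper.

```python
def best_action(q_maps):
    """Pick the (q, primitive, rotation, row, col) with the highest Q-value.

    q_maps maps a primitive name to a list of 2D grids, one per rotation.
    """
    best = None
    for primitive, rotations in q_maps.items():
        for r, grid in enumerate(rotations):
            for y, row in enumerate(grid):
                for x, q in enumerate(row):
                    if best is None or q > best[0]:
                        best = (q, primitive, r, y, x)
    return best


def td_target(reward, next_q_maps, gamma=0.5):
    """One-step Q-learning target: reward plus discounted best future value."""
    return reward + gamma * best_action(next_q_maps)[0]


# Toy outputs of the two networks: 2 rotations, 2x2 pixels each.
q_maps = {
    "push": [[[0.1, 0.2], [0.3, 0.1]], [[0.0, 0.4], [0.2, 0.1]]],
    "grasp": [[[0.5, 0.1], [0.2, 0.3]], [[0.1, 0.2], [0.9, 0.0]]],
}

action = best_action(q_maps)
target = td_target(1.0, q_maps)
print(action, target)
```

Here the selected action is a grasp at rotation index 1, pixel (row 1, column 0), since 0.9 is the largest entry across both sets of maps.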

Robotics, Artificial Intelligence, Computer Vision and Pattern Recognition

Learning Continuous 3D Reconstructions for Geometrically Aware Grasping

144 - Mark Van der Merwe, Qingkai Lu, Balakumar Sundaralingam 2019

Deep learning has enabled remarkable improvements in grasp synthesis for previously unseen objects from partial object views. However, existing approaches lack the ability to explicitly reason about the full 3D geometry of the object when selecting a grasp, relying on indirect geometric reasoning derived when learning grasp success networks. This abandons explicit geometric reasoning, such as avoiding undesired robot-object collisions. We propose to utilize a novel, learned 3D reconstruction to enable geometric awareness in a grasping system. We leverage the structure of the reconstruction network to learn a grasp success classifier which serves as the objective function for a continuous grasp optimization. We additionally explicitly constrain the optimization to avoid undesired contact, directly using the reconstruction. We examine the role of geometry in grasping both in the training of grasp metrics and through 96 robot grasping trials. Our results can be found at https://sites.google.com/view/reconstruction-grasp/.

Robotics, Computer Vision and Pattern Recognition, Machine Learning