Object insertion is a classic contact-rich manipulation task. The task remains challenging, especially for general objects of unknown geometry, which significantly limits the ability to understand the contact configuration between the object and the environment. We study the problem of aligning the object and environment with a tactile-based feedback insertion policy. The insertion process is modeled as an episodic policy that alternates between insertion attempts and pose corrections. We explore different mechanisms to learn such a policy based on Reinforcement Learning. The key contributions of this paper are a demonstration that it is possible to learn a tactile insertion policy that generalizes across different object geometries, and an ablation study of the key design choices for the learning agent: 1) the type of learning scheme: supervised vs. reinforcement learning; 2) the type of learning schedule: unguided vs. curriculum learning; 3) the type of sensing modality: force/torque (F/T) vs. tactile; and 4) the type of tactile representation: tactile RGB vs. tactile flow. We show that the optimal configuration of the learning agent (RL + curriculum + tactile flow) exposed to 4 training objects yields an insertion policy that inserts 4 novel objects with over 85.0% success rate and within 3 to 4 attempts. Comparisons between F/T and tactile sensing show that while an F/T-based policy learns more efficiently, a tactile-based policy provides better generalization.
We study the problem of using high-resolution tactile sensors to control the insertion of objects in a box-packing scenario. We propose a new system based on the GelSlim tactile sensor for the dense packing task. Our insertion strategy leverages tactile sensing to: 1) safely probe the box with the grasped object while monitoring incipient slip to maintain a stable grasp on the object; and 2) estimate and correct for residual position uncertainties to insert the object into a designated gap without disturbing the environment. Our proposed methodology is based on two neural networks that estimate the error direction and error magnitude from a stream of tactile imprints acquired by two GelSlim fingers during the insertion process. The system is trained on four objects with basic geometric shapes, and we show that it generalizes to four other common objects. Based on the estimated positional errors, a heuristic controller iteratively adjusts the position of the object and eventually inserts it successfully without requiring prior knowledge of the geometry of the object. The key insight is that dense tactile feedback contains useful information about the contact interaction between the grasped object and its environment. We achieve a high success rate and show that unknown objects can be inserted with an average of 6 attempts of the probe-correct loop. The method's ability to generalize to novel objects makes it a good fit for box packing in warehouse automation.
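The probe-correct loop described in the abstract above can be sketched as follows. This is a minimal simulation under stated assumptions, not the authors' controller: `probe_correct_loop`, `toy_direction`, and `toy_magnitude` are hypothetical names, the two toy functions stand in for the two neural networks that estimate error direction and error magnitude, and the 0.5 mm tolerance and 40 percent underestimation are illustrative values only.

```python
import math


def probe_correct_loop(true_offset, estimate_direction, estimate_magnitude,
                       tolerance=0.5, max_attempts=20):
    """Iterate probe -> estimate -> correct until the object fits.

    true_offset: (x, y) misalignment in mm, used only to simulate the probe.
    Returns (success, attempts_used, final_offset).
    """
    x, y = true_offset
    for attempt in range(1, max_attempts + 1):
        # Probe: in this simulation, insertion succeeds once the residual
        # misalignment is within the clearance tolerance.
        if math.hypot(x, y) <= tolerance:
            return True, attempt, (x, y)
        # Estimate the error from the (simulated) tactile signal.
        dx, dy = estimate_direction((x, y))
        mag = estimate_magnitude((x, y))
        # Correct: move against the estimated error.
        x -= mag * dx
        y -= mag * dy
    return False, max_attempts, (x, y)


def toy_direction(offset):
    """Hypothetical stand-in for the direction network: true unit vector."""
    norm = math.hypot(*offset)
    return (offset[0] / norm, offset[1] / norm)


def toy_magnitude(offset):
    """Hypothetical stand-in for the magnitude network: 40 percent too small."""
    return 0.6 * math.hypot(*offset)


success, attempts, residual = probe_correct_loop(
    (4.0, -3.0), toy_direction, toy_magnitude)
print(success, attempts)
```

Because the toy magnitude estimate is deliberately imperfect, the loop needs several corrections before the residual error drops below the tolerance, which mirrors why an iterative strategy is useful when the estimators are only approximately right.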
Robots will be expected to manipulate a wide variety of objects in complex and arbitrary ways as they become more widely used in human environments. As such, the rearrangement of objects has been noted to be an important benchmark for AI capabilities in recent years. We propose NeRP (Neural Rearrangement Planning), a deep-learning-based approach for multi-step neural object rearrangement planning that works with never-before-seen objects, is trained on simulation data, and generalizes to the real world. We compare NeRP to several naive and model-based baselines, demonstrating that our approach is measurably better and can efficiently arrange unseen objects in fewer steps and with less planning time. Finally, we demonstrate it on several challenging rearrangement problems in the real world.
A GelSight sensor uses an elastomeric slab covered with a reflective membrane to measure tactile signals. It measures 3D geometry and contact force information with high spatial resolution, and has successfully helped with many challenging robot tasks. A previous sensor, based on a semi-specular membrane, offers high resolution but limited geometric accuracy. In this paper, we describe a new design of GelSight for a robot gripper, using a Lambertian membrane and a new illumination system, which gives greatly improved geometric accuracy while retaining the compact size. We demonstrate its use in measuring surface normals and reconstructing height maps using photometric stereo. We also use it for the task of slip detection, using a combination of information about relative motions on the membrane surface and the shear distortions. Using a robotic arm and a set of 37 everyday objects with varied properties, we find that the sensor can detect translational and rotational slip in general cases, and can be used to improve the stability of the grasp.
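The surface-normal measurement mentioned in the abstract above relies on photometric stereo. The sketch below shows the textbook Lambertian formulation only, in which each pixel intensity is modeled as albedo times the dot product of the surface normal and the light direction. It assumes known, distant light directions and no shadows, and it is not the sensor's actual calibration procedure; `estimate_normals` and the synthetic light setup are hypothetical.

```python
import numpy as np


def estimate_normals(intensities, light_dirs):
    """Recover per-pixel unit normals and albedo from Lambertian shading.

    intensities: array (k, h, w), one image per light source.
    light_dirs:  array (k, 3), unit light directions (assumed known).
    """
    k, h, w = intensities.shape
    flat = intensities.reshape(k, -1)                      # (k, h*w)
    # Solve light_dirs @ g = flat in the least-squares sense,
    # where g = albedo * normal for every pixel.
    g, *_ = np.linalg.lstsq(light_dirs, flat, rcond=None)  # (3, h*w)
    albedo = np.linalg.norm(g, axis=0)
    normals = g / np.maximum(albedo, 1e-12)
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)


# Synthetic check: a tilted plane with constant albedo under three lights.
lights = np.array([[1.0, 0.0, 1.0],
                   [-0.5, 0.866, 1.0],
                   [-0.5, -0.866, 1.0]])
lights /= np.linalg.norm(lights, axis=1, keepdims=True)
n0 = np.array([0.1, 0.2, 1.0])
n0 /= np.linalg.norm(n0)
images = 0.8 * (lights @ n0)[:, None, None] * np.ones((3, 4, 5))

normals, albedo = estimate_normals(images, lights)
```

With exactly three non-coplanar lights the system is square and the solve is exact; with more lights the same least-squares call averages out noise. A height map would then be obtained by integrating the gradients implied by the normals, which is omitted here.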
We propose a new technique for pushing an unknown object from an initial configuration to a goal configuration with stability constraints. The proposed method leverages recent progress in differentiable physics models to learn unknown mechanical properties of pushed objects, such as their distributions of mass and coefficients of friction. The proposed learning technique computes the gradient of the distance between predicted poses of objects and their actual observed poses and utilizes that gradient to search for values of the mechanical properties that reduce the reality gap. The proposed approach is also utilized to optimize a policy to efficiently push an object toward the desired goal configuration. Experiments with real objects using a real robot to gather data show that the proposed approach can identify the mechanical properties of heterogeneous objects from a small number of pushing actions.
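The identification idea in the abstract above, following the gradient of the gap between predicted and observed poses to find mechanical properties, can be illustrated with a deliberately simplified toy. The sketch assumes a one-dimensional block released at speed `v0` that slides to rest under Coulomb friction, so the stopping distance is `v0**2 / (2 * mu * G)`, and it uses a hand-derived gradient in place of a differentiable physics engine. The function names, learning rate, and initial guess are hypothetical, and the "observations" are generated synthetically.

```python
G = 9.81  # gravitational acceleration, m/s^2


def predicted_slide(v0, mu):
    """Stopping distance of a block released at speed v0 under Coulomb friction."""
    return v0 ** 2 / (2.0 * mu * G)


def identify_friction(pushes, mu_init=0.6, lr=0.1, steps=300):
    """Fit the friction coefficient by gradient descent on squared distance error.

    pushes: list of (v0, observed_distance) pairs.
    """
    mu = mu_init
    for _ in range(steps):
        grad = 0.0
        for v0, observed in pushes:
            pred = predicted_slide(v0, mu)
            # d(pred)/d(mu) = -pred / mu, so the chain rule gives:
            grad += 2.0 * (pred - observed) * (-pred / mu)
        mu -= lr * grad
    return mu


# Synthetic observations generated with a known coefficient (illustrative only).
true_mu = 0.3
pushes = [(v0, predicted_slide(v0, true_mu)) for v0 in (0.5, 1.0, 1.5)]
mu_hat = identify_friction(pushes)
print(round(mu_hat, 4))
```

Only three pushes are used here, echoing the abstract's point that a small number of pushing actions can suffice. The step size matters in this toy: a much smaller initial guess for `mu` makes the early gradients large and can cause overshoot.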
Robotic touch, particularly when using soft optical tactile sensors, suffers from distortion caused by motion-dependent shear. The manner in which the sensor contacts a stimulus is entangled with the tactile information about the geometry of the stimulus. In this work, we propose a supervised convolutional deep neural network model that learns to disentangle, in the latent space, the components of sensor deformations caused by contact geometry from those due to sliding-induced shear. The approach is validated by reconstructing unsheared tactile images from sheared images and showing they match unsheared tactile images collected with no sliding motion. In addition, the unsheared tactile images give a faithful reconstruction of the contact geometry that is not possible from the sheared data, and robust estimation of the contact pose that can be used for servo control sliding around various 2D shapes. Finally, the contact geometry reconstruction, in conjunction with servo control sliding, was used for faithful full object reconstruction of various 2D shapes. The methods have broad applicability to deep learning models for robots with a shear-sensitive sense of touch.