Simulators play an important role in prototyping, debugging and benchmarking new advances in robotics and learning for control. Although many physics engines exist, some aspects of the real world are harder than others to simulate. One of the aspects that have so far eluded accurate simulation is touch sensing. To address this gap, we present TACTO -- a fast, flexible and open-source simulator for vision-based tactile sensors. This simulator can render realistic high-resolution touch readings at hundreds of frames per second, and can be easily configured to simulate different vision-based tactile sensors, including GelSight, DIGIT and OmniTact. In this paper, we detail the principles that drove the implementation of TACTO and how they are reflected in its architecture. We demonstrate TACTO on a perceptual task, by learning to predict grasp stability using touch from 1 million grasps, and on a marble manipulation control task. We believe that TACTO is a step towards the widespread adoption of touch sensing in robotic applications, and towards enabling machine learning practitioners interested in multi-modal learning and control. TACTO is open-source at https://github.com/facebookresearch/tacto.
In essence, a successful grasp boils down to correct responses to multiple contact events between fingertips and objects. In most scenarios, tactile sensing is adequate to distinguish contact events. Because tactile information is high-dimensional, classifying spatiotemporal tactile signals using conventional model-based methods is difficult. In this work, we propose to predict and classify tactile signals using deep learning methods, seeking to enhance the adaptability of the robotic grasp system to external event changes that may lead to grasping failure. We develop a deep learning framework and collect 6650 tactile image sequences with a vision-based tactile sensor, and the neural network is integrated into a contact-event-based robotic grasping system. In grasping experiments, we achieved a 52% increase in object lifting success rate with contact detection, and significantly higher robustness under unexpected loads with slip prediction, compared with open-loop grasps. These results demonstrate that integrating the proposed framework into a robotic grasping system substantially improves picking success rate and the capability to withstand external disturbances.
Simulation is widely used in robotics for system verification and large-scale data collection. However, simulating sensors, including tactile sensors, has been a long-standing challenge. In this paper, we propose Taxim, a realistic and high-speed simulation model for a vision-based tactile sensor, GelSight. A GelSight sensor uses a piece of soft elastomer as the medium of contact and embeds optical structures to capture the deformation of the elastomer, from which the geometry and forces applied at the contact surface can be inferred. We propose an example-based method for simulating GelSight: we simulate the optical response to the deformation with a polynomial look-up table. This table maps the deformed geometries to the pixel intensities sampled by the embedded camera. To simulate the motion of the surface markers caused by the surface stretch of the elastomer, we apply linear elastic deformation theory and the superposition principle. The simulation model is calibrated with fewer than 100 data points from a real sensor. The example-based approach enables the model to easily migrate to other GelSight sensors or their variations. To the best of our knowledge, our simulation framework is the first to incorporate marker motion field simulation that derives from elastomer deformation together with the optical simulation, creating a comprehensive and computationally efficient tactile simulation framework. Experiments reveal that our optical simulation has the lowest pixel-wise intensity errors compared to prior work and can run online with CPU computing.
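The idea of mapping deformed geometry to pixel intensity with a polynomial can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the Taxim implementation: the function names are hypothetical, the basis is a single global quadratic in the surface gradient (gx, gy), and the calibration data is synthetic. The coefficients are fitted by least squares from example pairs and then used to shade a height map.

```python
import numpy as np

def poly_features(gx, gy):
    """Quadratic basis in the surface gradient (hypothetical choice of basis)."""
    gx = np.ravel(np.asarray(gx, dtype=float))
    gy = np.ravel(np.asarray(gy, dtype=float))
    return np.stack([np.ones_like(gx), gx, gy, gx * gy, gx**2, gy**2], axis=1)

def fit_intensity_model(gx, gy, intensity):
    """Least-squares fit of polynomial coefficients from calibration examples."""
    features = poly_features(gx, gy)
    target = np.ravel(np.asarray(intensity, dtype=float))
    coeffs, *_ = np.linalg.lstsq(features, target, rcond=None)
    return coeffs

def render_intensity(height_map, coeffs):
    """Shade a height map by evaluating the fitted polynomial at each pixel."""
    gy, gx = np.gradient(np.asarray(height_map, dtype=float))
    return (poly_features(gx, gy) @ coeffs).reshape(np.shape(height_map))

# Synthetic calibration set standing in for real sensor examples.
rng = np.random.default_rng(0)
cal_gx, cal_gy = rng.normal(size=80), rng.normal(size=80)
true_coeffs = np.array([0.5, 0.2, -0.1, 0.05, 0.03, -0.02])
cal_intensity = poly_features(cal_gx, cal_gy) @ true_coeffs
fitted = fit_intensity_model(cal_gx, cal_gy, cal_intensity)
```

In this sketch a flat height map renders to the constant term of the polynomial, which plays the role of the background intensity of an undeformed gel.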
Despite decades of research, general purpose in-hand manipulation remains one of the unsolved challenges of robotics. One of the contributing factors that limit current robotic manipulation systems is the difficulty of precisely sensing contact forces -- sensing and reasoning about contact forces are crucial to accurately control interactions with the environment. As a step towards enabling better robotic manipulation, we introduce DIGIT, an inexpensive, compact, and high-resolution tactile sensor geared towards in-hand manipulation. DIGIT improves upon past vision-based tactile sensors by miniaturizing the form factor to be mountable on multi-fingered hands, and by providing several design improvements that result in an easier, more repeatable manufacturing process, and enhanced reliability. We demonstrate the capabilities of the DIGIT sensor by training deep neural network model-based controllers to manipulate glass marbles in-hand with a multi-finger robotic hand. To provide the robotics community access to reliable and low-cost tactile sensors, we open-source the DIGIT design at https://digit.ml/.
Monitoring the state of contact is essential for robotic devices, especially grippers that implement gecko-inspired adhesives where intimate contact is crucial for a firm attachment. However, due to the lack of deformable sensors, few have demonstrated tactile sensing for gecko grippers. We present Viko, an adaptive gecko gripper that utilizes vision-based tactile sensors to monitor contact state. The sensor provides high-resolution real-time measurements of contact area and shear force. Moreover, the sensor is adaptive, low-cost, and compact. We integrated gecko-inspired adhesives into the sensor surface without impeding its adaptiveness and performance. Using a robotic arm, we evaluate the performance of the gripper through a series of grasping tests. The gripper has a maximum payload of 8 N even at a low fingertip pitch angle of 30 degrees. We also showcase the gripper's ability to adjust fingertip pose for better contact using sensor feedback. Further, everyday object picking is presented as a demonstration of the gripper's adaptiveness.
The distributional perspective on reinforcement learning (RL) has given rise to a series of successful Q-learning algorithms, resulting in state-of-the-art performance in arcade game environments. However, it has not yet been analyzed how these findings from a discrete setting translate to complex practical applications characterized by noisy, high-dimensional and continuous state-action spaces. In this work, we propose Quantile QT-Opt (Q2-Opt), a distributional variant of the recently introduced distributed Q-learning algorithm for continuous domains, and examine its behaviour in a series of simulated and real vision-based robotic grasping tasks. The absence of an actor in Q2-Opt allows us to directly draw a parallel to the previous discrete experiments in the literature without the additional complexities induced by an actor-critic architecture. We demonstrate that Q2-Opt achieves a superior vision-based object grasping success rate, while also being more sample efficient. The distributional formulation also allows us to experiment with various risk distortion metrics that give us an indication of how robots can concretely manage risk in practice using a Deep RL control policy. As an additional contribution, we perform batch RL experiments in our virtual environment and compare them with the latest findings from discrete settings. Surprisingly, we find that the previous batch RL findings from the literature obtained on arcade game environments do not generalise to our setup.
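To make the link between a quantile-based value distribution and risk management concrete, here is a minimal sketch under stated assumptions. It is not the Q2-Opt implementation: the function names are hypothetical, the candidate actions are a small discrete set rather than the output of a continuous optimizer, and CVaR (the mean of the lowest alpha-fraction of quantile estimates) is used as one common risk distortion, which may differ from the metrics studied in the paper.

```python
import numpy as np

def cvar_value(quantiles, alpha=1.0):
    """Mean of the lowest alpha-fraction of quantile estimates.

    alpha=1.0 gives the ordinary (risk-neutral) mean; smaller alpha is more risk-averse.
    """
    q = np.sort(np.asarray(quantiles, dtype=float), axis=-1)
    k = max(1, int(np.ceil(alpha * q.shape[-1])))
    return q[..., :k].mean(axis=-1)

def select_action(quantiles_per_action, alpha=1.0):
    """Index of the candidate action with the highest distorted value."""
    return int(np.argmax(cvar_value(quantiles_per_action, alpha)))

# Two hypothetical candidate grasps, each with four quantile estimates of return.
candidates = np.array([
    [-1.0, 0.0, 1.0, 4.0],   # higher mean, but a bad worst case
    [0.5, 0.6, 0.7, 0.8],    # lower mean, consistently acceptable
])
risk_neutral_choice = select_action(candidates, alpha=1.0)
risk_averse_choice = select_action(candidates, alpha=0.25)
```

With these made-up numbers the risk-neutral rule prefers the first candidate and the risk-averse rule prefers the second, which illustrates how a distortion of the same learned distribution can change the executed action.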