Robotics and computer vision problems commonly require handling rigid-body motions comprising translation and rotation, together referred to as pose. In some situations, a vectorial parameterization of pose can be useful, where elements of a vector space are surjectively mapped to a matrix Lie group. For example, these vectorial representations can be employed for optimization as well as uncertainty representation on groups. The most common mapping is the matrix exponential, which maps elements of a Lie algebra onto the associated Lie group. However, this choice is not unique. It has been previously shown how to characterize all such vectorial parameterizations for SO(3), the group of rotations. Some results are also known for the group of poses, where it is possible to build a family of vectorial mappings that includes the matrix exponential as well as the Cayley transformation. We extend what is known for these pose mappings to the 4 × 4 representation common in robotics, and also demonstrate the proposed pose mappings in three examples: (i) pose interpolation, (ii) pose servoing control, and (iii) pose estimation in a pointcloud alignment problem. In the pointcloud alignment problem, our results lead to a new algorithm based on the Cayley transformation, which we call CayPer.
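To make the contrast between the two mappings concrete, the following is a minimal sketch for the rotation group SO(3) only, not the 4 × 4 pose mappings developed in the paper. The function names (`skew`, `exp_so3`, `cayley_so3`) and the choice of scaling the Cayley argument by one half are assumptions made for illustration; with that scaling the two maps agree to second order near the identity.

```python
import numpy as np

def skew(v):
    """Return the 3x3 skew-symmetric matrix of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def exp_so3(phi):
    """Matrix exponential of skew(phi), via the Rodrigues formula."""
    angle = np.linalg.norm(phi)
    K = skew(phi)
    if angle < 1e-8:
        # Second-order Taylor expansion avoids dividing by a tiny angle.
        return np.eye(3) + K + 0.5 * K @ K
    a = np.sin(angle) / angle
    b = (1.0 - np.cos(angle)) / angle**2
    return np.eye(3) + a * K + b * K @ K

def cayley_so3(phi):
    """Cayley transformation (I - K/2)^{-1} (I + K/2) with K = skew(phi)."""
    K = skew(phi)
    I = np.eye(3)
    return np.linalg.solve(I - 0.5 * K, I + 0.5 * K)

phi = np.array([0.3, -0.2, 0.5])
R_exp = exp_so3(phi)
R_cay = cayley_so3(phi)
# Both results are rotation matrices, but they differ for finite phi.
print(np.linalg.norm(R_exp - R_cay))
```

Both functions return valid rotation matrices about the same axis. They differ in how the vector's length maps to the rotation angle: the exponential map uses the angle directly, while this Cayley variant rotates by twice the arctangent of half the length.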
Achieving human-like motion in robots has been a fundamental goal in many areas of robotics research. Inverse kinematic (IK) solvers have been explored as a solution to provide kinematic structures with anthropomorphic movements. In particular, numeric solvers based on geometry, such as FABRIK, have shown potential for producing human-like motion at a low computational cost. Nevertheless, these methods have shown limitations when solving for robot kinematic constraints. This work proposes a framework inspired by FABRIK for human pose imitation in real time. The goal is to mitigate the problems of the original algorithm while retaining the resulting human-like fluidity and low cost. We first propose a human constraint model for pose imitation. Then, we present a pose imitation algorithm (PIC) and its soft version (PICs), which can successfully imitate human poses using the proposed constraint system. PIC was tested on two collaborative robots (Baxter and YuMi). Fifty human demonstrations were collected for a bi-manual assembly and an incision task. Then, two performance metrics were obtained for both robots: pose accuracy with respect to the human and the percentage of environment occlusion/obstruction. The performance of PIC and PICs was compared against the numerical solver baseline (FABRIK). The proposed algorithms achieve a higher pose accuracy than FABRIK for both tasks (25%-FABRIK, 53%-PICs, 58%-PICs). In addition, PIC and its soft version achieve a lower percentage of occlusion during incision (10%-FABRIK, 4%-PICs, 9%-PICs). These results indicate that the PIC method can reproduce human poses and achieve key desired effects of human imitation.
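For readers unfamiliar with the baseline, the following is a minimal sketch of the basic, unconstrained FABRIK iteration (alternating backward and forward reaching passes over a joint chain). It is not the PIC or PICs algorithm and includes none of the human constraint model; the function name `fabrik`, its parameters, and the example chain are assumptions for illustration. The sketch also assumes no two adjacent joints coincide during iteration, since that would cause a division by zero.

```python
import numpy as np

def fabrik(joints, target, tol=1e-4, max_iter=100):
    """Move the chain's end effector toward target, keeping link lengths."""
    joints = np.asarray(joints, dtype=float).copy()
    target = np.asarray(target, dtype=float)
    lengths = np.linalg.norm(np.diff(joints, axis=0), axis=1)
    root = joints[0].copy()

    # Unreachable target: stretch the chain straight toward it.
    to_target = target - root
    if np.linalg.norm(to_target) > lengths.sum():
        direction = to_target / np.linalg.norm(to_target)
        for i in range(len(lengths)):
            joints[i + 1] = joints[i] + lengths[i] * direction
        return joints

    for _ in range(max_iter):
        if np.linalg.norm(joints[-1] - target) < tol:
            break
        # Backward pass: pin the end effector to the target, work to the root.
        joints[-1] = target
        for i in range(len(joints) - 2, -1, -1):
            d = joints[i] - joints[i + 1]
            joints[i] = joints[i + 1] + lengths[i] * d / np.linalg.norm(d)
        # Forward pass: pin the root back in place, work to the end effector.
        joints[0] = root
        for i in range(len(joints) - 1):
            d = joints[i + 1] - joints[i]
            joints[i + 1] = joints[i] + lengths[i] * d / np.linalg.norm(d)
    return joints

chain = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
solved = fabrik(chain, [1.5, 1.5])
print(solved[-1])  # end-effector position after iteration
```

Each pass only repositions points along lines between neighbouring joints, which is why the method is cheap; joint limits and other robot constraints are not represented in this basic form.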
This paper is a study of 2D manipulation without sensing and planning, exploring the effects of unplanned randomized action sequences on 2D object pose uncertainty. Our approach follows the work of Erdmann and Mason's sensorless reorienting of an object into a completely determined pose, regardless of its initial pose. While Erdmann and Mason proposed a method using Newtonian mechanics, this paper shows that under some circumstances, a long enough sequence of random actions will also converge toward a determined final pose of the object. This is verified through several simulation and real robot experiments where randomized action sequences are shown to reduce entropy of the object pose distribution. The effects of varying object shapes, action sequences, and surface friction are also explored.
This work provides a theoretical framework for the pose estimation problem using total least squares for vector observations from landmark features. First, the optimization framework is formulated for the pose estimation problem with observation vectors extracted from point cloud features. Then, error-covariance expressions are derived. The attitude and position solutions obtained via the derived optimization framework are proven to reach the Cramér-Rao lower bound under the small angle approximation of attitude errors. The measurement data for the simulation of this problem is provided through a series of vector observation scans, and a fully populated observation noise-covariance matrix is assumed as the weight in the cost function to cover the most general case of sensor uncertainty. Here, previous derivations are expanded for the pose estimation problem to include more generic cases of correlations in the errors than previous cases, which involved an isotropic noise assumption. The proposed solution is simulated in a Monte-Carlo framework with 10,000 samples to validate the error-covariance analysis.
Globally localizing in a given map is a crucial ability for robots to perform a wide range of autonomous navigation tasks. This paper presents OneShot, a global localization algorithm that uses only a single 3D LiDAR scan at a time, while outperforming approaches based on integrating a sequence of point clouds. Our approach, which does not require the robot to move, relies on learning-based descriptors of point cloud segments and computes the full 6 degree-of-freedom pose in a map. The segments are extracted from the current LiDAR scan and are matched against a database using the computed descriptors. Candidate matches are then verified with a geometric consistency test. We additionally present a strategy to further improve the performance of the segment descriptors by augmenting them with visual information provided by a camera. For this purpose, a custom-tailored neural network architecture is proposed. We demonstrate that our LiDAR-only approach outperforms a state-of-the-art baseline on a sequence of the KITTI dataset and also evaluate its performance on the challenging NCLT dataset. Finally, we show that fusing in visual information boosts segment retrieval rates by up to 26% compared to LiDAR-only description.
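The abstract does not specify how the geometric consistency test works, so the following is only one common way such a check can be sketched, not the authors' method. It assumes each candidate match pairs a segment centroid in the scan with one in the map, and keeps the matches whose pairwise centroid distances agree with those of the best-supported match. The function name `consistent_matches` and the threshold `eps` are hypothetical.

```python
import numpy as np

def consistent_matches(scan_pts, map_pts, eps=0.4):
    """Return indices of candidate matches consistent with the best seed.

    scan_pts[i] and map_pts[i] are the centroids paired by match i.
    Two matches are compatible if the distance between their scan
    centroids equals the distance between their map centroids, up to eps.
    """
    scan_pts = np.asarray(scan_pts, dtype=float)
    map_pts = np.asarray(map_pts, dtype=float)
    d_scan = np.linalg.norm(scan_pts[:, None] - scan_pts[None, :], axis=-1)
    d_map = np.linalg.norm(map_pts[:, None] - map_pts[None, :], axis=-1)
    compatible = np.abs(d_scan - d_map) < eps
    # Seed with the match that is compatible with the most other matches.
    seed = int(np.argmax(compatible.sum(axis=1)))
    return np.flatnonzero(compatible[seed])

scan = np.array([[0, 0, 0], [5, 0, 0], [0, 5, 0], [0, 0, 5], [3, 3, 3]], float)
R = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]], float)  # 90 degrees about z
t = np.array([10.0, -2.0, 1.0])
map_side = scan @ R.T + t
map_side[4] += np.array([20.0, 0.0, 0.0])  # corrupt the last match
print(consistent_matches(scan, map_side))
```

This greedy version only guarantees compatibility with the seed match, not mutual compatibility among all returned matches; a stricter variant would search for a maximum clique in the compatibility graph. Rigid motions preserve distances, which is why the check needs no pose estimate beforehand.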
Recent robotic manipulation competitions have highlighted that sophisticated robots still struggle to achieve fast and reliable perception of task-relevant objects in complex, realistic scenarios. To improve these systems' perceptive speed and robustness, we present SegICP, a novel integrated solution to object recognition and pose estimation. SegICP couples convolutional neural networks and multi-hypothesis point cloud registration to achieve both robust pixel-wise semantic segmentation and accurate, real-time 6-DOF pose estimation for relevant objects. Our architecture achieves 1 cm position error and $<5^\circ$ angle error in real time without an initial seed. We evaluate and benchmark SegICP against an annotated dataset generated by motion capture.