
Planning Multimodal Exploratory Actions for Online Robot Attribute Learning

Added by Xiaohan Zhang
Publication date: 2021
Language: English





Robots frequently need to perceive object attributes, such as red, heavy, and empty, using multimodal exploratory actions, such as look, lift, and shake. Robot attribute learning algorithms aim to learn an observation model for each perceivable attribute given an exploratory action. Once the attribute models are learned, they can be used to identify attributes of new objects, answering questions such as "Is this object red and empty?" Attribute learning and attribute identification have been treated as two separate problems in the literature. In this paper, we first define a new problem called online robot attribute learning (On-RAL), in which the robot works on attribute learning and attribute identification simultaneously. We then develop an algorithm called information-theoretic reward shaping (ITRS) that actively addresses the trade-off between exploration and exploitation in On-RAL problems. ITRS was compared with competitive robot attribute learning baselines, and experimental results demonstrate its superiority in learning efficiency and identification accuracy.
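The abstract does not give ITRS pseudocode, but the core idea of shaping the action-selection reward with expected information gain can be sketched. In the minimal sketch below, each (attribute, action) pair gets a Beta posterior over a binary observation model; the robot picks the exploratory action whose expected entropy reduction, net of an execution cost, is largest, and updates the posterior online. The Beta-Bernoulli model, the action costs, and all function names are illustrative assumptions, not the authors' implementation.

```python
from scipy.special import betaln, digamma

def beta_entropy(a, b):
    """Differential entropy of a Beta(a, b) distribution."""
    return (betaln(a, b) - (a - 1) * digamma(a) - (b - 1) * digamma(b)
            + (a + b - 2) * digamma(a + b))

def expected_info_gain(a, b):
    """Expected entropy drop of Beta(a, b) after one Bernoulli observation."""
    p = a / (a + b)  # predictive probability of a positive observation
    return beta_entropy(a, b) - (p * beta_entropy(a + 1, b)
                                 + (1 - p) * beta_entropy(a, b + 1))

# posterior[(attribute, action)] holds Beta pseudo-counts (positives, negatives).
posterior = {("red", "look"): [1.0, 1.0], ("heavy", "lift"): [1.0, 1.0]}
cost = {"look": 0.1, "lift": 1.0}  # illustrative per-action execution costs

def select_action():
    """Pick the exploratory action with the best shaped reward: gain minus cost."""
    def shaped(action):
        gains = [expected_info_gain(a, b)
                 for (_, act), (a, b) in posterior.items() if act == action]
        return max(gains, default=0.0) - cost[action]
    return max(cost, key=shaped)

def update(attribute, action, positive):
    """Online Bayesian update of one attribute's observation model."""
    a, b = posterior[(attribute, action)]
    posterior[(attribute, action)] = [a + positive, b + (1 - positive)]
```

As the posteriors sharpen, the expected information gain shrinks, so the shaped reward naturally shifts the robot from exploration toward exploiting what it already knows.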


Related research

A defining feature of sampling-based motion planning is the reliance on an implicit representation of the state space, which is enabled by a set of probing samples. Traditionally, these samples are drawn either probabilistically or deterministically to uniformly cover the state space. Yet, the motion of many robotic systems is often restricted to small regions of the state space, due to, for example, differential constraints or collision-avoidance constraints. To accelerate the planning process, it is thus desirable to devise non-uniform sampling strategies that favor sampling in those regions where an optimal solution might lie. This paper proposes a methodology for non-uniform sampling, whereby a sampling distribution is learned from demonstrations, and then used to bias sampling. The sampling distribution is computed through a conditional variational autoencoder, allowing sample generation from the latent space conditioned on the specific planning problem. This methodology is general, can be used in combination with any sampling-based planner, and can effectively exploit the underlying structure of a planning problem while maintaining the theoretical guarantees of sampling-based approaches. Specifically, on several planning problems, the proposed methodology is shown to effectively learn representations for the relevant regions of the state space, resulting in an order of magnitude improvement in terms of success rate and convergence to the optimal cost.
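As a rough illustration of the learned-sampling step, the sketch below assumes a CVAE decoder has already been trained on demonstrations; at planning time, latent vectors drawn from the prior are decoded conditioned on the planning problem and mixed with uniform samples so the planner retains its theoretical guarantees. The architecture, dimensions, and mixing ratio are assumptions, not the paper's exact setup.

```python
import torch
import torch.nn as nn

class Decoder(nn.Module):
    """Maps a latent code z and a problem encoding c to a state-space sample."""
    def __init__(self, z_dim=3, c_dim=16, state_dim=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + c_dim, 128), nn.ReLU(),
            nn.Linear(128, 128), nn.ReLU(),
            nn.Linear(128, state_dim),
        )

    def forward(self, z, c):
        return self.net(torch.cat([z, c], dim=-1))

def sample_states(decoder, condition, n, lo, hi, learned_frac=0.5):
    """Draw n samples: a learned_frac share from the CVAE, the rest uniform."""
    n_learned = int(n * learned_frac)
    z = torch.randn(n_learned, 3)        # latent prior N(0, I)
    c = condition.expand(n_learned, -1)  # same planning problem for every sample
    learned = decoder(z, c)
    uniform = lo + (hi - lo) * torch.rand(n - n_learned, lo.numel())
    return torch.cat([learned, uniform], dim=0)
```

Keeping a uniform fraction in the mix is what preserves probabilistic completeness: the learned distribution only biases where samples land, it never excludes any region outright.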
Small variance asymptotics is emerging as a useful technique for inference in large scale Bayesian non-parametric mixture models. This paper analyses the online learning of robot manipulation tasks with Bayesian non-parametric mixture models under small variance asymptotics. The analysis yields a scalable online sequence clustering (SOSC) algorithm that is non-parametric in the number of clusters and the subspace dimension of each cluster. SOSC groups the new datapoint in its low dimensional subspace by online inference in a non-parametric mixture of probabilistic principal component analyzers (MPPCA) based on Dirichlet process, and captures the state transition and state duration information online in a hidden semi-Markov model (HSMM) based on hierarchical Dirichlet process. A task-parameterized formulation of our approach autonomously adapts the model to changing environmental situations during manipulation. We apply the algorithm in a teleoperation setting to recognize the intention of the operator and remotely adjust the movement of the robot using the learned model. The generative model is used to synthesize both time-independent and time-dependent behaviours by relying on the principles of shared and autonomous control. Experiments with the Baxter robot yield parsimonious clusters that adapt online with new demonstrations and assist the operator in performing remote manipulation tasks.
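The full SOSC model (DP-based MPPCA plus an HDP-HSMM) is involved, but the clustering rule that small variance asymptotics yields reduces to a simple online update. The DP-means-style sketch below illustrates just that rule, under assumed Euclidean distances and an illustrative penalty: assign each new datapoint to its nearest cluster mean unless the squared distance exceeds a penalty, in which case open a new cluster.

```python
import numpy as np

class OnlineDPMeans:
    def __init__(self, lam):
        self.lam = lam  # cluster-creation penalty from the asymptotic analysis
        self.means, self.counts = [], []

    def update(self, x):
        """Assign x online; returns the cluster index (possibly a new one)."""
        x = np.asarray(x, dtype=float)
        if self.means:
            d2 = [np.sum((x - m) ** 2) for m in self.means]
            k = int(np.argmin(d2))
            if d2[k] <= self.lam:
                self.counts[k] += 1  # streaming running-mean update
                self.means[k] += (x - self.means[k]) / self.counts[k]
                return k
        self.means.append(x.copy())  # too far from every cluster: open a new one
        self.counts.append(1)
        return len(self.means) - 1
```

The number of clusters is never fixed in advance; it grows only when a demonstration produces data that no existing cluster explains well, which is what lets the model adapt online.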
For robots to coexist with humans in a social world like ours, it is crucial that they possess human-like social interaction skills. Programming a robot to possess such skills is a challenging task. In this paper, we propose a Multimodal Deep Q-Network (MDQN) to enable a robot to learn human-like interaction skills through a trial and error method. This paper aims to develop a robot that gathers data during its interaction with a human and learns human interaction behaviour from the high-dimensional sensory information using end-to-end reinforcement learning. This paper demonstrates that the robot was able to learn basic interaction skills successfully, after 14 days of interacting with people.
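A minimal sketch of what a multimodal Q-network can look like is given below, assuming two image modalities (e.g., grayscale and depth) processed by separate convolutional streams and fused before the action-value head; the layer sizes and fusion scheme are assumptions, not the paper's exact MDQN architecture.

```python
import torch
import torch.nn as nn

class MDQN(nn.Module):
    """Two convolutional streams, one per modality, fused into Q-values."""
    def __init__(self, n_actions=4):
        super().__init__()
        def stream():
            return nn.Sequential(
                nn.Conv2d(1, 16, 8, stride=4), nn.ReLU(),
                nn.Conv2d(16, 32, 4, stride=2), nn.ReLU(),
                nn.Flatten(),
            )
        self.gray, self.depth = stream(), stream()
        self.head = nn.Sequential(
            nn.LazyLinear(256), nn.ReLU(),  # infers the fused feature size
            nn.Linear(256, n_actions),      # one Q-value per interaction action
        )

    def forward(self, gray_img, depth_img):
        fused = torch.cat([self.gray(gray_img), self.depth(depth_img)], dim=-1)
        return self.head(fused)

# Greedy action selection from the learned Q-values:
# q = MDQN()(gray_batch, depth_batch); action = q.argmax(dim=-1)
```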
Mixed-integer convex programming (MICP) has seen significant algorithmic and hardware improvements with several orders of magnitude solve time speedups compared to 25 years ago. Despite these advances, MICP has been rarely applied to real-world robotic control because the solution times are still too slow for online applications. In this work, we extend the machine learning optimizer (MLOPT) framework to solve MICPs arising in robotics at very high speed. MLOPT encodes the combinatorial part of the optimal solution into a strategy. Using data collected from offline problem solutions, we train a multiclass classifier to predict the optimal strategy given problem-specific parameters such as states or obstacles. Compared to previous approaches, we use task-specific strategies and prune redundant ones to significantly reduce the number of classes the predictor has to select from, thereby greatly improving scalability. Given the predicted strategy, the control task becomes a small convex optimization problem that we can solve in milliseconds. Numerical experiments on a cart-pole system with walls, a free-flying space robot and task-oriented grasps show that our method provides not only 1 to 2 orders of magnitude speedups compared to state-of-the-art solvers but also performance close to the globally optimal MICP solution.
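The pipeline can be sketched in two phases: offline, a classifier learns to map problem parameters to the index of a known-good strategy (a fixed assignment of the integer variables); online, the predicted strategy pins the binaries and leaves only a small convex program. Everything below, including the toy problem, the k-NN classifier, and the placeholder training data, is illustrative, not the MLOPT codebase.

```python
import numpy as np
import cvxpy as cp
from sklearn.neighbors import KNeighborsClassifier

# Offline: a dataset of (problem parameters, index of the optimal strategy),
# here filled with random placeholders for illustration only.
params = np.random.rand(200, 4)              # e.g., states / obstacle encodings
strategy_ids = np.random.randint(0, 5, 200)  # placeholder labels
clf = KNeighborsClassifier(n_neighbors=5).fit(params, strategy_ids)

# Strategy table: each entry fixes the binary variables of the MICP.
strategies = {i: np.random.randint(0, 2, 3) for i in range(5)}

def solve_online(theta):
    """Predict a strategy, then solve the remaining convex problem."""
    binaries = strategies[int(clf.predict(theta.reshape(1, -1))[0])]
    x = cp.Variable(3)
    # Toy convex stand-in: track theta[:3], with the fixed binary pattern
    # switching box constraints on or off.
    cons = [x[j] <= (1.0 if binaries[j] else 0.5) for j in range(3)]
    prob = cp.Problem(cp.Minimize(cp.sum_squares(x - theta[:3])), cons)
    prob.solve()
    return x.value
```

Because the combinatorial search is replaced by a single classifier evaluation, the online cost is dominated by the small convex solve, which is what makes millisecond-scale control plausible.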
In this work, we present a multimodal system for active robot-object interaction using laser-based SLAM, RGBD images, and contact sensors. In the object manipulation task, the robot adjusts its initial pose with respect to obstacles and target objects using RGBD data, so that it can perform object grasping in different configuration spaces while avoiding collisions, and it updates the information related to the last steps of the manipulation process using the contact sensors in its hand. We perform a series of experiments to evaluate the performance of the proposed system following the RoboCup 2018 international competition regulations. We compare our approach with a number of baselines, namely a no-feedback method and visual-only and tactile-only feedback methods; our proposed visual-and-tactile feedback method performs best.
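The two-stage feedback idea can be sketched as a simple control loop: a visual stage that refines the pre-grasp pose from RGBD data, followed by a tactile stage that closes the gripper until contact sensing confirms a stable grasp. Every robot interface below (`read_rgbd`, `read_contacts`, and so on) is a hypothetical placeholder for the robot's actual APIs.

```python
def grasp_with_feedback(robot, target):
    # Visual stage: refine the base/arm pose from RGBD before reaching,
    # avoiding obstacles and aligning to the target (hypothetical interfaces).
    while not robot.pose_ok(target):
        rgbd = robot.read_rgbd()
        robot.adjust_pose(rgbd, target)

    # Tactile stage: close the gripper step by step until the contact
    # sensors in the hand report a stable grasp.
    robot.reach(target)
    while not robot.read_contacts().stable:
        robot.close_gripper_step()
    return robot.lift(target)
```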