ترغب بنشر مسار تعليمي؟ اضغط هنا

Small Variance Asymptotics for Non-Parametric Online Robot Learning

101   0   0.0 ( 0 )
 نشر من قبل Ajay Tanwani
 تاريخ النشر 2016
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

Small variance asymptotics is emerging as a useful technique for inference in large scale Bayesian non-parametric mixture models. This paper analyses the online learning of robot manipulation tasks with Bayesian non-parametric mixture models under small variance asymptotics. The analysis yields a scalable online sequence clustering (SOSC) algorithm that is non-parametric in the number of clusters and the subspace dimension of each cluster. SOSC groups the new datapoint in its low dimensional subspace by online inference in a non-parametric mixture of probabilistic principal component analyzers (MPPCA) based on Dirichlet process, and captures the state transition and state duration information online in a hidden semi-Markov model (HSMM) based on hierarchical Dirichlet process. A task-parameterized formulation of our approach autonomously adapts the model to changing environmental situations during manipulation. We apply the algorithm in a teleoperation setting to recognize the intention of the operator and remotely adjust the movement of the robot using the learned model. The generative model is used to synthesize both time-independent and time-dependent behaviours by relying on the principles of shared and autonomous control. Experiments with the Baxter robot yield parsimonious clusters that adapt online with new demonstrations and assist the operator in performing remote manipulation tasks.



قيم البحث

اقرأ أيضاً

Robots frequently need to perceive object attributes, such as red, heavy, and empty, using multimodal exploratory actions, such as look, lift, and shake. Robot attribute learning algorithms aim to learn an observation model for each perceivable attri bute given an exploratory action. Once the attribute models are learned, they can be used to identify attributes of new objects, answering questions, such as Is this object red and empty? Attribute learning and identification are being treated as two separate problems in the literature. In this paper, we first define a new problem called online robot attribute learning (On-RAL), where the robot works on attribute learning and attribute identification simultaneously. Then we develop an algorithm called information-theoretic reward shaping (ITRS) that actively addresses the trade-off between exploration and exploitation in On-RAL problems. ITRS was compared with competitive robot attribute learning baselines, and experimental results demonstrate ITRS superiority in learning efficiency and identification accuracy.
Topic models have emerged as fundamental tools in unsupervised machine learning. Most modern topic modeling algorithms take a probabilistic view and derive inference algorithms based on Latent Dirichlet Allocation (LDA) or its variants. In contrast, we study topic modeling as a combinatorial optimization problem, and propose a new objective function derived from LDA by passing to the small-variance limit. We minimize the derived objective by using ideas from combinatorial optimization, which results in a new, fast, and high-quality topic modeling algorithm. In particular, we show that our results are competitive with popular LDA-based topic modeling approaches, and also discuss the (dis)similarities between our approach and its probabilistic counterparts.
When a robot performs a task next to a human, physical interaction is inevitable: the human might push, pull, twist, or guide the robot. The state-of-the-art treats these interactions as disturbances that the robot should reject or avoid. At best, th ese robots respond safely while the human interacts; but after the human lets go, these robots simply return to their original behavior. We recognize that physical human-robot interaction (pHRI) is often intentional -- the human intervenes on purpose because the robot is not doing the task correctly. In this paper, we argue that when pHRI is intentional it is also informative: the robot can leverage interactions to learn how it should complete the rest of its current task even after the person lets go. We formalize pHRI as a dynamical system, where the human has in mind an objective function they want the robot to optimize, but the robot does not get direct access to the parameters of this objective -- they are internal to the human. Within our proposed framework human interactions become observations about the true objective. We introduce approximations to learn from and respond to pHRI in real-time. We recognize that not all human corrections are perfect: often users interact with the robot noisily, and so we improve the efficiency of robot learning from pHRI by reducing unintended learning. Finally, we conduct simulations and user studies on a robotic manipulator to compare our proposed approach to the state-of-the-art. Our results indicate that learning from pHRI leads to better task performance and improved human satisfaction.
In this paper, we consider the dynamic multi-robot distribution problem where a heterogeneous group of networked robots is tasked to spread out and simultaneously move towards multiple moving task areas while maintaining connectivity. The heterogenei ty of the system is characterized by various categories of units and each robot carries different numbers of units per category representing heterogeneous capabilities. Every task area with different importance demands a total number of units contributed by all of the robots within its area. Moreover, we assume the importance and the total number of units requested from each task area is initially unknown. The robots need first to explore, i.e., reach those areas, and then be allocated to the tasks so to fulfill the requirements. The multi-robot distribution problem is formulated as designing controllers to distribute the robots that maximize the overall task fulfillment while minimizing the traveling costs in presence of connectivity constraints. We propose a novel connectivity-aware multi-robot redistribution approach that accounts for dynamic task allocation and connectivity maintenance for a heterogeneous robot team. Such an approach could generate sub-optimal robot controllers so that the amount of total unfulfilled requirements of the tasks weighted by their importance is minimized and robots stay connected at all times. Simulation and numerical results are provided to demonstrate the effectiveness of the proposed approaches.
Legged robot locomotion requires the planning of stable reference trajectories, especially while traversing uneven terrain. The proposed trajectory optimization framework is capable of generating dynamically stable base and footstep trajectories for multiple steps. The locomotion task can be defined with contact locations, base motion or both, making the algorithm suitable for multiple scenarios (e.g., presence of moving obstacles). The planner uses a simplified momentum-based task space model for the robot dynamics, allowing computation times that are fast enough for online replanning.This fast planning capabilitiy also enables the quadruped to accommodate for drift and environmental changes. The algorithm is tested on simulation and a real robot across multiple scenarios, which includes uneven terrain, stairs and moving obstacles. The results show that the planner is capable of generating stable trajectories in the real robot even when a box of 15 cm height is placed in front of its path at the last moment.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا