ترغب بنشر مسار تعليمي؟ اضغط هنا

Inspired by sensorimotor theory, we propose a novel pipeline for voice-controlled robots. Previous work relies on explicit labels of sounds and images as well as extrinsic reward functions. Not only do such approaches have little resemblance to human sensorimotor development, but also require hand-tuning rewards and extensive human labor. To address these problems, we learn a representation that associates images and sound commands with minimal supervision. Using this representation, we generate an intrinsic reward function to learn robotic tasks with reinforcement learning. We demonstrate our approach on three robot platforms, a TurtleBot3, a Kuka-IIWA arm, and a Kinova Gen3 robot, which hear a command word, identify the associated target object, and perform precise control to approach the target. We show that our method outperforms previous work across various sound types and robotic tasks empirically. We successfully deploy the policy learned in simulator to a real-world Kinova Gen3.
Effective human-vehicle collaboration requires an appropriate un-derstanding of vehicle behavior for safety and trust. Improvingon our prior work by adding a future prediction module, we in-troduce our framework, calledAutoPreview, to enable humans t opreview autopilot behaviors prior to direct interaction with thevehicle. Previewing autopilot behavior can help to ensure smoothhuman-vehicle collaboration during the initial exploration stagewith the vehicle. To demonstrate its practicality, we conducted acase study on human-vehicle collaboration and built a prototypeof our framework with the CARLA simulator. Additionally, weconducted a between-subject control experiment (n=10) to studywhether ourAutoPreviewframework can provide a deeper under-standing of autopilot behavior compared to direct interaction. Ourresults suggest that theAutoPreviewframework does, in fact, helpusers understand autopilot behavior and develop appropriate men-tal models
Given a stochastic dynamical system modelled via stochastic differential equations (SDEs), we evaluate the safety of the system through characterisations of its exit time moments. We lift the (possibly nonlinear) dynamics into the space of the occupa tion and exit measures to obtain a set of linear evolution equations which depend on the infinitesimal generator of the SDE. Coupled with appropriate semidefinite positive matrix constraints, this yields a moment-based approach for the computation of exit time moments of SDEs with polynomial drift and diffusion dynamics. To extend the capability of the moment approach, we propose a state augmentation method which allows us to generate the evolution equations for a broader class of nonlinear stochastic systems and apply the moment method to previously unsupported dynamics. In particular, we show a general augmentation strategy for sinusoidal dynamics which can be found in most physical systems. We employ the methodology on an Ornstein-Uhlenbeck process and stochastic spring-mass-damper model to characterise their safety via their expected exit times and show the additional exit distribution insights that are afforded through higher order moments.
While imitation learning is often used in robotics, the approach frequently suffers from data mismatch and compounding errors. DAgger is an iterative algorithm that addresses these issues by aggregating training data from both the expert and novice p olicies, but does not consider the impact of safety. We present a probabilistic extension to DAgger, which attempts to quantify the confidence of the novice policy as a proxy for safety. Our method, EnsembleDAgger, approximates a Gaussian Process using an ensemble of neural networks. Using the variance as a measure of confidence, we compute a decision rule that captures how much we doubt the novice, thus determining when it is safe to allow the novice to act. With this approach, we aim to maximize the novices share of actions, while constraining the probability of failure. We demonstrate improved safety and learning performance compared to other DAgger variants and classic imitation learning on an inverted pendulum and in the MuJoCo HalfCheetah environment.
While imitation learning is becoming common practice in robotics, this approach often suffers from data mismatch and compounding errors. DAgger is an iterative algorithm that addresses these issues by continually aggregating training data from both t he expert and novice policies, but does not consider the impact of safety. We present a probabilistic extension to DAgger, which uses the distribution over actions provided by the novice policy, for a given observation. Our method, which we call DropoutDAgger, uses dropout to train the novice as a Bayesian neural network that provides insight to its confidence. Using the distribution over the novices actions, we estimate a probabilistic measure of safety with respect to the expert action, tuned to balance exploration and exploitation. The utility of this approach is evaluated on the MuJoCo HalfCheetah and in a simple driving experiment, demonstrating improved performance and safety compared to other DAgger variants and classic imitation learning.
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا