A Successive-Elimination Approach to Adaptive Robotic Sensing


Published by: David Fridovich-Keil

Publication date: 2018

Research field: Informatics Engineering

Language: English

Authors: Esther Rolf, David Fridovich-Keil, Max Simchowitz


We study an adaptive source seeking problem, in which a mobile robot must identify the strongest emitter(s) of a signal in an environment with background emissions. Background signals may be highly heterogeneous and can mislead algorithms that are based on receding horizon control. We propose AdaSearch, a general algorithm for adaptive source seeking in the face of heterogeneous background noise. AdaSearch combines global trajectory planning with principled confidence intervals in order to concentrate measurements in promising regions while guaranteeing sufficient coverage of the entire area. Theoretical analysis shows that AdaSearch confers gains over a uniform sampling strategy when the distribution of background signals is highly variable. Simulation experiments demonstrate that when applied to the problem of radioactive source seeking, AdaSearch outperforms both uniform sampling and a receding time horizon information-maximization approach based on the current literature. We also demonstrate AdaSearch in hardware, providing further evidence of its potential for real-time implementation.
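The confidence-interval idea in the abstract can be illustrated with a generic successive-elimination loop. The sketch below is not AdaSearch itself: it ignores trajectory planning entirely and assumes Gaussian measurement noise with a known scale `sigma`. The function name, the confidence radius, and the example emission rates are all illustrative assumptions rather than details from the paper.

```python
import math
import random


def successive_elimination(means, sigma=1.0, delta=0.05, max_rounds=10_000, seed=0):
    """Find the candidate with the largest mean signal (hypothetical helper).

    `means` holds the true (unknown to the algorithm) mean signal at each
    candidate location; each measurement is the mean plus Gaussian noise.
    """
    rng = random.Random(seed)
    k = len(means)
    active = set(range(k))   # candidates that could still be the strongest
    sums = [0.0] * k         # running sum of measurements per candidate
    total_samples = 0

    for t in range(1, max_rounds + 1):
        # Measure every candidate that has not been ruled out yet.
        for i in active:
            sums[i] += rng.gauss(means[i], sigma)
            total_samples += 1

        # Every active candidate has exactly t samples, so one radius fits all.
        radius = sigma * math.sqrt(2.0 * math.log(4.0 * k * t * t / delta) / t)
        est = {i: sums[i] / t for i in active}

        # Drop candidates whose upper bound falls below the best lower bound.
        best_lower = max(est.values()) - radius
        active = {i for i in active if est[i] + radius >= best_lower}

        if len(active) == 1:
            break

    return active, total_samples


if __name__ == "__main__":
    # Four background locations and one strong emitter at index 3.
    survivors, used = successive_elimination([1.0, 1.2, 0.8, 5.0, 1.1])
    print(survivors, used)
```

Measurements concentrate on candidates that remain plausible, while every candidate is sampled at least once. This mirrors, in a much simplified form, the trade-off between focused sampling and full coverage described above.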


Henry Zhu, Justin Yu, Abhishek Gupta (2020)

The success of reinforcement learning for real-world robotics has been, in many cases, limited to instrumented laboratory scenarios, often requiring arduous human effort and oversight to enable continuous learning. In this work, we discuss the elements that are needed for a robotic learning system that can continually and autonomously improve with data collected in the real world. We propose a particular instantiation of such a system, using dexterous manipulation as our case study. Subsequently, we investigate a number of challenges that come up when learning without instrumentation. In such settings, learning must be feasible without manually designed resets, using only on-board perception, and without hand-engineered reward functions. We propose simple and scalable solutions to these challenges, and then demonstrate the efficacy of our proposed system on a set of dexterous robotic manipulation tasks, providing an in-depth analysis of the challenges associated with this learning paradigm. We demonstrate that our complete system can learn without any human intervention, acquiring a variety of vision-based skills with a real-world three-fingered hand. Results and videos can be found at https://sites.google.com/view/realworld-rl/

Machine Learning, Robotics

Robotic Table Tennis with Model-Free Reinforcement Learning

Wenbo Gao, Laura Graesser, Krzysztof Choromanski (2020)

We propose a model-free algorithm for learning efficient policies capable of returning table tennis balls by controlling robot joints at a rate of 100 Hz. We demonstrate that evolutionary search (ES) methods acting on CNN-based policy architectures for non-visual inputs and convolving across time learn compact controllers leading to smooth motions. Furthermore, we show that with appropriately tuned curriculum learning on the task and rewards, policies are capable of developing multi-modal styles, specifically forehand and backhand strokes, whilst achieving an 80% return rate on a wide range of ball throws. We observe that multi-modality does not require any architectural priors, such as multi-head architectures or hierarchical policies.

Machine Learning, Robotics

A Versatile Approach to Evaluating and Testing Automated Vehicles based on Kernel Methods

Zhiyuan Huang, Yaohui Guo, Henry Lam (2017)

Evaluation and validation of complicated control systems are crucial to guarantee usability and safety. Failures usually happen in very rarely encountered situations, but once triggered, the consequences are disastrous. Accelerated Evaluation is a methodology that efficiently tests those rarely occurring yet critical failures via smartly sampled test cases. The distribution used in sampling is pivotal to the performance of the method, but building a suitable distribution requires case-by-case analysis. This paper proposes a versatile approach for constructing sampling distributions using kernel methods. The approach uses statistical learning tools to approximate the critical event sets and constructs distributions based on the unique properties of Gaussian distributions. We applied the method to evaluate automated vehicles. Numerical experiments show that the proposed approach can robustly identify rare failures and significantly reduce the evaluation time.
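To make the role of the sampling distribution concrete, here is a minimal importance-sampling sketch for a one-dimensional rare event. It does not reproduce the paper's kernel-based construction: the "failure" is simply assumed to be a standard normal variable exceeding a threshold, and the proposal is assumed to be a unit-variance Gaussian shifted to that threshold. The function names are hypothetical.

```python
import math
import random


def rare_event_prob_is(threshold, n=100_000, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Samples are drawn from the shifted proposal N(threshold, 1), where the
    rare event is common, and reweighted by the density ratio p(x) / q(x).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # N(0,1) density over N(threshold,1) density at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n


def exact_tail(threshold):
    """Closed-form Gaussian tail probability, used only as a reference."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    print(rare_event_prob_is(4.0), exact_tail(4.0))
```

Naive sampling from N(0, 1) would see the event X > 4 only a handful of times in 100,000 draws, whereas the shifted proposal observes it in roughly half of them. This is the kind of acceleration that a well-chosen sampling distribution is meant to provide.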

Machine Learning, Robotics

A Lyapunov-Based Approach to Exploit Asymmetries in Robotic Dual-Arm Task Resolution

Diogo Almeida, Yiannis Karayiannidis (2019)

Dual-arm manipulation tasks can be prescribed to a robotic system in terms of desired absolute and relative motion of the robots' end-effectors. These can represent, e.g., jointly carrying a rigid object or performing an assembly task. When both types of motion are to be executed concurrently, the symmetric distribution of the relative motion between arms prevents task conflicts. Conversely, an asymmetric solution to the relative motion task will result in conflicts with the absolute task. In this work, we address the problem of designing a control law for the absolute motion task together with updating the distribution of the relative task among arms. Through a set of numerical results, we contrast our approach with the classical symmetric distribution of the relative motion task to illustrate the advantages of our method.

Systems and Control, Robotics

Adaptive Periodic Averaging: A Practical Approach to Reducing Communication in Distributed Learning

Peng Jiang, Gagan Agrawal (2020)

Stochastic Gradient Descent (SGD) is the key learning algorithm for many machine learning tasks. Because of its computational costs, there is a growing interest in accelerating SGD on HPC resources like GPU clusters. However, the performance of parallel SGD is still bottlenecked by the high communication costs even with a fast connection among the machines. A simple approach to alleviating this problem, used in many existing efforts, is to perform communication every few iterations, using a constant averaging period. In this paper, we show that the optimal averaging period in terms of convergence and communication cost is not a constant, but instead varies over the course of the execution. Specifically, we observe that reducing the variance of model parameters among the computing nodes is critical to the convergence of periodic parameter averaging SGD. Given a fixed communication budget, we show that it is more beneficial to synchronize more frequently in early iterations to reduce the initial large variance and synchronize less frequently in the later phase of the training process. We propose a practical algorithm, named ADaptive Periodic parameter averaging SGD (ADPSGD), to achieve a smaller overall variance of model parameters, and thus better convergence compared with the Constant Periodic parameter averaging SGD (CPSGD). We evaluate our method with several image classification benchmarks and show that our ADPSGD indeed achieves smaller training losses and higher test accuracies with less communication compared with CPSGD. Compared with gradient-quantization SGD, we show that our algorithm achieves faster convergence with only half of the communication. Compared with full-communication SGD, our ADPSGD achieves 1.14x to 1.27x speedups with a 100 Gbps connection among computing nodes, and the speedups increase to 1.46x to 1.95x with a 10 Gbps connection.
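The difference between a constant and a varying averaging period can be sketched with a toy simulation. The code below is not ADPSGD: the abstract does not state the rule used to pick the period, so the growing schedule here (short periods early, longer ones later) is purely an assumption, as are the function names and the scalar quadratic objective.

```python
import random


def local_sgd(period_fn, workers=4, steps=200, lr=0.1, noise=0.5, seed=0):
    """Periodic parameter averaging on the toy objective f(w) = 0.5 * w**2.

    `period_fn(t)` returns the averaging period to use at step t. Returns
    the final averaged parameter and the number of synchronizations.
    """
    rng = random.Random(seed)
    w = [5.0] * workers      # each worker's local copy; the optimum is 0
    syncs = 0
    since_sync = 0
    for t in range(steps):
        for i in range(workers):
            grad = w[i] + rng.gauss(0.0, noise)   # noisy gradient of f
            w[i] -= lr * grad
        since_sync += 1
        if since_sync >= period_fn(t):
            avg = sum(w) / workers                # one communication round
            w = [avg] * workers
            syncs += 1
            since_sync = 0
    return sum(w) / workers, syncs


def constant_period(t):
    """Baseline: synchronize every 8 steps throughout training."""
    return 8


def growing_period(t):
    """Assumed schedule: period 2 at first, doubling every 40 steps up to 32."""
    return min(32, 2 ** (t // 40 + 1))


if __name__ == "__main__":
    print(local_sgd(constant_period))
    print(local_sgd(growing_period))
```

Passing the period as a function of the step keeps the training loop unchanged while allowing any schedule to be swapped in, including one driven by a measured parameter variance across workers.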

Machine Learning; Distributed, Parallel, and Cluster Computing