Simultaneous active parameter estimation and control using sampling-based Bayesian reinforcement learning

Published by Patrick Slade

Publication date: 2017

Research field: Informatics Engineering

Paper language: English

Authors: Patrick Slade, Preston Culbertson, Zachary Sunberg

Systems and Control

Abstract

Robots performing manipulation tasks must operate under uncertainty about both their pose and the dynamics of the system. In order to remain robust to modeling error and shifts in payload dynamics, agents must simultaneously perform estimation and control tasks. However, the optimal estimation actions are often not the optimal actions for accomplishing the control tasks, and thus agents must trade off exploration against exploitation. This work frames the problem as a Bayes-adaptive Markov decision process and solves it online using Monte Carlo tree search (MCTS) and an extended Kalman filter to handle Gaussian process noise and parameter uncertainty in a continuous space. MCTS selects control actions to reduce model uncertainty and reach the goal state nearly optimally. Certainty equivalent model predictive control is used as a benchmark to compare performance in simulations with varying process noise and parameter uncertainty.
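The belief update that such an approach relies on can be illustrated with a small extended Kalman filter that estimates a state and an unknown parameter jointly. The sketch below is not the authors' implementation: the one-dimensional point mass, the noise levels, the sinusoidal input, and the names `ekf_step` and `run_demo` are all assumptions chosen for illustration, and the tree search that would choose the controls is omitted.

```python
import numpy as np

def ekf_step(x, P, u, z, dt=0.1, q_vel=1e-4, r_meas=1e-2):
    """One EKF predict/update for the augmented state x = [velocity, mass].

    Hypothetical model: v' = v + u*dt/m + noise, with the mass m constant.
    """
    v, m = x
    # Predict: propagate the mean and linearize the dynamics around it.
    x_pred = np.array([v + u * dt / m, m])
    F = np.array([[1.0, -u * dt / m**2],
                  [0.0, 1.0]])
    Q = np.diag([q_vel, 0.0])  # process noise acts on the velocity only
    P_pred = F @ P @ F.T + Q
    # Update with a noisy velocity measurement z.
    H = np.array([[1.0, 0.0]])
    S = (H @ P_pred @ H.T).item() + r_meas
    K = (P_pred @ H.T).ravel() / S
    x_new = x_pred + K * (z - x_pred[0])
    P_new = P_pred - np.outer(K, H @ P_pred)
    return x_new, P_new

def run_demo(true_mass=2.0, steps=200, dt=0.1, seed=0):
    """Simulate the assumed point mass and filter its velocity and mass."""
    rng = np.random.default_rng(seed)
    v_true = 0.0
    x = np.array([0.0, 1.5])      # initial guess: velocity 0, mass 1.5
    P = np.diag([0.01, 0.25])     # initial belief covariance
    for k in range(steps):
        u = 5.0 * np.sin(0.3 * k)  # exciting input, picked by hand here
        v_true += u * dt / true_mass + rng.normal(0.0, 0.01)
        z = v_true + rng.normal(0.0, 0.1)
        x, P = ekf_step(x, P, u, z, dt=dt)
    return x, P

if __name__ == "__main__":
    x, P = run_demo()
    print("mass estimate:", x[1], "mass variance:", P[1, 1])
```

In this toy setup the input is fixed in advance. In an active-estimation setting, the controller would instead pick inputs that both excite the unknown parameter and make progress toward the goal, which is the trade-off the abstract describes.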

Chang Fu, Zhe Yu, Di Shi (2019)

Accurate identification of load model parameters is essential in power system computations, including simulation, prediction, and stability and reliability analysis. Conventional composite load modeling approaches based on point estimation suffer from disturbances and noise and provide limited information about the system dynamics. In this work, a statistical (Bayesian estimation) approach to distribution estimation is proposed for both static (ZIP) and dynamic (induction motor) load modeling. When dealing with multiple parameters, the Gibbs sampling method is employed: in each iteration, it samples each parameter in turn while keeping the others fixed. The proposed method provides a distribution estimate of the load model coefficients and is robust to measurement errors.
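The parameter-by-parameter sampling described here can be sketched on a much simpler problem. The example below is an illustration only, not the load model from the paper: it assumes a generic linear model y = a*x + b with known noise level and flat priors, and the function name `gibbs_linear` is hypothetical.

```python
import numpy as np

def gibbs_linear(x, y, sigma, n_iter=3000, burn_in=500, seed=0):
    """Gibbs sampler for y = a*x + b + noise, noise std sigma, flat priors."""
    rng = np.random.default_rng(seed)
    a, b = 0.0, 0.0
    sxx = np.sum(x * x)
    n = len(x)
    samples = []
    for i in range(n_iter):
        # Sample the slope a from its conditional, intercept b held fixed.
        a = rng.normal(np.sum(x * (y - b)) / sxx, sigma / np.sqrt(sxx))
        # Sample the intercept b from its conditional, slope a held fixed.
        b = rng.normal(np.mean(y - a * x), sigma / np.sqrt(n))
        if i >= burn_in:
            samples.append((a, b))
    return np.array(samples)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x = np.linspace(-1.0, 1.0, 50)
    y = 0.8 * x + 0.3 + rng.normal(0.0, 0.1, size=x.size)
    draws = gibbs_linear(x, y, sigma=0.1)
    print("posterior mean (a, b):", draws.mean(axis=0))
    print("posterior std  (a, b):", draws.std(axis=0))
```

The retained draws approximate the joint posterior, so the output is a distribution over the coefficients and not a single point estimate.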

Systems and Control

Learning-based Control of Unknown Linear Systems with Thompson Sampling

Yi Ouyang, Mukul Gagrani, Rahul Jain (2017)

We propose a Thompson sampling-based learning algorithm for the Linear Quadratic (LQ) control problem with unknown system parameters. The algorithm is called Thompson sampling with dynamic episodes (TSDE), where two stopping criteria determine the lengths of the dynamic episodes in Thompson sampling. The first stopping criterion controls the growth rate of episode length. The second stopping criterion is triggered when the determinant of the sample covariance matrix is less than half of the previous value. We show under some conditions on the prior distribution that the expected (Bayesian) regret of TSDE accumulated up to time $T$ is bounded by $\tilde{O}(\sqrt{T})$. Here $\tilde{O}(\cdot)$ hides constants and logarithmic factors. This is the first $\tilde{O}(\sqrt{T})$ bound on expected regret of learning in LQ control. By introducing a reinitialization schedule, we also show that the algorithm is robust to time-varying drift in model parameters. Numerical simulations are provided to illustrate the performance of TSDE.
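The two stopping criteria can be sketched as a single check that decides when to end the current episode and resample the model. The abstract does not give the exact form of the first criterion, so the sketch assumes an episode may run at most one step longer than the previous one. The function name and arguments are hypothetical.

```python
import numpy as np

def should_start_new_episode(t, t_k, prev_len, cov_t, cov_tk):
    """Return True if the episode that started at time t_k should end at time t.

    prev_len: length of the previous episode (assumed growth-rate rule).
    cov_t:    posterior covariance of the parameters at time t.
    cov_tk:   posterior covariance at the start of the current episode.
    """
    # Criterion 1 (assumed form): cap the episode length growth.
    length_exceeded = (t - t_k) > prev_len
    # Criterion 2: the covariance determinant has fallen below half its value
    # at the start of the episode.
    det_halved = np.linalg.det(cov_t) < 0.5 * np.linalg.det(cov_tk)
    return bool(length_exceeded or det_halved)
```

For instance, with an episode that began at `t_k = 3` after a previous episode of length 3, the check first fires on length alone at `t = 7`, unless the covariance determinant has halved earlier.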

Systems and Control

Generalizable control for quantum parameter estimation through reinforcement learning

Han Xu, Junning Li, Liqiang Liu (2019)

Measurement and estimation of parameters are essential for science and engineering, where one of the main quests is to find systematic schemes that can achieve high precision. While conventional schemes for quantum parameter estimation focus on the optimization of the probe states and measurements, it has been recently realized that control during the evolution can significantly improve the precision. The identification of optimal controls, however, is often computationally demanding, as the optimal controls typically depend on the value of the parameter and must therefore be recalculated after the estimate is updated in each iteration. Here we show that reinforcement learning provides an efficient way to identify the controls that can be employed to improve the precision. We also demonstrate that reinforcement learning is highly generalizable, namely the neural network trained under one particular value of the parameter can work for different values within a broad range. These desired features make reinforcement learning an efficient alternative to conventional optimal quantum control methods.

Quantum Physics, Mesoscale and Nanoscale Physics

Reinforcement Learning for Traffic Control with Adaptive Horizon

Wentao Chen, Tehuan Chen, Guang Lin (2019)

This paper proposes a reinforcement learning approach for traffic control with an adaptive horizon. To build the controller for the traffic network, a Q-learning-based strategy that controls the green light passing time at the network intersections is applied. The controller includes two components: the regular Q-learning controller that controls the traffic light signal, and the adaptive controller that continuously optimizes the action space for the Q-learning algorithm in order to improve its efficiency. The regular Q-learning controller uses the control cost function as a reward function to determine the action to choose. The adaptive controller examines the control cost and updates the action space of the controller by determining the subset of actions that are most likely to obtain optimal results and shrinking the action space to that subset. Uncertainties in traffic influx and turning rate are introduced to test the robustness of the controller under a stochastic environment. Compared with model predictive control (MPC), the proposed Q-learning-based controller reaches a stable solution in a shorter period and achieves lower control costs. The proposed Q-learning-based controller is also robust under 30% traffic demand uncertainty and 15% turning rate uncertainty.
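The two components can be sketched in tabular form. This is a minimal illustration under stated assumptions, not the paper's controller: the reward is taken as the negative control cost, and the adaptive step simply keeps the `keep` actions with the highest Q-values, which is only one possible reading of "most likely to obtain optimal results". The names `q_update` and `shrink_actions` are hypothetical.

```python
import numpy as np

def q_update(Q, s, a, cost, s_next, actions, alpha=0.1, gamma=0.9):
    """Tabular Q-learning step using reward = -cost over the allowed actions."""
    target = -cost + gamma * np.max(Q[s_next, actions])
    Q[s, a] += alpha * (target - Q[s, a])

def shrink_actions(Q, actions, keep):
    """Keep the `keep` actions whose best Q-value over all states is highest."""
    scores = Q[:, actions].max(axis=0)
    best = np.argsort(scores)[::-1][:keep]
    return sorted(int(actions[i]) for i in best)

if __name__ == "__main__":
    Q = np.zeros((2, 4))          # 2 states, 4 candidate green-time actions
    actions = [0, 1, 2, 3]
    q_update(Q, s=0, a=2, cost=1.0, s_next=1, actions=actions)
    print(Q)
    print(shrink_actions(Q, actions, keep=2))
```

Restricting the maximization in `q_update` to the current action list means that once the action space shrinks, discarded actions no longer influence the learned values.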

Systems and Control

Bayesian parameter estimation using Gaussian states and measurements

Simon Morelli, Ayaka Usui, Elizabeth Agudelo (2020)

Bayesian analysis is a framework for parameter estimation that applies even in uncertainty regimes where the commonly used local (frequentist) analysis based on the Cramér-Rao bound is not well defined. In particular, it applies when no initial information about the parameter value is available, e.g., when few measurements are performed. Here, we consider three paradigmatic estimation schemes in continuous-variable quantum metrology (estimation of displacements, phases, and squeezing strengths) and analyse them from the Bayesian perspective. For each of these scenarios, we investigate the precision achievable with single-mode Gaussian states under homodyne and heterodyne detection. This allows us to identify Bayesian estimation strategies that combine good performance with the potential for straightforward experimental realization in terms of Gaussian states and measurements. Our results provide practical solutions for reaching uncertainties where local estimation techniques apply, thus bridging the gap to regimes where asymptotically optimal strategies can be employed.

Quantum Physics
