مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

On Decentralized Estimation with Active Queries

36 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Theodoros Tsiligkaridis

تاريخ النشر 2013

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Theodoros Tsiligkaridis - Brian M. Sadler - Alfred O. Hero III

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We consider the problem of decentralized 20 questions with noise for multiple players/agents under the minimum entropy criterion in the setting of stochastic search over a parameter space, with application to target localization. We propose decentralized extensions of the active query-based stochastic search strategy that combines elements from the 20 questions approach and social learning. We prove convergence to correct consensus on the value of the parameter. This framework provides a flexible and tractable mathematical model for decentralized parameter estimation systems based on active querying. We illustrate the effectiveness and robustness of the proposed decentralized collaborative 20 questions algorithm for random network topologies with information sharing.

قيم البحث

170 - Zengyi Qin , Kaiqing Zhang , Yuxiao Chen 2021

We study the multi-agent safe control problem where agents should avoid collisions to static obstacles and collisions with each other while reaching their goals. Our core idea is to learn the multi-agent control policy jointly with learning the contr ol barrier functions as safety certificates. We propose a novel joint-learning framework that can be implemented in a decentralized fashion, with generalization guarantees for certain function classes. Such a decentralized framework can adapt to an arbitrarily large number of agents. Building upon this framework, we further improve the scalability by incorporating neural network architectures that are invariant to the quantity and permutation of neighboring agents. In addition, we propose a new spontaneous policy refinement method to further enforce the certificate condition during testing. We provide extensive experiments to demonstrate that our method significantly outperforms other leading multi-agent control approaches in terms of maintaining safety and completing original tasks. Our approach also shows exceptional generalization capability in that the control policy can be trained with 8 agents in one scenario, while being used on other scenarios with up to 1024 agents in complex multi-agent environments and dynamics.

أنظمة متعددة العملاء الذكاء الاصطناعي أنظمة وتحكم

Active Trajectory Estimation for Partially Observed Markov Decision Processes via Conditional Entropy

60 - Timothy L. Molloy , Girish N. Nair 2021

In this paper, we consider the problem of controlling a partially observed Markov decision process (POMDP) in order to actively estimate its state trajectory over a fixed horizon with minimal uncertainty. We pose a novel active smoothing problem in w hich the objective is to directly minimise the smoother entropy, that is, the conditional entropy of the (joint) state trajectory distribution of concern in fixed-interval Bayesian smoothing. Our formulation contrasts with prior active approaches that minimise the sum of conditional entropies of the (marginal) state estimates provided by Bayesian filters. By establishing a novel form of the smoother entropy in terms of the POMDP belief (or information) state, we show that our active smoothing problem can be reformulated as a (fully observed) Markov decision process with a value function that is concave in the belief state. The concavity of the value function is of particular importance since it enables the approximate solution of our active smoothing problem using piecewise-linear function approximations in conjunction with standard POMDP solvers. We illustrate the approximate solution of our active smoothing problem in simulation and compare its performance to alternative approaches based on minimising marginal state estimate uncertainties.

أنظمة وتحكم نظرية المعلومات أنظمة وتحكم

Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation

128 - Jun Sun , Gang Wang , Georgios B. Giannakis 2019

Motivated by the emerging use of multi-agent reinforcement learning (MARL) in engineering applications such as networked robotics, swarming drones, and sensor networks, we investigate the policy evaluation problem in a fully decentralized setting, us ing temporal-difference (TD) learning with linear function approximation to handle large state spaces in practice. The goal of a group of agents is to collaboratively learn the value function of a given policy from locally private rewards observed in a shared environment, through exchanging local estimates with neighbors. Despite their simplicity and widespread use, our theoretical understanding of such decentralized TD learning algorithms remains limited. Existing results were obtained based on i.i.d. data samples, or by imposing an `additional projection step to control the `gradient bias incurred by the Markovian observations. In this paper, we provide a finite-sample analysis of the fully decentralized TD(0) learning under both i.i.d. as well as Markovian samples, and prove that all local estimates converge linearly to a small neighborhood of the optimum. The resultant error bounds are the first of its type---in the sense that they hold under the most practical assumptions ---which is made possible by means of a novel multi-step Lyapunov analysis.

التعلم الآلي نظرية المعلومات أنظمة وتحكم

Distributed Parameter Estimation in Sensor Networks: Nonlinear Observation Models and Imperfect Communication

220 - Soummya Kar , Jose M.F.Moura , Kavita Ramanan 2012

The paper studies distributed static parameter (vector) estimation in sensor networks with nonlinear observation models and noisy inter-sensor communication. It introduces emph{separably estimable} observation models that generalize the observability condition in linear centralized estimation to nonlinear distributed estimation. It studies two distributed estimation algorithms in separably estimable models, the $mathcal{NU}$ (with its linear counterpart $mathcal{LU}$) and the $mathcal{NLU}$. Their update rule combines a emph{consensus} step (where each sensor updates the state by weight averaging it with its neighbors states) and an emph{innovation} step (where each sensor processes its local current observation.) This makes the three algorithms of the textit{consensus + innovations} type, very different from traditional consensus. The paper proves consistency (all sensors reach consensus almost surely and converge to the true parameter value,) efficiency, and asymptotic unbiasedness. For $mathcal{LU}$ and $mathcal{NU}$, it proves asymptotic normality and provides convergence rate guarantees. The three algorithms are characterized by appropriately chosen decaying weight sequences. Algorithms $mathcal{LU}$ and $mathcal{NU}$ are analyzed in the framework of stochastic approximation theory; algorithm $mathcal{NLU}$ exhibits mixed time-scale behavior and biased perturbations, and its analysis requires a different approach that is developed in the paper.

أنظمة متعددة العملاء نظرية المعلومات نظرية المعلومات

State Estimation over Sensor Networks with Correlated Wireless Fading Channels

76 - Daniel E. Quevedo , Anders Ahlen , Karl H. Johansson 2013

Stochastic stability for centralized time-varying Kalman filtering over a wireles ssensor network with correlated fading channels is studied. On their route to the gateway, sensor packets, possibly aggregated with measurements from several nodes, may be dropped because of fading links. To study this situation, we introduce a network state process, which describes a finite set of configurations of the radio environment. The network state characterizes the channel gain distributions of the links, which are allowed to be correlated between each other. Temporal correlations of channel gains are modeled by allowing the network state process to form a (semi-)Markov chain. We establish sufficient conditions that ensure the Kalman filter to be exponentially bounded. In the one-sensor case, this new stability condition is shown to include previous results obtained in the literature as special cases. The results also hold when using power and bit-rate control policies, where the transmission power and bit-rate of each node are nonlinear mapping of the network state and channel gains.

التحسين والتحكم نظرية المعلومات أنظمة وتحكم

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

الجامعة السورية الخاصة

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

On Decentralized Estimation with Active Queries

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً