ﻻ يوجد ملخص باللغة العربية
Information theory has been very successful in obtaining performance limits for various problems such as communication, compression and hypothesis testing. Likewise, stochastic control theory provides a characterization of optimal policies for Partially Observable Markov Decision Processes (POMDPs) using dynamic programming. However, finding optimal policies for these problems is computationally hard in general and thus, heuristic solutions are employed in practice. Deep learning can be used as a tool for designing better heuristics in such problems. In this paper, the problem of active sequential hypothesis testing is considered. The goal is to design a policy that can reliably infer the true hypothesis using as few samples as possible by adaptively selecting appropriate queries. This problem can be modeled as a POMDP and bounds on its value function exist in literature. However, optimal policies have not been identified and various heuristics are used. In this paper, two new heuristics are proposed: one based on deep reinforcement learning and another based on a KL-divergence zero-sum game. These heuristics are compared with state-of-the-art solutions and it is demonstrated using numerical experiments that the proposed heuristics can achieve significantly better performance than existing methods in some scenarios.
We consider the problem of decentralized sequential active hypothesis testing (DSAHT), where two transmitting agents, each possessing a private message, are actively helping a third agent--and each other--to learn the message pair over a discrete mem
We consider a user releasing her data containing some personal information in return of a service. We model users personal information as two correlated random variables, one of them, called the secret variable, is to be kept private, while the other
We introduce Deep Adaptive Design (DAD), a method for amortizing the cost of adaptive Bayesian experimental design that allows experiments to be run in real-time. Traditional sequential Bayesian optimal experimental design approaches require substant
We consider sequential hypothesis testing between two quantum states using adaptive and non-adaptive strategies. In this setting, samples of an unknown state are requested sequentially and a decision to either continue or to accept one of the two hyp
The wireless network is undergoing a trend from onnection of things to connection of intelligence. With data spread over the communication networks and computing capability enhanced on the devices, distributed learning becomes a hot topic in both ind