
Policy Design for Active Sequential Hypothesis Testing using Deep Learning


Added by Ekraam Sabir

Publication date 2018

Field: Informatics Engineering

Research language: English

Authors: Dhruva Kartik, Ekraam Sabir, Urbashi Mitra

Information Theory, Artificial Intelligence, Machine Learning


Abstract

Information theory has been very successful in obtaining performance limits for various problems such as communication, compression, and hypothesis testing. Likewise, stochastic control theory provides a characterization of optimal policies for Partially Observable Markov Decision Processes (POMDPs) using dynamic programming. However, finding optimal policies for these problems is computationally hard in general, and thus heuristic solutions are employed in practice. Deep learning can be used as a tool for designing better heuristics in such problems. In this paper, the problem of active sequential hypothesis testing is considered. The goal is to design a policy that can reliably infer the true hypothesis using as few samples as possible by adaptively selecting appropriate queries. This problem can be modeled as a POMDP, and bounds on its value function exist in the literature. However, optimal policies have not been identified, and various heuristics are used. In this paper, two new heuristics are proposed: one based on deep reinforcement learning and another based on a KL-divergence zero-sum game. These heuristics are compared with state-of-the-art solutions, and numerical experiments demonstrate that the proposed heuristics can achieve significantly better performance than existing methods in some scenarios.
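To make the problem setup in the abstract concrete, the following is a minimal sketch of an active sequential hypothesis testing loop. It is not either of the paper's two proposed heuristics. It assumes a finite set of hypotheses, known observation likelihoods with full support, and a simple greedy rule that picks the query whose KL divergence best separates the currently most likely hypothesis from the alternatives. The names `OBS_PROBS`, `select_query`, `posterior_update`, and `run_test`, the toy likelihood values, and the stopping threshold are all hypothetical choices made for illustration.

```python
import numpy as np

# Toy model (assumed for illustration): OBS_PROBS[h, a, y] = P(y | hypothesis h, query a).
# Query 0 separates hypothesis 0 from {1, 2}; query 1 separates hypothesis 2 from {0, 1}.
OBS_PROBS = np.array([
    [[0.8, 0.2], [0.8, 0.2]],  # hypothesis 0
    [[0.2, 0.8], [0.8, 0.2]],  # hypothesis 1
    [[0.2, 0.8], [0.2, 0.8]],  # hypothesis 2
])


def kl(p, q):
    """KL divergence D(p || q) for discrete distributions with full support."""
    return float(np.sum(p * np.log(p / q)))


def posterior_update(belief, obs_probs, action, observation):
    """Bayes update of the belief over hypotheses after one query result."""
    unnormalized = belief * obs_probs[:, action, observation]
    return unnormalized / unnormalized.sum()


def select_query(belief, obs_probs):
    """Greedy heuristic (an assumption, not the paper's policy): choose the query
    maximizing the belief-weighted KL divergence between the most likely
    hypothesis and every alternative."""
    h_star = int(np.argmax(belief))
    n_hyp, n_act, _ = obs_probs.shape
    scores = np.zeros(n_act)
    for a in range(n_act):
        for h in range(n_hyp):
            if h != h_star:
                scores[a] += belief[h] * kl(obs_probs[h_star, a], obs_probs[h, a])
    return int(np.argmax(scores))


def run_test(obs_probs, true_h, threshold=0.99, max_steps=1000, seed=0):
    """Query adaptively until one hypothesis has posterior >= threshold.
    Returns (declared hypothesis, number of samples used)."""
    rng = np.random.default_rng(seed)
    n_hyp, _, n_obs = obs_probs.shape
    belief = np.full(n_hyp, 1.0 / n_hyp)
    for t in range(1, max_steps + 1):
        a = select_query(belief, obs_probs)
        y = rng.choice(n_obs, p=obs_probs[true_h, a])  # simulate the observation
        belief = posterior_update(belief, obs_probs, a, y)
        if belief.max() >= threshold:
            return int(np.argmax(belief)), t
    return int(np.argmax(belief)), max_steps


if __name__ == "__main__":
    declared, samples = run_test(OBS_PROBS, true_h=1)
    print(f"declared hypothesis {declared} after {samples} samples")
```

The belief vector plays the role of the POMDP state mentioned in the abstract. A learned policy, such as one trained with deep reinforcement learning, would replace `select_query` while the rest of the loop stays the same.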


Decentralized sequential active hypothesis testing and the MAC feedback capacity

Achilleas Anastasopoulos, Sandeep Pradhan (2020)

We consider the problem of decentralized sequential active hypothesis testing (DSAHT), where two transmitting agents, each possessing a private message, are actively helping a third agent, and each other, to learn the message pair over a discrete memoryless multiple access channel (DM-MAC). The third agent (receiver) observes the noisy channel output, which is also available to the transmitting agents via noiseless feedback. We formulate this problem as a decentralized dynamic team, show that optimal transmission policies have a time-invariant domain, and characterize the solution through a dynamic program. Several alternative formulations are discussed involving time-homogeneous cost functions and/or variable-length codes, resulting in solutions described through fixed-point, Bellman-type equations. Subsequently, we make connections with the problem of simplifying the multi-letter capacity expressions for the noiseless feedback capacity of the DM-MAC. We show that restricting attention to distributions induced by optimal transmission schemes for the DSAHT problem, without loss of optimality, transforms the capacity expression so that it can be thought of as the average reward received by an appropriately defined stochastic dynamical system with time-invariant state space.

Information Theory

Active Privacy-utility Trade-off Against a Hypothesis Testing Adversary

Ecenaz Erdemir, Pier Luigi Dragotti, Deniz Gunduz (2021)

We consider a user releasing her data containing some personal information in return for a service. We model the user's personal information as two correlated random variables: one of them, called the secret variable, is to be kept private, while the other, called the useful variable, is to be disclosed for utility. We consider active sequential data release, where at each time step the user chooses from among a finite set of release mechanisms, each revealing some information about the user's personal information, i.e., the true hypotheses, albeit with different statistics. The user manages data release in an online fashion such that the maximum amount of information is revealed about the latent useful variable, while the confidence for the sensitive variable is kept below a predefined level. For the utility, we consider both the probability of correct detection of the useful variable and the mutual information (MI) between the useful variable and released data. We formulate both problems as a Markov decision process (MDP), and numerically solve them by advantage actor-critic (A2C) deep reinforcement learning (RL).

Information Theory, Cryptography and Security, Machine Learning

Deep Adaptive Design: Amortizing Sequential Bayesian Experimental Design

Adam Foster, Desi R. Ivanova, Ilyas Malik (2021)

We introduce Deep Adaptive Design (DAD), a method for amortizing the cost of adaptive Bayesian experimental design that allows experiments to be run in real-time. Traditional sequential Bayesian optimal experimental design approaches require substantial computation at each stage of the experiment. This makes them unsuitable for most real-world applications, where decisions must typically be made quickly. DAD addresses this restriction by learning an amortized design network upfront and then using this to rapidly run (multiple) adaptive experiments at deployment time. This network represents a design policy which takes as input the data from previous steps, and outputs the next design using a single forward pass; these design decisions can be made in milliseconds during the live experiment. To train the network, we introduce contrastive information bounds that are suitable objectives for the sequential setting, and propose a customized network architecture that exploits key symmetries. We demonstrate that DAD successfully amortizes the process of experimental design, outperforming alternative strategies on a number of problems.

Machine Learning, Artificial Intelligence

Optimal Adaptive Strategies for Sequential Quantum Hypothesis Testing

Yonglong Li, Vincent Y. F. Tan (2021)

We consider sequential hypothesis testing between two quantum states using adaptive and non-adaptive strategies. In this setting, samples of an unknown state are requested sequentially and a decision to either continue or to accept one of the two hypotheses is made after each test. Under the constraint that the number of samples is bounded, either in expectation or with high probability, we exhibit adaptive strategies that minimize both types of misidentification errors. Namely, we show that these errors decrease exponentially (in the stopping time) with decay rates given by the measured relative entropies between the two states. Moreover, if we allow joint measurements on multiple samples, the rates are increased to the respective quantum relative entropies. We also fully characterize the achievable error exponents for non-adaptive strategies and provide numerical evidence showing that adaptive measurements are necessary to achieve our bounds under some additional assumptions.

Quantum Physics, Information Theory, Mathematical Physics

Distributed Learning for Time-varying Networks: A Scalable Design

Jian Wang, Yourui Huangfu, Rong Li (2021)

The wireless network is undergoing a shift from connection of things to connection of intelligence. With data spread over communication networks and computing capability enhanced on devices, distributed learning has become a hot topic in both industrial and academic communities. Many frameworks, such as federated learning and federated distillation, have been proposed. However, few of them adequately handle obstacles such as the time-varying topology that results from the characteristics of wireless networks. In this paper, we propose a distributed learning framework based on a scalable deep neural network (DNN) design. By exploiting the permutation equivalence and invariance properties of the learning tasks, DNNs of different scales for different clients can be built from two basic parameter sub-matrices. Further, model aggregation can also be conducted based on these two sub-matrices to improve learning convergence and performance. Finally, simulation results verify the benefits of the proposed framework in comparison with some baselines.

Information Theory, Artificial Intelligence, Machine Learning
