
Optimal distributed testing in high-dimensional Gaussian models

 Added by Lasse Vuursteen
 Publication date 2020
 Language: English





In this paper we study the problem of signal detection in Gaussian noise in a distributed setting. We derive a lower bound on the size the signal needs to have in order to be detectable. Moreover, we exhibit optimal distributed testing strategies that attain this lower bound.
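The setting in the abstract can be illustrated with a minimal simulation. This is a hypothetical sketch, not the paper's actual procedure: assume m machines each hold n i.i.d. N(mu, 1) observations, each machine communicates only its standardized sample mean, and a central server aggregates these into one z-statistic. All sizes and the threshold are illustrative.

```python
# Hypothetical sketch of a distributed signal-detection test (not the
# paper's method). Assumptions: m machines each hold n i.i.d. N(mu, 1)
# observations; under H0, mu = 0. Each machine sends one number.
import math
import random

def local_statistic(samples):
    """Machine-side summary: standardized sample mean, ~ N(0,1) under H0."""
    n = len(samples)
    return sum(samples) / math.sqrt(n)

def distributed_test(machine_data, threshold=3.0):
    """Server-side test: average the m local statistics. Under H0 the
    aggregate is again N(0,1), so reject H0 when |aggregate| > threshold."""
    m = len(machine_data)
    stats = [local_statistic(d) for d in machine_data]
    aggregate = sum(stats) / math.sqrt(m)
    return abs(aggregate) > threshold

random.seed(0)
m, n, mu = 20, 100, 0.5   # illustrative sizes; mu is the signal strength
null_data = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
alt_data = [[random.gauss(mu, 1.0) for _ in range(n)] for _ in range(m)]
print(distributed_test(null_data))  # typically False: no signal present
print(distributed_test(alt_data))   # typically True: signal detected
```

The communication constraint is what makes the problem distributed: each machine transmits a single real number rather than its raw sample.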



Related research

Lin Zhou, Yun Wei, Alfred Hero (2020)
We revisit universal outlier hypothesis testing (Li et al., TIT 2014) and derive fundamental limits for the optimal test. In outlier hypothesis testing, one is given multiple observed sequences, where most sequences are generated i.i.d. from a nominal distribution. The task is to discern the set of outlying sequences that are generated according to anomalous distributions. The nominal and anomalous distributions are unknown. We study the tradeoff among the probabilities of misclassification error, false alarm, and false reject for tests that satisfy weak conditions on the rate of decrease of these error probabilities as a function of sequence length. Specifically, we propose a threshold-based universal test that ensures exponential decay of the misclassification error and false alarm probabilities. We study two constraints on the false reject probability: that it be a non-vanishing constant, and that it have an exponential decay rate. For both cases, we characterize bounds on the false reject probability, as a function of the threshold, for each pair of nominal and anomalous distributions, and demonstrate the optimality of our test in the generalized Neyman-Pearson sense. We first consider the case of at most one outlier and then generalize our results to the case of multiple outliers, where the number of outliers is unknown and each outlier can follow a different anomalous distribution.
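The threshold-based idea can be sketched as follows. This is a hypothetical illustration in the spirit of the abstract, not the paper's test: flag a sequence as an outlier when the KL divergence between its empirical distribution and the pooled empirical distribution of the remaining sequences exceeds a threshold. The alphabet, data, and threshold are made up.

```python
# Hypothetical threshold-based outlier test: sequences whose empirical
# distribution is KL-far from the pooled empirical distribution of the
# rest get flagged. Not the paper's exact test statistic.
from collections import Counter
import math

def empirical(seq, alphabet):
    """Empirical distribution of a sequence over a finite alphabet."""
    counts = Counter(seq)
    n = len(seq)
    return {a: counts.get(a, 0) / n for a in alphabet}

def kl(p, q, eps=1e-12):
    """KL divergence D(p || q), smoothed to avoid division by zero."""
    return sum(p[a] * math.log((p[a] + eps) / (q[a] + eps))
               for a in p if p[a] > 0)

def flag_outliers(sequences, alphabet, threshold):
    """Flag index i when its empirical law is far from the pooled rest."""
    flagged = []
    for i, seq in enumerate(sequences):
        rest = [x for j, s in enumerate(sequences) if j != i for x in s]
        score = kl(empirical(seq, alphabet), empirical(rest, alphabet))
        if score > threshold:
            flagged.append(i)
    return flagged

alphabet = [0, 1]
nominal = [[0] * 80 + [1] * 20 for _ in range(4)]  # nominal law (0.8, 0.2)
outlier = [[0] * 20 + [1] * 80]                    # anomalous law (0.2, 0.8)
print(flag_outliers(nominal + outlier, alphabet, threshold=0.3))
```

The test is universal in the sense that neither the nominal nor the anomalous distribution appears anywhere in the decision rule; only empirical distributions are compared.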
This paper presents results pertaining to sequential methods for support recovery of sparse signals in noise. Specifically, we show that any sequential measurement procedure fails provided the average number of measurements per dimension grows slower than log s / D(f0||f1), where s is the level of sparsity and D(f0||f1) is the Kullback-Leibler divergence between the underlying distributions. For comparison, we show any non-sequential procedure fails provided the number of measurements grows at a rate less than log n / D(f1||f0), where n is the total dimension of the problem. Lastly, we show that a simple procedure termed sequential thresholding guarantees exact support recovery provided the average number of measurements per dimension grows faster than (log s + log log n) / D(f0||f1), a mere additive factor more than the lower bound.
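The three rates quoted above can be compared numerically. The sketch below is illustrative only: it assumes a Gaussian mean-shift model f0 = N(0, 1), f1 = N(theta, 1), for which D(f0||f1) = D(f1||f0) = theta^2 / 2, and the problem sizes n, s, theta are made-up values, not ones from the paper.

```python
# Illustrative comparison of the sample-complexity rates quoted above,
# assuming a Gaussian mean shift: f0 = N(0,1), f1 = N(theta,1), so that
# D(f0||f1) = D(f1||f0) = theta**2 / 2. Problem sizes are hypothetical.
import math

def kl_gauss_mean_shift(theta):
    """KL divergence between N(0,1) and N(theta,1)."""
    return theta ** 2 / 2.0

n, s, theta = 10_000, 10, 1.0   # dimension, sparsity, signal strength
d = kl_gauss_mean_shift(theta)

# Lower bound: any sequential procedure needs more than this many
# measurements per dimension on average.
seq_lower = math.log(s) / d

# Sufficient rate for sequential thresholding: only an additive
# log log n more than the lower bound.
seq_upper = (math.log(s) + math.log(math.log(n))) / d

# Non-sequential procedures instead need measurements growing like log n.
non_seq = math.log(n) / d

print(seq_lower, seq_upper, non_seq)
```

For sparse problems (s much smaller than n), log s + log log n is far below log n, which is the advantage of sequential measurement the abstract describes.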
The distributed hypothesis testing problem with full side-information is studied. The trade-off (reliability function) between the two types of error exponents under limited rate is studied in the following way. First, the problem is reduced to the problem of determining the reliability function of channel codes designed for detection (in analogy to a similar result which connects the reliability function of distributed lossless compression and ordinary channel codes). Second, a single-letter random-coding bound based on a hierarchical ensemble, as well as a single-letter expurgated bound, are derived for the reliability of channel-detection codes. Both bounds are derived for a system which employs the optimal detection rule. We conjecture that the resulting random-coding bound is ensemble-tight, and consequently optimal within the class of quantization-and-binning schemes.
In this paper, we analyze the operational information rate distortion function (RDF) $R_{S;Z|Y}(\Delta_X)$, introduced by Draper and Wornell, for a triple of jointly independent and identically distributed, multivariate Gaussian random variables (RVs), $(X^n, S^n, Y^n) = \{(X_t, S_t, Y_t): t = 1, 2, \ldots, n\}$, where $X^n$ is the source, $S^n$ is a measurement of $X^n$ available to the encoder, $Y^n$ is side information available to the decoder only, and $Z^n$ is the auxiliary RV available to the decoder, with respect to the square-error fidelity between the source $X^n$ and its reconstruction $\widehat{X}^n$. We also analyze the RDF $R_{S;\widehat{X}|Y}(\Delta_X)$ that corresponds to the above setup when side information $Y^n$ is available to both the encoder and decoder. The main results include: (1) structural properties of test channel realizations that induce distributions which achieve the two RDFs; (2) water-filling solutions of the two RDFs, based on parallel channel realizations of test channels; (3) a proof of the equality $R_{S;Z|Y}(\Delta_X) = R_{S;\widehat{X}|Y}(\Delta_X)$, i.e., side information $Y^n$ at both the encoder and decoder does not incur smaller compression; and (4) relations to other RDFs, as degenerate cases, which show that past literature contains oversights related to the optimal test channel realizations and the value of the RDF $R_{S;Z|Y}(\Delta_X)$.
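The water-filling solutions mentioned in result (2) rest on the classical reverse water-filling computation for a parallel Gaussian source. The sketch below shows that generic computation, not the paper's specific test-channel realization: given per-component variances and a distortion budget, find the water level by bisection and sum the per-component rates. The variances and budget are made-up numbers.

```python
# Generic reverse water-filling for a parallel Gaussian source (the
# classical computation underlying water-filling RDF solutions); the
# variances and distortion budget below are illustrative.
import math

def reverse_water_filling(variances, distortion):
    """Find the water level lam with sum_i min(lam, var_i) = distortion,
    then return (lam, rate) where rate = sum_i 0.5*log(var_i / D_i)
    and D_i = min(lam, var_i) is the distortion spent on component i."""
    lo, hi = 0.0, max(variances)
    for _ in range(100):                      # bisection on the water level
        lam = (lo + hi) / 2.0
        total = sum(min(lam, v) for v in variances)
        if total < distortion:
            lo = lam                          # too little distortion spent
        else:
            hi = lam                          # too much: lower the level
    lam = (lo + hi) / 2.0
    rate = sum(0.5 * math.log(v / min(lam, v)) for v in variances)
    return lam, rate

lam, rate = reverse_water_filling([4.0, 2.0, 1.0, 0.5], distortion=1.0)
print(lam, rate)   # rate is in nats
```

Components with variance below the water level are allotted their full variance as distortion and contribute zero rate; the remaining budget is split evenly across the rest, which is the "water" picture behind the name.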
Consider the problem of estimating parameters $X^n \in \mathbb{R}^n$, generated by a stationary process, from $m$ response variables $Y^m = AX^n + Z^m$, under the assumption that the distribution of $X^n$ is known. This is the most general version of the Bayesian linear regression problem. The lack of computationally feasible algorithms that can employ generic prior distributions and provide a good estimate of $X^n$ has limited the set of distributions researchers use to model the data. In this paper, a new scheme called Q-MAP is proposed. The new method has the following properties: (i) It has similarities to the popular MAP estimation under the noiseless setting. (ii) In the noiseless setting, it achieves the asymptotically optimal performance when $X^n$ has independent and identically distributed components. (iii) It scales favorably with the dimensions of the problem and therefore is applicable to high-dimensional setups. (iv) The solution of the Q-MAP optimization can be found via a proposed iterative algorithm which is provably robust to the error (noise) in the response variables.
