Lower Bounds for the Minimum Mean-Square Error via Neural Network-based Estimation


Added by Mario Diaz

Publication date 2021

Field Informatics Engineering

Language English

Authors Mario Diaz - Peter Kairouz - Lalitha Sankar

Information Theory


Abstract

The minimum mean-square error (MMSE) achievable by optimal estimation of a random variable $Y\in\mathbb{R}$ given another random variable $X\in\mathbb{R}^{d}$ is of much interest in a variety of statistical contexts. In this paper we propose two estimators for the MMSE, one based on a two-layer neural network and the other on a special three-layer neural network. We derive lower bounds for the MMSE based on the proposed estimators and the Barron constant of an appropriate function of the conditional expectation of $Y$ given $X$. Furthermore, we derive a general upper bound for the Barron constant that, when $X\in\mathbb{R}$ is post-processed by the additive Gaussian mechanism, produces order-optimal estimates in the large noise regime.
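For reference, the MMSE of $Y$ given $X$ is $\mathbb{E}[(Y-\mathbb{E}[Y\mid X])^{2}]$, the smallest mean-square error over all estimators of $Y$ from $X$. The sketch below is a toy illustration only and is not the paper's neural-network estimator. It assumes a standard Gaussian $Y$ observed through additive Gaussian noise with standard deviation $\sigma$, a case where the MMSE has the closed form $\sigma^{2}/(1+\sigma^{2})$, and checks that value by Monte Carlo. The function names and the choice of distributions are assumptions made for this example.

```python
import random


def gaussian_mmse(noise_std: float) -> float:
    """Closed-form MMSE of Y ~ N(0, 1) given Z = Y + noise_std * N, N ~ N(0, 1)."""
    return noise_std ** 2 / (1.0 + noise_std ** 2)


def monte_carlo_mse(noise_std: float, n_samples: int = 200_000, seed: int = 0) -> float:
    """Empirical mean-square error of the conditional mean E[Y | Z] = Z / (1 + noise_std^2)."""
    rng = random.Random(seed)
    shrink = 1.0 / (1.0 + noise_std ** 2)  # coefficient of the optimal (linear) estimator
    total = 0.0
    for _ in range(n_samples):
        y = rng.gauss(0.0, 1.0)
        z = y + noise_std * rng.gauss(0.0, 1.0)  # additive Gaussian noise (toy assumption)
        total += (y - shrink * z) ** 2
    return total / n_samples


if __name__ == "__main__":
    for sigma in (0.5, 2.0, 10.0):
        print(sigma, gaussian_mmse(sigma), monte_carlo_mse(sigma))
```

In this toy case the MMSE approaches the variance of $Y$ (here 1) as the noise grows, since the observation carries less and less information about $Y$.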


Lower Bounds on Information Requirements for Causal Network Inference

Xiaohan Kang, Bruce Hajek 2021

Recovery of the causal structure of dynamic networks from noisy measurements has long been a problem of intense interest across many areas of science and engineering. Many algorithms have been proposed, but there is no work that compares the performance of the algorithms to converse bounds in a non-asymptotic setting. As a step to address this problem, this paper gives lower bounds on the error probability for causal network support recovery in a linear Gaussian setting. The bounds are based on the use of the Bhattacharyya coefficient for binary hypothesis testing problems with mixture probability distributions. A comparison of the bounds with the performance achieved by two representative recovery algorithms is given for sparse random networks based on the Erdős-Rényi model.
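To make the role of the Bhattacharyya coefficient concrete, the sketch below evaluates the standard bounds it gives on the minimum error probability of a binary hypothesis test with equal priors. It is a simplified illustration and not the construction used in the paper: it assumes two scalar Gaussians with a common variance rather than mixture distributions, and all function names are hypothetical.

```python
import math


def bhattacharyya_coefficient(mu0: float, mu1: float, sigma: float) -> float:
    """Bhattacharyya coefficient between N(mu0, sigma^2) and N(mu1, sigma^2)."""
    return math.exp(-((mu0 - mu1) ** 2) / (8.0 * sigma ** 2))


def error_probability_bounds(bc: float) -> tuple[float, float]:
    """Lower and upper bounds on the minimum error probability for equal priors."""
    lower = 0.5 * (1.0 - math.sqrt(1.0 - bc ** 2))
    upper = 0.5 * bc
    return lower, upper


def exact_error_probability(mu0: float, mu1: float, sigma: float) -> float:
    """Exact minimum error probability for equal priors: Q(|mu1 - mu0| / (2 * sigma))."""
    distance = abs(mu1 - mu0)
    return 0.5 * math.erfc(distance / (2.0 * sigma * math.sqrt(2.0)))


if __name__ == "__main__":
    bc = bhattacharyya_coefficient(0.0, 2.0, 1.0)
    lower, upper = error_probability_bounds(bc)
    print(lower, exact_error_probability(0.0, 2.0, 1.0), upper)
```

The lower bound is the converse-type quantity: no test can achieve an error probability below it, which is the sense in which such bounds serve as a benchmark for recovery algorithms.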

Information Theory

On the Minimum Mean $p$-th Error in Gaussian Noise Channels and its Applications

Alex Dytso, Ronit Bustin, Daniela Tuninetti 2016

The problem of estimating an arbitrary random vector from its observation corrupted by additive white Gaussian noise, where the cost function is taken to be the Minimum Mean $p$-th Error (MMPE), is considered. The classical Minimum Mean Square Error (MMSE) is a special case of the MMPE. Several bounds, properties and applications of the MMPE are derived and discussed. The optimal MMPE estimator is found for Gaussian and binary input distributions. Properties of the MMPE as a function of the input distribution, SNR and order $p$ are derived. In particular, it is shown that the MMPE is a continuous function of $p$ and SNR. These results are possible in view of interpolation and change of measure bounds on the MMPE. The Single-Crossing-Point Property (SCPP), which bounds the MMSE for all SNR values above a certain value at which the MMSE is known, together with the I-MMSE relationship, is a powerful tool in deriving converse proofs in information theory. By studying the notion of conditional MMPE, a unifying proof (i.e., for any $p$) of the SCPP is shown. A complementary bound to the SCPP is then shown, which bounds the MMPE for all SNR values below a certain value at which the MMPE is known. As a first application of the MMPE, a bound on the conditional differential entropy in terms of the MMPE is provided, which then yields a generalization of the Ozarow-Wyner lower bound on the mutual information achieved by a discrete input on a Gaussian noise channel. As a second application, the MMPE is shown to improve on previous characterizations of the phase transition phenomenon that manifests, in the limit as the length of the capacity-achieving code goes to infinity, as a discontinuity of the MMSE as a function of SNR. As a final application, the MMPE is used to show bounds on the second derivative of mutual information that tighten previously known bounds.

Information Theory

Placement Delivery Array Design via Attention-Based Deep Neural Network

Zhengming Zhang, Meng Hua, Chunguo Li 2018

A decentralized coded caching scheme has been proposed by Maddah-Ali and Niesen, and has been shown to alleviate the load of networks. Recently, the placement delivery array (PDA) was proposed to characterize the coded caching scheme. In this paper, a neural architecture is first proposed to learn the construction of PDAs. Our model solves the problem of variable-size PDAs using the mechanism of neural attention and reinforcement learning. It differs from previous attempts in that, instead of using combined optimization algorithms to get PDAs, it uses a sequence-to-sequence model to learn to construct PDAs. Numerical results are given to demonstrate that the proposed method can effectively implement coded caching. We also show that the complexity of our method to construct PDAs is low.

Information Theory

Improved batch code lower bounds

Ray Li, Mary Wootters 2021

Batch codes are a useful notion of locality for error correcting codes, originally introduced in the context of distributed storage and cryptography. Many constructions of batch codes have been given, but few lower bound (limitation) results are known, leaving gaps between the best known constructions and best known lower bounds. Towards determining the optimal redundancy of batch codes, we prove a new lower bound on the redundancy of batch codes. Specifically, we study (primitive, multiset) linear batch codes that systematically encode $n$ information symbols into $N$ codeword symbols, with the requirement that any multiset of $k$ symbol requests can be obtained in disjoint ways. We show that such batch codes need $\Omega(\sqrt{Nk})$ symbols of redundancy, improving on the previous best lower bounds of $\Omega(\sqrt{N}+k)$ at all $k=n^{\varepsilon}$ with $\varepsilon\in(0,1)$. Our proof follows from analyzing the dimension of the order-$O(k)$ tensor of the batch code's dual code.

Information Theory

Mutual Information of Neural Network Initialisations: Mean Field Approximations

Jared Tanner, Giuseppe Ughi 2021

The ability to train randomly initialised deep neural networks is known to depend strongly on the variance of the weight matrices and biases as well as the choice of nonlinear activation. Here we complement the existing geometric analysis of this phenomenon with an information-theoretic alternative. Lower bounds are derived for the mutual information between an input and hidden layer outputs. Using a mean field analysis, we provide analytic lower bounds as functions of network weight and bias variances as well as the choice of nonlinear activation. These results show that initialisations known to be optimal from a training point of view are also superior from a mutual information perspective.

Information Theory
