Mutual Information of Neural Network Initialisations: Mean Field Approximations

Added by Giuseppe Ughi

Publication date 2021

fields Informatics Engineering

Language: English

Authors Jared Tanner - Giuseppe Ughi

Information Theory

Abstract

The ability to train randomly initialised deep neural networks is known to depend strongly on the variance of the weight matrices and biases as well as the choice of nonlinear activation. Here we complement the existing geometric analysis of this phenomenon with an information theoretic alternative. Lower bounds are derived for the mutual information between an input and hidden layer outputs. Using a mean field analysis we are able to provide analytic lower bounds as functions of network weight and bias variances as well as the choice of nonlinear activation. These results show that initialisations known to be optimal from a training point of view are also superior from a mutual information perspective.
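As a rough illustration of how weight variance, bias variance and activation interact under a mean field analysis, the sketch below iterates the standard mean-field variance recursion from the wider literature, q_next = sigma_w^2 * E[phi(sqrt(q) z)^2] + sigma_b^2 with z a standard normal. This recursion, the function names `length_map` and `iterate_length_map`, and the example parameter values are assumptions made for illustration; the sketch does not compute the mutual information lower bounds described in the abstract.

```python
import numpy as np

# Gauss-Hermite nodes and weights for the weight function exp(-z^2 / 2).
_NODES, _WEIGHTS = np.polynomial.hermite_e.hermegauss(61)
# Normalise so that sum(_WEIGHTS * f(_NODES)) approximates E[f(z)], z ~ N(0, 1).
_WEIGHTS = _WEIGHTS / np.sqrt(2.0 * np.pi)


def length_map(q, sigma_w2, sigma_b2, phi):
    """One step of the (assumed) mean-field variance recursion.

    q        : variance of the pre-activations in the current layer
    sigma_w2 : weight variance (scaled by fan-in)
    sigma_b2 : bias variance
    phi      : elementwise activation function
    """
    second_moment = np.sum(_WEIGHTS * phi(np.sqrt(q) * _NODES) ** 2)
    return sigma_w2 * second_moment + sigma_b2


def iterate_length_map(q0, sigma_w2, sigma_b2, phi, depth):
    """Return the list of variances q^0, ..., q^depth."""
    qs = [q0]
    for _ in range(depth):
        qs.append(length_map(qs[-1], sigma_w2, sigma_b2, phi))
    return qs


def relu(x):
    return np.maximum(x, 0.0)


if __name__ == "__main__":
    # ReLU with sigma_w2 = 2 and no bias keeps the variance fixed across layers.
    print(iterate_length_map(1.0, 2.0, 0.0, relu, depth=5))
    # tanh with illustrative (hypothetical) variance choices.
    print(iterate_length_map(1.0, 1.5, 0.05, np.tanh, depth=5))
```

With ReLU the second moment is exactly q / 2, so a weight variance of 2 with zero bias variance leaves q unchanged from layer to layer, which is the kind of variance-preserving initialisation the abstract refers to as favourable for training.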

Approximations of Shannon Mutual Information for Discrete Variables with Applications to Neural Population Coding

Wentao Huang, Kechen Zhang, 2019

Although Shannon mutual information has been widely used, its effective calculation is often difficult for many practical problems, including those in neural population coding. Asymptotic formulas based on Fisher information sometimes provide accurate approximations to the mutual information but this approach is restricted to continuous variables because the calculation of Fisher information requires derivatives with respect to the encoded variables. In this paper, we consider information-theoretic bounds and approximations of the mutual information based on Kullback-Leibler divergence and Rényi divergence. We propose several information metrics to approximate Shannon mutual information in the context of neural population coding. While our asymptotic formulas all work for discrete variables, one of them has consistent performance and high accuracy regardless of whether the encoded variables are discrete or continuous. We performed numerical simulations and confirmed that our approximation formulas were highly accurate for approximating the mutual information between the stimuli and the responses of a large neural population. These approximation formulas may potentially bring convenience to the applications of information theory to many practical and theoretical problems.

Information Theory, Statistics Theory

Mutual Information Approximation

Chongjun Ouyang, Sheng Wu, 2019

To provide an efficient approach to characterizing the input-output mutual information (MI) over an additive white Gaussian noise (AWGN) channel, this short report fits the curves of exact MI under multilevel quadrature amplitude modulation (M-QAM) signal inputs via multi-exponential decay curve fitting (M-EDCF). Even though the defining expression for instantaneous MI versus signal-to-noise ratio (SNR) is complex and contains an intractable integral, our newly developed fitting formula has a neat and compact form, combining high precision with low complexity. Generally speaking, this approximation formula for MI can support performance analysis of practical communication systems with discrete inputs.

Information Theory

Mutual Information Bounds via Adjacency Events

Yanjun Han, Or Ordentlich, Ofer Shayevitz, 2015

The mutual information between two jointly distributed random variables $X$ and $Y$ is a functional of the joint distribution $P_{XY},$ which is sometimes difficult to handle or estimate. A coarser description of the statistical behavior of $(X,Y)$ is given by the marginal distributions $P_X, P_Y$ and the adjacency relation induced by the joint distribution, where $x$ and $y$ are adjacent if $P(x,y)>0$. We derive a lower bound on the mutual information in terms of these entities. The bound is obtained by viewing the channel from $X$ to $Y$ as a probability distribution on a set of possible actions, where an action determines the output for any possible input, and is independently drawn. We also provide an alternative proof based on convex optimization, that yields a generally tighter bound. Finally, we derive an upper bound on the mutual information in terms of adjacency events between the action and the pair $(X,Y)$, where in this case an action $a$ and a pair $(x,y)$ are adjacent if $y=a(x)$. As an example, we apply our bounds to the binary deletion channel and show that for the special case of an i.i.d. input distribution and a range of deletion probabilities, our lower and upper bounds both outperform the best known bounds for the mutual information.
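To make the two objects in this abstract concrete, the following minimal sketch computes the mutual information of a finite joint distribution $P_{XY}$ and the adjacency relation it induces ($x$ and $y$ adjacent if $P(x,y)>0$). The function names `mutual_information` and `adjacency` and the example distributions are illustrative assumptions; the sketch evaluates the exact mutual information only and does not implement the lower or upper bounds derived in the paper.

```python
import numpy as np


def adjacency(joint):
    """Boolean matrix: entry (x, y) is True iff P(x, y) > 0."""
    return np.asarray(joint, dtype=float) > 0


def mutual_information(joint):
    """Mutual information I(X;Y) in bits for a joint pmf given as a 2-D array."""
    p_xy = np.asarray(joint, dtype=float)
    p_xy = p_xy / p_xy.sum()                # normalise defensively
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal P_X as a column vector
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal P_Y as a row vector
    mask = p_xy > 0                         # only adjacent pairs contribute
    ratio = p_xy[mask] / (p_x @ p_y)[mask]
    return float(np.sum(p_xy[mask] * np.log2(ratio)))


if __name__ == "__main__":
    independent = [[0.25, 0.25], [0.25, 0.25]]
    noiseless = [[0.5, 0.0], [0.0, 0.5]]
    print(mutual_information(independent))  # independent variables
    print(mutual_information(noiseless))    # Y determines X
    print(adjacency(noiseless))
```

Restricting the sum to adjacent pairs is what keeps the logarithm well defined: non-adjacent pairs have zero probability and contribute nothing to the mutual information.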

Information Theory

Neural Mutual Information Estimation for Channel Coding: State-of-the-Art Estimators, Analysis, and Performance Comparison

Rick Fritschek, Rafael F. Schaefer, Gerhard Wunder, 2020

Deep learning based physical layer design, i.e., using dense neural networks as encoders and decoders, has received considerable interest recently. However, while such an approach is naturally training data-driven, actions of the wireless channel are mimicked using standard channel models, which only partially reflect the physical ground truth. Very recently, neural network based mutual information (MI) estimators have been proposed that directly extract channel actions from the input-output measurements and feed these outputs into the channel encoder. This is a promising direction as such a new design paradigm is fully adaptive and training data-based. This paper implements further recent improvements of such MI estimators, analyzes theoretically their suitability for the channel coding problem, and compares their performance. To this end, a new MI estimator using a "reverse Jensen" approach is proposed.

Information Theory

A Mutual Information Approach to Calculating Nonlinearity

Reginald D. Smith, 2015

A new method to measure nonlinear dependence between two variables is described using mutual information to analyze the separate linear and nonlinear components of dependence. This technique, which gives an exact value for the proportion of linear dependence, is then compared with another common test for linearity, the Brock, Dechert and Scheinkman (BDS) test.

Information Theory
