
Probability Mass Exclusions and the Directed Components of Pointwise Mutual Information

Posted by: Conor Finn
Publication date: 2018
Research field: Information engineering
Paper language: English





This paper examines how an event from one random variable provides pointwise mutual information about an event from another variable via probability mass exclusions. We start by introducing probability mass diagrams, which provide a visual representation of how a prior distribution is transformed into a posterior distribution through exclusions. With the aid of these diagrams, we identify two distinct types of probability mass exclusions, namely informative and misinformative exclusions. Then, motivated by Fano's derivation of the pointwise mutual information, we propose four postulates which aim to decompose the pointwise mutual information into two separate informational components: a non-negative term associated with the informative exclusions and a non-positive term associated with the misinformative exclusions. This yields a novel derivation of a familiar decomposition of the pointwise mutual information into entropic components. We conclude by discussing the relevance of considering information in terms of probability mass exclusions to the ongoing effort to decompose multivariate information.
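For reference, the familiar entropic decomposition referred to above is presumably the standard pointwise identity (standard notation assumed here rather than taken from the paper itself): $i(x;y) = \log \frac{p(x|y)}{p(x)} = \log \frac{p(y|x)}{p(y)} = h(y) - h(y|x)$, where $h(y) = -\log p(y) \geq 0$ and $h(y|x) = -\log p(y|x) \geq 0$, so the first term supplies a non-negative component and $-h(y|x)$ a non-positive one. One of the related papers below refers to these two unsigned entropic components as the specificity and the ambiguity.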




Read also

Chongjun Ouyang, Sheng Wu (2019)
To provide an efficient approach to characterize the input-output mutual information (MI) under the additive white Gaussian noise (AWGN) channel, this short report fits the curves of the exact MI under multilevel quadrature amplitude modulation (M-QAM) signal inputs via multi-exponential decay curve fitting (M-EDCF). Even though the expression for the instantaneous MI versus signal-to-noise ratio (SNR) is complex and the integral it contains is intractable, the newly developed fitting formula has a neat and compact form, offering high precision at low complexity. This approximation formula for the MI can facilitate performance analysis of practical communication systems with discrete inputs.
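The report's exact M-EDCF expression and fitted coefficients are not given in this abstract. Purely as an illustrative sketch of the idea (the function names, the two-term exponential template, and the initial guesses below are assumptions, not the report's actual formula), one can Monte-Carlo-estimate the exact M-QAM mutual information over AWGN and fit a multi-exponential decay to it with SciPy:

import numpy as np
from scipy.optimize import curve_fit

def qam_mi_mc(snr_db, M=4, n_mc=20000, seed=0):
    # Monte Carlo estimate of I(X;Y) in bits for unit-energy M-QAM over AWGN.
    rng = np.random.default_rng(seed)
    m = int(np.sqrt(M))
    pam = np.arange(-(m - 1), m, 2)
    const = (pam[:, None] + 1j * pam[None, :]).ravel()
    const = const / np.sqrt(np.mean(np.abs(const) ** 2))    # unit average symbol energy
    n0 = 10.0 ** (-snr_db / 10.0)                           # noise power, with SNR = Es/N0
    noise = np.sqrt(n0 / 2) * (rng.standard_normal(n_mc) + 1j * rng.standard_normal(n_mc))
    x = const[rng.integers(0, M, n_mc)]
    y = x + noise
    # I = log2(M) - E[ log2 sum_x' exp(-(|y - x'|^2 - |y - x|^2) / N0) ]
    d = np.abs(y[:, None] - const[None, :]) ** 2
    log_sum = np.log2(np.sum(np.exp(-(d - np.abs(noise)[:, None] ** 2) / n0), axis=1))
    return np.log2(M) - np.mean(log_sum)

def medcf(snr_db, a1, b1, a2, b2):
    # Hypothetical two-term multi-exponential decay template for 4-QAM (log2 M = 2 bits).
    snr = 10.0 ** (np.asarray(snr_db) / 10.0)
    return 2.0 - a1 * np.exp(-b1 * snr) - a2 * np.exp(-b2 * snr)

snr_grid = np.linspace(-10, 20, 16)
mi_exact = np.array([qam_mi_mc(s) for s in snr_grid])
popt, _ = curve_fit(medcf, snr_grid, mi_exact, p0=[1.0, 0.3, 1.0, 1.5],
                    bounds=(0, np.inf), maxfev=10000)
print("fitted (a1, b1, a2, b2):", popt)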
What are the distinct ways in which a set of predictor variables can provide information about a target variable? When does a variable provide unique information, when do variables share redundant information, and when do variables combine synergistically to provide complementary information? The redundancy lattice from the partial information decomposition of Williams and Beer provided a promising glimpse at the answer to these questions. However, this structure was constructed using a much criticised measure of redundant information, and despite sustained research, no completely satisfactory replacement measure has been proposed. In this paper, we take a different approach, applying the axiomatic derivation of the redundancy lattice to a single realisation from a set of discrete variables. To overcome the difficulty associated with signed pointwise mutual information, we apply this decomposition separately to the unsigned entropic components of pointwise mutual information, which we refer to as the specificity and ambiguity. This yields a separate redundancy lattice for each component. Then, based upon an operational interpretation of redundancy, we define measures of redundant specificity and ambiguity, enabling us to evaluate the partial information atoms in each lattice. These atoms can be recombined to yield the sought-after multivariate information decomposition. We apply this framework to canonical examples from the literature and discuss the results and the various properties of the decomposition. In particular, the pointwise decomposition using specificity and ambiguity satisfies a chain rule over target variables, which provides new insights into the so-called two-bit-copy example.
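As a quick illustration of the specificity and ambiguity on a single realisation, consider the two-bit-copy example mentioned above (a minimal sketch under the reading that, for a source event $s$ and target event $t$, the specificity is $h(s) = -\log p(s)$ and the ambiguity is $h(s|t) = -\log p(s|t)$): let $S_1, S_2$ be independent fair bits and $T = (S_1, S_2)$, and take the realisation $s_1 = 0$, $s_2 = 0$, $t = (0,0)$. Each source event then has specificity $h(s_i) = 1$ bit and ambiguity $h(s_i|t) = 0$ bits, so each provides exactly $i(s_i;t) = 1$ bit of pointwise information; the two lattices are meant to settle how much of these two fully specific bits is redundant and how much is unique.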
Estimators for mutual information are typically biased. However, in the case of the Kozachenko-Leonenko estimator for metric spaces, a type of nearest neighbour estimator, it is possible to calculate the bias explicitly.
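For context, here is a minimal sketch of one common form of the Kozachenko-Leonenko k-nearest-neighbour entropy estimator in the Euclidean setting (conventions vary slightly across the literature, and the bias analysis discussed in the paper is not reproduced here):

import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma, gammaln

def kl_entropy(x, k=3):
    # Kozachenko-Leonenko k-NN differential entropy estimate in nats.
    # Assumes the samples are distinct (a zero nearest-neighbour distance breaks the log).
    x = np.asarray(x, dtype=float)
    n, d = x.shape
    # distance from each sample to its k-th nearest neighbour, excluding the sample itself
    r = cKDTree(x).query(x, k=k + 1)[0][:, -1]
    log_vd = (d / 2) * np.log(np.pi) - gammaln(d / 2 + 1)  # log volume of the unit d-ball
    return digamma(n) - digamma(k) + log_vd + d * np.mean(np.log(r))

# usage: compare against the true entropy of a standard 2-D Gaussian, log(2*pi*e) ~ 2.838 nats
samples = np.random.default_rng(1).standard_normal((5000, 2))
print(kl_entropy(samples, k=5))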
The mutual information between two jointly distributed random variables $X$ and $Y$ is a functional of the joint distribution $P_{XY},$ which is sometimes difficult to handle or estimate. A coarser description of the statistical behavior of $(X,Y)$ is given by the marginal distributions $P_X, P_Y$ and the adjacency relation induced by the joint distribution, where $x$ and $y$ are adjacent if $P(x,y)>0$. We derive a lower bound on the mutual information in terms of these entities. The bound is obtained by viewing the channel from $X$ to $Y$ as a probability distribution on a set of possible actions, where an action determines the output for any possible input, and is independently drawn. We also provide an alternative proof based on convex optimization, that yields a generally tighter bound. Finally, we derive an upper bound on the mutual information in terms of adjacency events between the action and the pair $(X,Y)$, where in this case an action $a$ and a pair $(x,y)$ are adjacent if $y=a(x)$. As an example, we apply our bounds to the binary deletion channel and show that for the special case of an i.i.d. input distribution and a range of deletion probabilities, our lower and upper bounds both outperform the best known bounds for the mutual information.
Rami Atar, Neri Merhav (2014)
A well-known technique for estimating probabilities of rare events in general, and in information theory in particular (used, e.g., in the sphere-packing bound), is to find a reference probability measure under which the event of interest has probability of order one and to estimate the probability in question by means of the Kullback-Leibler divergence. A method has recently been proposed in [2] that can be viewed as an extension of this idea, in which the probability under the reference measure may itself be decaying exponentially and the Renyi divergence is used instead. The purpose of this paper is to demonstrate the usefulness of this approach in various information-theoretic settings. For the problem of channel coding, we provide a general methodology for obtaining matched, mismatched and robust error exponent bounds, as well as new results in a variety of particular channel models. Other applications we address include rate-distortion coding and the problem of guessing.