This paper is focused on derivations of data-processing and majorization inequalities for $f$-divergences, and their applications in information theory and statistics. To keep the material accessible, the main results are first introduced without proofs, followed by exemplifications of the theorems together with further related analytical results, interpretations, and information-theoretic applications. One application refers to the performance analysis of list decoding with either fixed or variable list sizes; some earlier bounds on the list decoding error probability are reproduced in a unified way, and new bounds are obtained and exemplified numerically. Another application is related to a study of the quality of approximating a probability mass function, induced by the leaves of a Tunstall tree, by an equiprobable distribution. The compression rates of finite-length Tunstall codes are further analyzed for asserting their closeness to the Shannon entropy of a memoryless and stationary discrete source. Almost all the analysis is relegated to the appendices, which form a major part of this manuscript.
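For reference, the two objects at the core of this abstract can be written out explicitly (the notation below is standard and is not quoted from the paper itself): for a convex function $f\colon(0,\infty)\to\mathbb{R}$ with $f(1)=0$, the $f$-divergence between probability mass functions $P$ and $Q$ is
\[
D_f(P\|Q) \;=\; \sum_{x} Q(x)\, f\!\left(\frac{P(x)}{Q(x)}\right),
\]
and the data-processing inequality states that if $P_Y$ and $Q_Y$ are obtained by passing $P_X$ and $Q_X$ through the same stochastic transformation (channel) $W_{Y|X}$, then
\[
D_f(P_Y \,\|\, Q_Y) \;\le\; D_f(P_X \,\|\, Q_X).
\]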
This work provides data-processing and majorization inequalities for $f$-divergences, and it considers some of their applications to coding problems. It also provides tight bounds on the Rényi entropy of a function of a discrete random variable with a finite number of possible values, where the considered function is not one-to-one; their derivation is based on majorization and the Schur-concavity of the Rényi entropy. One application of the $f$-divergence inequalities refers to the performance analysis of list decoding with either fixed or variable list sizes; some earlier bounds on the list decoding error probability are reproduced in a unified way, and new bounds are obtained and exemplified numerically. Another application is related to a study of the quality of approximating a probability mass function, which is induced by the leaves of a Tunstall tree, by an equiprobable distribution. The compression rates of finite-length Tunstall codes are further analyzed for asserting their closeness to the Shannon entropy of a memoryless and stationary discrete source. In view of the tight bounds for the Rényi entropy and the work by Campbell, non-asymptotic bounds are derived for lossless data compression of discrete memoryless sources.
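Since majorization and Schur-concavity recur in these abstracts, a brief reminder of the standard definitions may be helpful (generic notation, not taken from the paper): a probability vector $p$ majorizes $q$, written $p \succ q$, if
\[
\sum_{i=1}^{k} p_{[i]} \;\ge\; \sum_{i=1}^{k} q_{[i]}, \qquad k = 1, \dots, n-1,
\]
where $p_{[1]} \ge p_{[2]} \ge \dots \ge p_{[n]}$ denotes the decreasing rearrangement of the entries (both vectors sum to 1); a function $\Phi$ is Schur-concave if $p \succ q$ implies $\Phi(p) \le \Phi(q)$.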
This paper is focused on $f$-divergences and consists of three main contributions. The first introduces integral representations of a general $f$-divergence by means of the relative information spectrum. The second provides a new approach for the derivation of $f$-divergence inequalities, and it exemplifies their utility in the setup of Bayesian binary hypothesis testing. The third further studies the local behavior of $f$-divergences.
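As a brief reminder of the quantity underlying the first contribution (the notation follows common conventions in the information-spectrum literature and may differ from the paper's), the relative information and the relative information spectrum are
\[
\imath_{P\|Q}(x) \;=\; \log \frac{\mathrm{d}P}{\mathrm{d}Q}(x), \qquad
\mathbb{F}_{P\|Q}(\beta) \;=\; \mathbb{P}\bigl[\, \imath_{P\|Q}(X) \le \beta \,\bigr], \quad X \sim P,
\]
i.e., the relative information spectrum is the cumulative distribution function of the relative information when its argument is distributed according to $P$.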
This paper provides tight bounds on the Rényi entropy of a function of a discrete random variable with a finite number of possible values, where the considered function is not one-to-one. To that end, a tight lower bound on the Rényi entropy of a discrete random variable with a finite support is derived as a function of the size of the support and the ratio of the maximal to minimal probability masses. This work was inspired by the recently published paper by Cicalese et al., which is focused on the Shannon entropy, and it strengthens and generalizes the results of that paper to Rényi entropies of arbitrary positive orders. In view of these generalized bounds and the works by Arikan and Campbell, non-asymptotic bounds are derived for guessing moments and lossless data compression of discrete memoryless sources.
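For concreteness, the quantity being bounded is the Rényi entropy of order $\alpha$ (standard definition, not quoted from the paper): for a random variable $X$ taking values with probabilities $p_1, \dots, p_n$,
\[
H_\alpha(X) \;=\; \frac{1}{1-\alpha}\, \log\!\left( \sum_{i=1}^{n} p_i^{\alpha} \right), \qquad \alpha \in (0,1) \cup (1,\infty),
\]
which recovers the Shannon entropy in the limit $\alpha \to 1$.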
This paper develops systematic approaches to obtain $f$-divergence inequalities, dealing with pairs of probability measures defined on arbitrary alphabets. Functional domination is one such approach, where special emphasis is placed on finding the best possible constant upper bounding a ratio of $f$-divergences. Another approach used for the derivation of bounds among $f$-divergences relies on moment inequalities and the logarithmic-convexity property, which results in tight bounds on the relative entropy and Bhattacharyya distance in terms of $\chi^2$ divergences. A rich variety of bounds are shown to hold under boundedness assumptions on the relative information. Special attention is devoted to the total variation distance and its relation to the relative information and relative entropy, including reverse Pinsker inequalities, as well as to the $E_\gamma$ divergence, which generalizes the total variation distance. Pinsker's inequality is extended to this type of $f$-divergence, a result which leads to an inequality linking the relative entropy and the relative information spectrum. Integral expressions of the Rényi divergence in terms of the relative information spectrum are derived, leading to bounds on the Rényi divergence in terms of either the variational distance or the relative entropy.
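Two of the quantities mentioned above, written out for reference with the relative entropy in nats (standard forms; the notation is not copied from the paper): Pinsker's inequality reads
\[
D(P\|Q) \;\ge\; 2\, \delta(P,Q)^2, \qquad \delta(P,Q) \;=\; \tfrac{1}{2} \sum_{x} \bigl| P(x) - Q(x) \bigr|,
\]
and the $E_\gamma$ divergence is the $f$-divergence generated by $f_\gamma(t) = (t-\gamma)^{+}$, namely
\[
E_\gamma(P\|Q) \;=\; \sum_{x} \bigl( P(x) - \gamma\, Q(x) \bigr)^{+}, \qquad \gamma \ge 1,
\]
which reduces to the total variation distance $\delta(P,Q)$ at $\gamma = 1$.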
During the last two decades, concentration of measure has been a subject of various exciting developments in convex geometry, functional analysis, statistical physics, high-dimensional statistics, probability theory, information theory, communications and coding theory, computer science, and learning theory. One common theme that emerges in these fields is probabilistic stability: complicated, nonlinear functions of a large number of independent or weakly dependent random variables often tend to concentrate sharply around their expected values. Information theory plays a key role in the derivation of concentration inequalities. Indeed, the entropy method and the approach based on transportation-cost inequalities are two major information-theoretic paths toward proving concentration. This brief survey is based on a recent monograph by the authors in Foundations and Trends in Communications and Information Theory (available online at http://arxiv.org/pdf/1212.4663v8.pdf), and on a tutorial given by the authors at ISIT 2015. It introduces information theorists to three main techniques for deriving concentration inequalities: the martingale method, the entropy method, and the transportation-cost inequalities. Some applications in information theory, communications, and coding theory are used to illustrate the main ideas.
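As a concrete illustration of the martingale method listed above (a standard example, not a result specific to the monograph), the bounded-differences inequality of McDiarmid states that if $X_1, \dots, X_n$ are independent and $f$ changes by at most $c_i$ when only its $i$-th argument is altered, then for every $t > 0$
\[
\mathbb{P}\Bigl[ \bigl| f(X_1,\dots,X_n) - \mathbb{E}\, f(X_1,\dots,X_n) \bigr| \ge t \Bigr]
\;\le\; 2 \exp\!\left( - \frac{2 t^2}{\sum_{i=1}^{n} c_i^2} \right),
\]
a bound that is typically proved via the Azuma--Hoeffding inequality for martingales with bounded differences.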