No Arabic abstract
This paper studies forward and reverse projections for the R{e}nyi divergence of order $alpha in (0, infty)$ on $alpha$-convex sets. The forward projection on such a set is motivated by some works of Tsallis {em et al.} in statistical physics, and the reverse projection is motivated by robust statistics. In a recent work, van Erven and Harremoes proved a Pythagorean inequality for R{e}nyi divergences on $alpha$-convex sets under the assumption that the forward projection exists. Continuing this study, a sufficient condition for the existence of forward projection is proved for probability measures on a general alphabet. For $alpha in (1, infty)$, the proof relies on a new Apollonius theorem for the Hellinger divergence, and for $alpha in (0,1)$, the proof relies on the Banach-Alaoglu theorem from functional analysis. Further projection results are then obtained in the finite alphabet setting. These include a projection theorem on a specific $alpha$-convex set, which is termed an {em $alpha$-linear family}, generalizing a result by Csiszar for $alpha eq 1$. The solution to this problem yields a parametric family of probability measures which turns out to be an extension of the exponential family, and it is termed an {em $alpha$-exponential family}. An orthogonality relationship between the $alpha$-exponential and $alpha$-linear families is established, and it is used to turn the reverse projection on an $alpha$-exponential family into a forward projection on a $alpha$-linear family. This paper also proves a convergence result of an iterative procedure used to calculate the forward projection on an intersection of a finite number of $alpha$-linear families.
This paper starts by considering the minimization of the Renyi divergence subject to a constraint on the total variation distance. Based on the solution of this optimization problem, the exact locus of the points $bigl( D(Q|P_1), D(Q|P_2) bigr)$ is determined when $P_1, P_2, Q$ are arbitrary probability measures which are mutually absolutely continuous, and the total variation distance between $P_1$ and $P_2$ is not below a given value. It is further shown that all the points of this convex region are attained by probability measures which are defined on a binary alphabet. This characterization yields a geometric interpretation of the minimal Chernoff information subject to a constraint on the total variation distance. This paper also derives an exponential upper bound on the performance of binary linear block codes (or code ensembles) under maximum-likelihood decoding. Its derivation relies on the Gallager bounding technique, and it reproduces the Shulman-Feder bound as a special case. The bound is expressed in terms of the Renyi divergence from the normalized distance spectrum of the code (or the average distance spectrum of the ensemble) to the binomially distributed distance spectrum of the capacity-achieving ensemble of random block codes. This exponential bound provides a quantitative measure of the degradation in performance of binary linear block codes (or code ensembles) as a function of the deviation of their distance spectra from the binomial distribution. An efficient use of this bound is considered.
Renyi divergence is related to Renyi entropy much like Kullback-Leibler divergence is related to Shannons entropy, and comes up in many settings. It was introduced by Renyi as a measure of information that satisfies almost the same axioms as Kullback-Leibler divergence, and depends on a parameter that is called its order. In particular, the Renyi divergence of order 1 equals the Kullback-Leibler divergence. We review and extend the most important properties of Renyi divergence and Kullback-Leibler divergence, including convexity, continuity, limits of $sigma$-algebras and the relation of the special order 0 to the Gaussian dichotomy and contiguity. We also show how to generalize the Pythagorean inequality to orders different from 1, and we extend the known equivalence between channel capacity and minimax redundancy to continuous channel inputs (for all orders) and present several other minimax results.
A new upper bound on the relative entropy is derived as a function of the total variation distance for probability measures defined on a common finite alphabet. The bound improves a previously reported bound by Csiszar and Talata. It is further extended to an upper bound on the Renyi divergence of an arbitrary non-negative order (including $infty$) as a function of the total variation distance.
For gambling on horses, a one-parameter family of utility functions is proposed, which contains Kellys logarithmic criterion and the expected-return criterion as special cases. The strategies that maximize the utility function are derived, and the connection to the Renyi divergence is shown. Optimal strategies are also derived when the gambler has some side information; this setting leads to a novel conditional Renyi divergence.
In part I of this two-part work, certain minimization problems based on a parametric family of relative entropies (denoted $mathscr{I}_{alpha}$) were studied. Such minimizers were called forward $mathscr{I}_{alpha}$-projections. Here, a complementary class of minimization problems leading to the so-called reverse $mathscr{I}_{alpha}$-projections are studied. Reverse $mathscr{I}_{alpha}$-projections, particularly on log-convex or power-law families, are of interest in robust estimation problems ($alpha >1$) and in constrained compression settings ($alpha <1$). Orthogonality of the power-law family with an associated linear family is first established and is then exploited to turn a reverse $mathscr{I}_{alpha}$-projection into a forward $mathscr{I}_{alpha}$-projection. The transformed problem is a simpler quasiconvex minimization subject to linear constraints.