
$(f,\Gamma)$-Divergences: Interpolating between $f$-Divergences and Integral Probability Metrics

 Added by Jeremiah Birrell
 Publication date 2020
Language: English





We develop a rigorous and general framework for constructing information-theoretic divergences that subsume both $f$-divergences and integral probability metrics (IPMs), such as the $1$-Wasserstein distance. We prove under which assumptions these divergences, hereafter referred to as $(f,\Gamma)$-divergences, provide a notion of 'distance' between probability measures and show that they can be expressed as a two-stage mass-redistribution/mass-transport process. The $(f,\Gamma)$-divergences inherit features from IPMs, such as the ability to compare distributions which are not absolutely continuous, as well as from $f$-divergences, namely the strict concavity of their variational representations and the ability to control heavy-tailed distributions for particular choices of $f$. When combined, these features establish a divergence with improved properties for estimation, statistical learning, and uncertainty quantification applications. Using statistical learning as an example, we demonstrate their advantage in training generative adversarial networks (GANs) for heavy-tailed, not-absolutely continuous sample distributions. We also show improved performance and stability over gradient-penalized Wasserstein GAN in image generation.
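The contrast the abstract draws between $f$-divergences and IPMs can be seen in a small discrete example. The sketch below is not the paper's $(f,\Gamma)$-divergence; it only computes the two families it interpolates between, using the KL divergence (the $f$-divergence with $f(t) = t \log t$) and the $1$-Wasserstein distance on the real line. The function names, the support points, and the two distributions are assumptions chosen for illustration.

```python
import math

def kl_divergence(q, p):
    """KL(Q||P) = sum_i q_i * log(q_i / p_i), i.e. the f-divergence with f(t) = t log t.

    Returns math.inf when Q puts mass where P has none
    (Q is not absolutely continuous with respect to P).
    """
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue  # convention: 0 * log(0 / p) = 0
        if pi == 0.0:
            return math.inf
        total += qi * math.log(qi / pi)
    return total

def wasserstein1(q, p, points):
    """1-Wasserstein distance between two distributions on sorted real points.

    In one dimension it equals the integral of |CDF_Q - CDF_P|.
    """
    total, cdf_q, cdf_p = 0.0, 0.0, 0.0
    for i in range(len(points) - 1):
        cdf_q += q[i]
        cdf_p += p[i]
        total += abs(cdf_q - cdf_p) * (points[i + 1] - points[i])
    return total

# Hypothetical example: Q is P shifted one unit to the right.
points = [0.0, 1.0, 2.0]
p = [0.5, 0.5, 0.0]
q = [0.0, 0.5, 0.5]

print(kl_divergence(q, p))         # infinite: Q has mass at 2.0, P does not
print(wasserstein1(q, p, points))  # finite: reflects the size of the shift
```

Here the KL divergence is infinite because the supports do not match, while the Wasserstein distance stays finite and tracks how far the mass moved. This is the IPM feature the abstract says the $(f,\Gamma)$-divergences inherit.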




The estimation of an f-divergence between two probability distributions based on samples is a fundamental problem in statistics and machine learning. Most works study this problem under very weak assumptions, in which case it is provably hard. We consider the case of stronger structural assumptions that are commonly satisfied in modern machine learning, including representation learning and generative modelling with autoencoder architectures. Under these assumptions we propose and study an estimator that can be easily implemented, works well in high dimensions, and enjoys faster rates of convergence. We verify the behavior of our estimator empirically in both synthetic and real-data experiments, and discuss its direct implications for total correlation, entropy, and mutual information estimation.
274 - Fumio Hiai 2018
As a continuation of the paper [20] on standard $f$-divergences, we make a systematic study of maximal $f$-divergences in general von Neumann algebras. For maximal $f$-divergences, apart from their definition based on Haagerup's $L^1$-space, we present the general integral expression and the variational expression in terms of reverse tests. From this definition and these expressions we prove important properties of maximal $f$-divergences, for instance, the monotonicity inequality, the joint convexity, the lower semicontinuity, and the martingale convergence. The inequality between the standard and the maximal $f$-divergences is also given.
163 - Fumio Hiai 2018
We make a systematic study of standard $f$-divergences in general von Neumann algebras. An important ingredient of our study is to extend Kosaki's variational expression of the relative entropy to an arbitrary standard $f$-divergence, from which most of the important properties of standard $f$-divergences follow immediately. In a similar manner we give a comprehensive exposition on the Rényi divergence in von Neumann algebras. Some results on relative Hamiltonians formerly studied by Araki and Donald are improved as a by-product.
166 - Igal Sason 2018
This paper focuses on $f$-divergences and makes three main contributions. The first introduces integral representations of a general $f$-divergence by means of the relative information spectrum. The second provides a new approach for the derivation of $f$-divergence inequalities, and it exemplifies their utility in the setup of Bayesian binary hypothesis testing. The third further studies the local behavior of $f$-divergences.
94 - Mark M. Wilde 2021
The quantum relative entropy is a measure of the distinguishability of two quantum states, and it is a unifying concept in quantum information theory: many information measures such as entropy, conditional entropy, mutual information, and entanglement measures can be realized from it. As such, there has been broad interest in generalizing the notion to further understand its most basic properties, one of which is the data processing inequality. The quantum $f$-divergence of Petz is one generalization of the quantum relative entropy, and it also leads to other relative entropies, such as the Petz-Rényi relative entropies. In this contribution, I introduce the optimized quantum $f$-divergence as a related generalization of quantum relative entropy. I prove that it satisfies the data processing inequality, and the method of proof relies upon the operator Jensen inequality, similar to Petz's original approach. Interestingly, the sandwiched Rényi relative entropies are particular examples of the optimized $f$-divergence. Thus, one benefit of this approach is that there is now a single, unified approach for establishing the data processing inequality for both the Petz-Rényi and sandwiched Rényi relative entropies, for the full range of parameters for which it is known to hold.
