$(f,\Gamma)$-Divergences: Interpolating between $f$-Divergences and Integral Probability Metrics

Added by Jeremiah Birrell

Publication date 2020

fields Mathematical Statistics Informatics Engineering

Research language English

Authors Jeremiah Birrell - Paul Dupuis - Markos A. Katsoulakis

Machine Learning Machine Learning

We develop a rigorous and general framework for constructing information-theoretic divergences that subsume both $f$-divergences and integral probability metrics (IPMs), such as the $1$-Wasserstein distance. We prove under which assumptions these divergences, hereafter referred to as $(f,\Gamma)$-divergences, provide a notion of 'distance' between probability measures and show that they can be expressed as a two-stage mass-redistribution/mass-transport process. The $(f,\Gamma)$-divergences inherit features from IPMs, such as the ability to compare distributions which are not absolutely continuous, as well as from $f$-divergences, namely the strict concavity of their variational representations and the ability to control heavy-tailed distributions for particular choices of $f$. When combined, these features establish a divergence with improved properties for estimation, statistical learning, and uncertainty quantification applications. Using statistical learning as an example, we demonstrate their advantage in training generative adversarial networks (GANs) for heavy-tailed, not-absolutely continuous sample distributions. We also show improved performance and stability over gradient-penalized Wasserstein GAN in image generation.
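The contrast the abstract draws between $f$-divergences and IPMs can be seen on a toy example. The sketch below is not the paper's construction: it only compares the two endpoints, the KL divergence (an $f$-divergence) and the $1$-Wasserstein distance (an IPM), for discrete distributions on a shared uniform 1-D grid. The function names `kl_divergence` and `wasserstein_1` and the grid setup are assumptions made for illustration.

```python
import math

def kl_divergence(q, p):
    """KL(Q || P) on a shared grid; returns inf when Q puts mass where P has none."""
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue  # the term 0 * log(0 / pi) is taken to be 0
        if pi == 0.0:
            return math.inf  # Q is not absolutely continuous w.r.t. P
        total += qi * math.log(qi / pi)
    return total

def wasserstein_1(q, p, spacing=1.0):
    """1-Wasserstein distance on a uniform 1-D grid via the CDF-difference formula."""
    cdf_gap = 0.0
    total = 0.0
    for qi, pi in zip(q, p):
        cdf_gap += qi - pi          # running difference of the two CDFs
        total += abs(cdf_gap) * spacing
    return total

# Two point masses with disjoint supports, two grid cells apart.
q_disjoint = [1.0, 0.0, 0.0]
p_disjoint = [0.0, 0.0, 1.0]

# Overlapping supports: both quantities are finite.
q_overlap = [0.5, 0.5, 0.0]
p_overlap = [0.25, 0.5, 0.25]

if __name__ == "__main__":
    print(kl_divergence(q_disjoint, p_disjoint), wasserstein_1(q_disjoint, p_disjoint))
    print(kl_divergence(q_overlap, p_overlap), wasserstein_1(q_overlap, p_overlap))
```

For the disjoint pair the KL divergence is infinite while the $1$-Wasserstein distance still reports how far the mass has to move, which is the IPM feature the abstract says $(f,\Gamma)$-divergences inherit.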

Practical and Consistent Estimation of f-Divergences

Paul K. Rubenstein, Olivier Bousquet, Josip Djolonga 2019

The estimation of an f-divergence between two probability distributions based on samples is a fundamental problem in statistics and machine learning. Most works study this problem under very weak assumptions, in which case it is provably hard. We consider the case of stronger structural assumptions that are commonly satisfied in modern machine learning, including representation learning and generative modelling with autoencoder architectures. Under these assumptions we propose and study an estimator that can be easily implemented, works well in high dimensions, and enjoys faster rates of convergence. We verify the behavior of our estimator empirically in both synthetic and real-data experiments, and discuss its direct implications for total correlation, entropy, and mutual information estimation.

Machine Learning Information Theory Machine Learning

Quantum $f$-divergences in von Neumann algebras II. Maximal $f$-divergences

Fumio Hiai 2018

As a continuation of the paper [20] on standard $f$-divergences, we make a systematic study of maximal $f$-divergences in general von Neumann algebras. For maximal $f$-divergences, apart from their definition based on Haagerup's $L^1$-space, we present the general integral expression and the variational expression in terms of reverse tests. From these definitions and expressions we prove important properties of maximal $f$-divergences, for instance, the monotonicity inequality, the joint convexity, the lower semicontinuity, and the martingale convergence. The inequality between the standard and the maximal $f$-divergences is also given.

Mathematical Physics Mathematical Physics Operator Algebras

Quantum $f$-divergences in von Neumann algebras I. Standard $f$-divergences

Fumio Hiai 2018

We make a systematic study of standard $f$-divergences in general von Neumann algebras. An important ingredient of our study is to extend Kosaki's variational expression of the relative entropy to an arbitrary standard $f$-divergence, from which most of the important properties of standard $f$-divergences follow immediately. In a similar manner we give a comprehensive exposition on the Rényi divergence in von Neumann algebras. Some results on relative Hamiltonians formerly studied by Araki and Donald are improved as a by-product.

Mathematical Physics Mathematical Physics Operator Algebras

On $f$-Divergences: Integral Representations, Local Behavior, and Inequalities

Igal Sason 2018

This paper is focused on $f$-divergences, consisting of three main contributions. The first one introduces integral representations of a general $f$-divergence by means of the relative information spectrum. The second part provides a new approach for the derivation of $f$-divergence inequalities, and it exemplifies their utility in the setup of Bayesian binary hypothesis testing. The last part of this paper further studies the local behavior of $f$-divergences.

Information Theory Information Theory Probability

Optimized quantum f-divergences

Mark M. Wilde 2021

The quantum relative entropy is a measure of the distinguishability of two quantum states, and it is a unifying concept in quantum information theory: many information measures such as entropy, conditional entropy, mutual information, and entanglement measures can be realized from it. As such, there has been broad interest in generalizing the notion to further understand its most basic properties, one of which is the data processing inequality. The quantum f-divergence of Petz is one generalization of the quantum relative entropy, and it also leads to other relative entropies, such as the Petz--Rényi relative entropies. In this contribution, I introduce the optimized quantum f-divergence as a related generalization of quantum relative entropy. I prove that it satisfies the data processing inequality, and the method of proof relies upon the operator Jensen inequality, similar to Petz's original approach. Interestingly, the sandwiched Rényi relative entropies are particular examples of the optimized f-divergence. Thus, one benefit of this approach is that there is now a single, unified approach for establishing the data processing inequality for both the Petz--Rényi and sandwiched Rényi relative entropies, for the full range of parameters for which it is known to hold.

Quantum Physics Information Theory Mathematical Physics
