
Ricci curvature for parametric statistics via optimal transport

Added by Wuchen Li
Publication date: 2018
Language: English





We elaborate the notion of a Ricci curvature lower bound for parametrized statistical models. Following the seminal ideas of Lott-Sturm-Villani, we define this notion based on the geodesic convexity of the Kullback-Leibler divergence in a Wasserstein statistical manifold, that is, a manifold of probability distributions endowed with a Wasserstein metric tensor structure. Under these definitions, the Ricci curvature is related to both information geometry and Wasserstein geometry. These definitions allow us to formulate bounds on the convergence rate of Wasserstein gradient flows and information functional inequalities in parameter space. We discuss examples of Ricci curvature lower bounds and convergence rates in exponential family models.
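Schematically, and with notation assumed here rather than quoted from the abstract, the definition mirrors the Lott-Sturm-Villani criterion: the model has Ricci curvature bounded below by $\lambda$ when the Kullback-Leibler divergence is $\lambda$-geodesically convex in the Wasserstein statistical manifold,

$$\operatorname{Hess}_W \, \mathrm{D}_{\mathrm{KL}}(p_\theta \,\|\, p_{\theta^*}) \;\succeq\; \lambda \, G_W(\theta),$$

where $G_W(\theta)$ is the Wasserstein metric tensor on parameter space. By the standard argument for gradient flows of $\lambda$-convex functionals, a bound with $\lambda > 0$ yields exponential convergence, $\mathrm{D}_{\mathrm{KL}}(p_{\theta_t} \,\|\, p_{\theta^*}) \le e^{-2\lambda t} \, \mathrm{D}_{\mathrm{KL}}(p_{\theta_0} \,\|\, p_{\theta^*})$, along the Wasserstein gradient flow.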



Related research

Wuchen Li, Guido Montúfar (2018)
We study a natural Wasserstein gradient flow on manifolds of probability distributions with discrete sample spaces. We derive the Riemannian structure for the probability simplex from the dynamical formulation of the Wasserstein distance on a weighted graph. We pull back the geometric structure to the parameter space of any given probability model, which allows us to define a natural gradient flow there. In contrast to the natural Fisher-Rao gradient, the natural Wasserstein gradient incorporates a ground metric on sample space. We illustrate the analysis of elementary exponential family examples and demonstrate an application of the Wasserstein natural gradient to maximum likelihood estimation.
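The natural-gradient update described above can be made concrete in a few lines. The following minimal Python sketch preconditions the Euclidean gradient by the inverse of the pulled-back metric tensor; grad_loss and metric_tensor are hypothetical placeholders for a specific model, not code from the paper.

import numpy as np

def natural_gradient_step(theta, grad_loss, metric_tensor, lr=0.1):
    # One explicit-Euler step of the natural gradient flow
    # d(theta)/dt = -G(theta)^{-1} grad L(theta).
    g = grad_loss(theta)        # Euclidean gradient, shape (p,)
    G = metric_tensor(theta)    # pulled-back metric G(theta), shape (p, p)
    d = np.linalg.solve(G, g)   # solve G d = g rather than inverting G
    return theta - lr * d

With the Fisher information matrix in place of the Wasserstein metric tensor, the same step reduces to the classical Fisher-Rao natural gradient; the choice of metric is what injects the ground metric on the sample space.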
Lei Yu (2019)
In this paper, we consider Strassen's version of the optimal transport (OT) problem. That is, we minimize the excess-cost probability (i.e., the probability that the cost is larger than a given value) over all couplings of two given distributions. We derive large deviation, moderate deviation, and central limit theorems for this problem. Our proof is based on Strassen's dual formulation of the OT problem, Sanov's theorem on the large deviation principle (LDP) of empirical measures, as well as the moderate deviation principle (MDP) and central limit theorems (CLT) of empirical measures. In order to apply the LDP, MDP, and CLT to Strassen's OT problem, two nested optimal transport formulas for Strassen's OT problem are derived. Based on these nested formulas and using a splitting technique, we carefully design asymptotically optimal solutions to Strassen's OT problem and its dual formulation.
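For finite sample spaces, the excess-cost probability problem is itself a discrete OT problem with the 0/1 cost 1{c(x, y) > t}, so it can be solved as a linear program over couplings. The sketch below illustrates this reduction; the marginals, cost matrix, and threshold are illustrative assumptions, not data from the paper.

import numpy as np
from scipy.optimize import linprog

def min_excess_cost_prob(p, q, cost, t):
    # Minimize P(cost(X, Y) > t) over all couplings pi of p and q,
    # i.e. discrete OT with the indicator cost 1{cost > t}.
    m, n = cost.shape
    c = (cost > t).astype(float).ravel()  # flatten pi[i, j] to index i*n + j
    A_eq = np.zeros((m + n, m * n))
    for i in range(m):                    # row sums of pi must equal p
        A_eq[i, i * n:(i + 1) * n] = 1.0
    for j in range(n):                    # column sums of pi must equal q
        A_eq[m + j, j::n] = 1.0
    b_eq = np.concatenate([p, q])
    res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0, None))
    return res.fun

For example, with p = [0.5, 0.5], q = [0.3, 0.7], cost |x - y| on {0, 1}, and t = 0.5, the optimal coupling keeps as much mass on the diagonal as the marginals allow, and the minimal excess-cost probability is 0.2, the total variation distance between p and q.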
Vitali Kapovitch (2004)
We give a proof of the fact that the upper and the lower sectional curvature bounds of a complete manifold vary at a bounded rate under the Ricci flow.
This paper studies the optimal rate of estimation in a finite Gaussian location mixture model in high dimensions without separation conditions. We assume that the number of components $k$ is bounded and that the centers lie in a ball of bounded radius, while allowing the dimension $d$ to be as large as the sample size $n$. Extending the one-dimensional result of Heinrich and Kahn [HK2015], we show that the minimax rate of estimating the mixing distribution in Wasserstein distance is $\Theta((d/n)^{1/4} + n^{-1/(4k-2)})$, achieved by an estimator computable in time $O(nd^2 + n^{5/4})$. Furthermore, we show that the mixture density can be estimated at the optimal parametric rate $\Theta(\sqrt{d/n})$ in Hellinger distance and provide a computationally efficient algorithm to achieve this rate in the special case of $k=2$. Both the theoretical and methodological development rely on a careful application of the method of moments. Central to our results is the observation that the information geometry of finite Gaussian mixtures is characterized by the moment tensors of the mixing distribution, whose low-rank structure can be exploited to obtain a sharp local entropy bound.
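As a toy illustration of the method of moments on which these results rest, consider the simplest one-dimensional case: an equal-weight mixture $0.5\,N(\mu_1, 1) + 0.5\,N(\mu_2, 1)$. Since $E[X] = (\mu_1 + \mu_2)/2$ and $E[X^2] = 1 + (\mu_1^2 + \mu_2^2)/2$, the centers are the roots of a quadratic determined by the first two empirical moments. This sketch is only an illustrative special case, not the paper's estimator, which handles unequal weights, general $k$, and high dimensions.

import numpy as np

def two_center_moments(x):
    # Method-of-moments centers for 0.5*N(mu1, 1) + 0.5*N(mu2, 1).
    # With s = mu1 + mu2 and p = mu1 * mu2, the centers solve z^2 - s*z + p = 0.
    m1 = x.mean()
    m2 = (x ** 2).mean()
    s = 2.0 * m1                          # mu1 + mu2
    sum_sq = max(2.0 * (m2 - 1.0), 0.0)   # mu1^2 + mu2^2, clipped for sampling noise
    p = 0.5 * (s ** 2 - sum_sq)           # mu1 * mu2
    disc = max(s ** 2 - 4.0 * p, 0.0)     # (mu1 - mu2)^2, clipped for sampling noise
    r = np.sqrt(disc)
    return (s - r) / 2.0, (s + r) / 2.0

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-1.0, 1.0, 5000), rng.normal(2.0, 1.0, 5000)])
print(two_center_moments(x))              # approximately (-1, 2)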
While Kolmogorov complexity is the accepted absolute measure of information content of an individual finite object, a similarly absolute notion is needed for the relation between an individual data sample and an individual model summarizing the information in the data, for example, a finite set (or probability distribution) where the data sample typically came from. The statistical theory based on such relations between individual objects can be called algorithmic statistics, in contrast to classical statistical theory that deals with relations between probabilistic ensembles. We develop the algorithmic theory of statistic, sufficient statistic, and minimal sufficient statistic. This theory is based on two-part codes consisting of the code for the statistic (the model summarizing the regularity, the meaningful information, in the data) and the model-to-data code. In contrast to the situation in probabilistic statistical theory, the algorithmic relation of (minimal) sufficiency is an absolute relation between the individual model and the individual data sample. We distinguish implicit and explicit descriptions of the models. We give characterizations of algorithmic (Kolmogorov) minimal sufficient statistic for all data samples for both description modes -- in the explicit mode under some constraints. We also strengthen and elaborate earlier results on the "Kolmogorov structure function" and "absolutely non-stochastic objects" -- those rare objects for which the simplest models that summarize their relevant information (minimal sufficient statistics) are at least as complex as the objects themselves. We demonstrate a close relation between the probabilistic notions and the algorithmic ones.
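The two-part code at the heart of this theory can be summarized in a single inequality; this is a standard rendering of the idea, not a formula quoted from the paper. For a finite set $S$ containing the data sample $x$,

$$K(x) \;\le\; K(S) + \log_2 |S| + O(1),$$

where $K$ denotes Kolmogorov complexity: $K(S)$ bits describe the model, and $\log_2 |S|$ bits index $x$ within it. The set $S$ is a sufficient statistic for $x$ when this two-part description is, up to an additive constant, as short as the best one-part description of $x$.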