No Arabic abstract
In this paper, we investigate properties of entropy-penalized Wasserstein barycenters introduced by Bigot, Cazelles and Papadakis (2019) as a regularization of Wasserstein barycenters first presented by Agueh and Carlier (2011). After characterizing these barycenters in terms of a system of Monge-Amp`ere equations, we prove some global moment and Sobolev bounds as well as higher regularity properties. We finally establish a central limit theorem for entropic-Wasserstein barycenters.
In this thesis, we consider the Wasserstein barycenter problem of discrete probability measures from computational and statistical sides in two scenarios: (I) the measures are given and we need to compute their Wasserstein barycenter, and (ii) the measures are generated from a probability distribution and we need to calculate the population barycenter of the distribution defined by the notion of Frechet mean. The statistical focus is estimating the sample size of measures necessary to calculate an approximation for Frechet mean (barycenter) of a probability distribution with a given precision. For empirical risk minimization approaches, the question of the regularization is also studied together with proposing a new regularization which contributes to the better complexity bounds in comparison with quadratic regularization. The computational focus is developing algorithms for calculating Wasserstein barycenters: both primal and dual algorithms which can be executed in a decentralized manner. The motivation for dual approaches is closed-forms for the dual formulation of entropy-regularized Wasserstein distances and their derivatives, whereas the primal formulation has closed-form expression only in some cases, e.g., for Gaussian measures. Moreover, the dual oracle returning the gradient of the dual representation for entropy-regularized Wasserstein distance can be computed for a cheaper price in comparison with the primal oracle returning the gradient of the entropy-regularized Wasserstein distance. The number of dual oracle calls, in this case, will also be less, i.e., the square root of the number of primal oracle calls. This explains the successful application of the first-order dual approaches for the Wasserstein barycenter problem.
Image interpolation, or image morphing, refers to a visual transition between two (or more) input images. For such a transition to look visually appealing, its desirable properties are (i) to be smooth; (ii) to apply the minimal required change in the image; and (iii) to seem real, avoiding unnatural artifacts in each image in the transition. To obtain a smooth and straightforward transition, one may adopt the well-known Wasserstein Barycenter Problem (WBP). While this approach guarantees minimal changes under the Wasserstein metric, the resulting images might seem unnatural. In this work, we propose a novel approach for image morphing that possesses all three desired properties. To this end, we define a constrained variant of the WBP that enforces the intermediate images to satisfy an image prior. We describe an algorithm that solves this problem and demonstrate it using the sparse prior and generative adversarial networks.
In this work we introduce the concept of Bures-Wasserstein barycenter $Q_*$, that is essentially a Frechet mean of some distribution $mathbb{P}$ supported on a subspace of positive semi-definite Hermitian operators $mathbb{H}_{+}(d)$. We allow a barycenter to be restricted to some affine subspace of $mathbb{H}_{+}(d)$ and provide conditions ensuring its existence and uniqueness. We also investigate convergence and concentration properties of an empirical counterpart of $Q_*$ in both Frobenius norm and Bures-Wasserstein distance, and explain, how obtained results are connected to optimal transportation theory and can be applied to statistical inference in quantum mechanics.
This paper presents an efficient algorithm for the progressive approximation of Wasserstein barycenters of persistence diagrams, with applications to the visual analysis of ensemble data. Given a set of scalar fields, our approach enables the computation of a persistence diagram which is representative of the set, and which visually conveys the number, data ranges and saliences of the main features of interest found in the set. Such representative diagrams are obtained by computing explicitly the discrete Wasserstein barycenter of the set of persistence diagrams, a notoriously computationally intensive task. In particular, we revisit efficient algorithms for Wasserstein distance approximation [12,51] to extend previous work on barycenter estimation [94]. We present a new fast algorithm, which progressively approximates the barycenter by iteratively increasing the computation accuracy as well as the number of persistent features in the output diagram. Such a progressivity drastically improves convergence in practice and allows to design an interruptible algorithm, capable of respecting computation time constraints. This enables the approximation of Wasserstein barycenters within interactive times. We present an application to ensemble clustering where we revisit the k-means algorithm to exploit our barycenters and compute, within execution time constraints, meaningful clusters of ensemble data along with their barycenter diagram. Extensive experiments on synthetic and real-life data sets report that our algorithm converges to barycenters that are qualitatively meaningful with regard to the applications, and quantitatively comparable to previous techniques, while offering an order of magnitude speedup when run until convergence (without time constraint). Our algorithm can be trivially parallelized to provide additional speedups in practice on standard workstations. [...]
This paper presents a unified computational framework for the estimation of distances, geodesics and barycenters of merge trees. We extend recent work on the edit distance [106] and introduce a new metric, called the Wasserstein distance between merge trees, which is purposely designed to enable efficient computations of geodesics and barycenters. Specifically, our new distance is strictly equivalent to the L2-Wasserstein distance between extremum persistence diagrams, but it is restricted to a smaller solution space, namely, the space of rooted partial isomorphisms between branch decomposition trees. This enables a simple extension of existing optimization frameworks [112] for geodesics and barycenters from persistence diagrams to merge trees. We introduce a task-based algorithm which can be generically applied to distance, geodesic, barycenter or cluster computation. The task-based nature of our approach enables further accelerations with shared-memory parallelism. Extensive experiments on public ensembles and SciVis contest benchmarks demonstrate the efficiency of our approach -- with barycenter computations in the orders of minutes for the largest examples -- as well as its qualitative ability to generate representative barycenter merge trees, visually summarizing the features of interest found in the ensemble. We show the utility of our contributions with dedicated visualization applications: feature tracking, temporal reduction and ensemble clustering. We provide a lightweight C++ implementation that can be used to reproduce our results.