
Approximation algorithms for 1-Wasserstein distance between persistence diagrams

Published by: Samantha Chen
Publication date: 2021
Research field: Informatics Engineering
Research language: English





Recent years have witnessed tremendous growth in the use of topological summaries, especially persistence diagrams (which encode the so-called persistent homology), for analyzing complex shapes. Intuitively, persistent homology maps a potentially complex input object (be it a graph, an image, a point set, and so on) to a unified type of feature summary, called the persistence diagram. One can then carry out downstream data analysis tasks using such persistence diagram representations. A key problem is to compute the distance between two persistence diagrams efficiently. In particular, a persistence diagram is essentially a multiset of points in the plane, and one popular distance is the so-called 1-Wasserstein distance between persistence diagrams. In this paper, we present two algorithms to approximate the 1-Wasserstein distance for persistence diagrams in near-linear time. These algorithms primarily follow the same ideas as two existing algorithms that approximate optimal transport between two finite point sets in Euclidean spaces via randomly shifted quadtrees. We show how these algorithms can be effectively adapted to the case of persistence diagrams. Our algorithms are much more efficient than previous exact and approximate algorithms, both in theory and in practice, and we demonstrate their efficiency via extensive experiments. They are conceptually simple and easy to implement, and the code is publicly available on GitHub.
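To make the quantity being approximated concrete, the following is a minimal sketch of the exact 1-Wasserstein distance between two very small persistence diagrams. It is not the paper's quadtree-based algorithm: it is a brute-force baseline over all matchings, usable only for a handful of points. The function names (`diag_proj`, `linf`, `wasserstein1_exact`) are hypothetical, and the choice of the $L_\infty$ ground metric is an assumption, since the abstract does not specify one.

```python
from itertools import permutations
from math import inf


def diag_proj(p):
    """Orthogonal projection of a (birth, death) point onto the diagonal y = x."""
    m = (p[0] + p[1]) / 2.0
    return (m, m)


def linf(p, q):
    """L-infinity ground distance (an assumed choice of ground metric)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def wasserstein1_exact(d1, d2, dist=linf):
    """Brute-force 1-Wasserstein distance between two tiny diagrams.

    Each diagram is augmented with the diagonal projections of the other,
    so that both sides have equal size and any point may be matched to
    the diagonal. Runs in factorial time: for illustration only.
    """
    # Each entry is (point, is_diagonal_copy).
    a = [(p, False) for p in d1] + [(diag_proj(q), True) for q in d2]
    b = [(q, False) for q in d2] + [(diag_proj(p), True) for p in d1]
    n = len(a)

    def cost(x, y):
        (p, p_diag), (q, q_diag) = x, y
        if p_diag and q_diag:
            return 0.0                      # diagonal-to-diagonal is free
        if p_diag:
            return dist(q, diag_proj(q))    # q is sent to the diagonal
        if q_diag:
            return dist(p, diag_proj(p))    # p is sent to the diagonal
        return dist(p, q)                   # ordinary point-to-point match

    best = inf
    for perm in permutations(range(n)):
        total = sum(cost(a[i], b[perm[i]]) for i in range(n))
        best = min(best, total)
    return best


if __name__ == "__main__":
    print(wasserstein1_exact([(0.0, 4.0), (1.0, 1.2)], [(0.0, 3.0)]))
```

In this small example the cheapest matching pairs the two long-lived points with each other and sends the short-lived point `(1.0, 1.2)` to the diagonal. Exact solvers replace the permutation loop with a polynomial-time assignment algorithm, and the algorithms described in the abstract go further by approximating the result in near-linear time.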




Read also

We improve the running times of $O(1)$-approximation algorithms for the set cover problem in geometric settings, specifically, covering points by disks in the plane, or covering points by halfspaces in three dimensions. In the unweighted case, Agarwal and Pan [SoCG 2014] gave a randomized $O(n\log^4 n)$-time, $O(1)$-approximation algorithm, by using variants of the multiplicative weight update (MWU) method combined with geometric data structures. We simplify the data structure requirement in one of their methods and obtain a deterministic $O(n\log^3 n\log\log n)$-time algorithm. With further new ideas, we obtain a still faster randomized $O(n\log n(\log\log n)^{O(1)})$-time algorithm. For the weighted problem, we also give a randomized $O(n\log^4 n\log\log n)$-time, $O(1)$-approximation algorithm, by simple modifications to the MWU method and the quasi-uniform sampling technique.
Given $n$ points in a $d$ dimensional Euclidean space, the Minimum Enclosing Ball (MEB) problem is to find the ball with the smallest radius which contains all $n$ points. We give a $O(nd\mathcal{Q}/\sqrt{\epsilon})$ approximation algorithm for producing an enclosing ball whose radius is at most $\epsilon$ away from the optimum (where $\mathcal{Q}$ is an upper bound on the norm of the points). This improves existing results using \emph{coresets}, which yield a $O(nd/\epsilon)$ greedy algorithm. Finding the Minimum Enclosing Convex Polytope (MECP) is a related problem wherein a convex polytope of a fixed shape is given and the aim is to find the smallest magnification of the polytope which encloses the given points. For this problem we present a $O(mnd\mathcal{Q}/\epsilon)$ approximation algorithm, where $m$ is the number of faces of the polytope. Our algorithms borrow heavily from convex duality and recently developed techniques in non-smooth optimization, and are in contrast with existing methods which rely on geometric arguments. In particular, we specialize the excessive gap framework of \citet{Nesterov05a} to obtain our results.
Given a persistence diagram with $n$ points, we give an algorithm that produces a sequence of $n$ persistence diagrams converging in bottleneck distance to the input diagram, the $i$th of which has $i$ distinct (weighted) points and is a $2$-approximation to the closest persistence diagram with that many distinct points. For each approximation, we precompute the optimal matching between the $i$th and the $(i+1)$st. Perhaps surprisingly, the entire sequence of diagrams as well as the sequence of matchings can be represented in $O(n)$ space. The main approach is to use a variation of the greedy permutation of the persistence diagram to give good Hausdorff approximations and assign weights to these subsets. We give a new algorithm to efficiently compute this permutation, despite the high implicit dimension of points in a persistence diagram due to the effect of the diagonal. The sketches are also structured to permit fast (linear time) approximations to the Hausdorff distance between diagrams -- a lower bound on the bottleneck distance. For approximating the bottleneck distance, sketches can also be used to compute a linear-size neighborhood graph directly, obviating the need for geometric data structures used in state-of-the-art methods for bottleneck computation.
We describe a new data structure for dynamic nearest neighbor queries in the plane with respect to a general family of distance functions. These include $L_p$-norms and additively weighted Euclidean distances. Our data structure supports general (convex, pairwise disjoint) sites that have constant description complexity (e.g., points, line segments, disks, etc.). Our structure uses $O(n \log^3 n)$ storage, and requires polylogarithmic update and query time, improving an earlier data structure of Agarwal, Efrat and Sharir that required $O(n^\varepsilon)$ time for an update and $O(\log n)$ time for a query [SICOMP, 1999]. Our data structure has numerous applications. In all of them, it gives faster algorithms, typically reducing an $O(n^\varepsilon)$ factor in the previous bounds to polylogarithmic. In addition, we give here two new applications: an efficient construction of a spanner in a disk intersection graph, and a data structure for efficient connectivity queries in a dynamic disk graph.
In the Euclidean TSP with neighborhoods (TSPN), we are given a collection of n regions (neighborhoods) and we seek a shortest tour that visits each region. As a generalization of the classical Euclidean TSP, TSPN is also NP-hard. In this paper, we present new approximation results for the TSPN, including (1) a constant-factor approximation algorithm for the case of arbitrary connected neighborhoods having comparable diameters; and (2) a PTAS for the important special case of disjoint unit disk neighborhoods (or nearly disjoint, nearly-unit disks). Our methods also yield improved approximation ratios for various special classes of neighborhoods, which have previously been studied. Further, we give a linear-time O(1)-approximation algorithm for the case of neighborhoods that are (infinite) straight lines.