No Arabic abstract
The natural pseudo-distance of spaces endowed with filtering functions is precious for shape classification and retrieval; its optimal estimate coming from persistence diagrams is the bottleneck distance, which unfortunately suffers from combinatorial explosion. A possible algebraic representation of persistence diagrams is offered by complex polynomials; since far polynomials represent far persistence diagrams, a fast comparison of the coefficient vectors can reduce the size of the database to be classified by the bottleneck distance. This article explores experimentally three transformations from diagrams to polynomials and three distances between the complex vectors of coefficients.
We introduce a refinement of the persistence diagram, the graded persistence diagram. It is the Mobius inversion of the graded rank function, which is obtained from the rank function using the unary numeral system. Both persistence diagrams and graded persistence diagrams are integer-valued functions on the Cartesian plane. Whereas the persistence diagram takes non-negative values, the graded persistence diagram takes values of 0, 1, or -1. The sum of the graded persistence diagrams is the persistence diagram. We show that the positive and negative points in the k-th graded persistence diagram correspond to the local maxima and minima, respectively, of the k-th persistence landscape. We prove a stability theorem for graded persistence diagrams: the 1-Wasserstein distance between k-th graded persistence diagrams is bounded by twice the 1-Wasserstein distance between the corresponding persistence diagrams, and this bound is attained. In the other direction, the 1-Wasserstein distance is a lower bound for the sum of the 1-Wasserstein distances between the k-th graded persistence diagrams. In fact, the 1-Wasserstein distance for graded persistence diagrams is more discriminative than the 1-Wasserstein distance for the corresponding persistence diagrams.
Biological and physical systems often exhibit distinct structures at different spatial/temporal scales. Persistent homology is an algebraic tool that provides a mathematical framework for analyzing the multi-scale structures frequently observed in nature. In this paper a theoretical framework for the algorithmic computation of an arbitrarily good approximation of the persistent homology is developed. We study the filtrations generated by sub-level sets of a function $f : X to mathbb{R}$, where $X$ is a CW-complex. In the special case $X = [0,1]^N$, $N in mathbb{N}$ we discuss implementation of the proposed algorithms. We also investigate a priori and a posteriori bounds of the approximation error introduced by our method.
The extended persistence diagram is an invariant of piecewise linear functions, introduced by Cohen-Steiner, Edelsbrunner, and Harer. The bottleneck distance has been introduced by the same authors as an extended pseudometric on the set of extended persistence diagrams, which is stable under perturbations of the function. We address the question whether the bottleneck distance is the largest possible stable distance, providing an affirmative answer.
Given a persistence diagram with $n$ points, we give an algorithm that produces a sequence of $n$ persistence diagrams converging in bottleneck distance to the input diagram, the $i$th of which has $i$ distinct (weighted) points and is a $2$-approximation to the closest persistence diagram with that many distinct points. For each approximation, we precompute the optimal matching between the $i$th and the $(i+1)$st. Perhaps surprisingly, the entire sequence of diagrams as well as the sequence of matchings can be represented in $O(n)$ space. The main approach is to use a variation of the greedy permutation of the persistence diagram to give good Hausdorff approximations and assign weights to these subsets. We give a new algorithm to efficiently compute this permutation, despite the high implicit dimension of points in a persistence diagram due to the effect of the diagonal. The sketches are also structured to permit fast (linear time) approximations to the Hausdorff distance between diagrams -- a lower bound on the bottleneck distance. For approximating the bottleneck distance, sketches can also be used to compute a linear-size neighborhood graph directly, obviating the need for geometric data structures used in state-of-the-art methods for bottleneck computation.
The classical persistence algorithm virtually computes the unique decomposition of a persistence module implicitly given by an input simplicial filtration. Based on matrix reduction, this algorithm is a cornerstone of the emergent area of topological data analysis. Its input is a simplicial filtration defined over the integers $mathbb{Z}$ giving rise to a $1$-parameter persistence module. It has been recognized that multi-parameter version of persistence modules given by simplicial filtrations over $d$-dimensional integer grids $mathbb{Z}^d$ is equally or perhaps more important in data science applications. However, in the multi-parameter setting, one of the main challenges is that topological summaries based on algebraic structure such as decompositions and bottleneck distances cannot be as efficiently computed as in the $1$-parameter case because there is no known extension of the persistence algorithm to multi-parameter persistence modules. We present an efficient algorithm to compute the unique decomposition of a finitely presented persistence module $M$ defined over the multiparameter $mathbb{Z}^d$.The algorithm first assumes that the module is presented with a set of $N$ generators and relations that are emph{distinctly graded}. Based on a generalized matrix reduction technique it runs in $O(N^{2omega+1})$ time where $omega<2.373$ is the exponent for matrix multiplication. This is much better than the well known algorithm called Meataxe which runs in $tilde{O}(N^{6(d+1)})$ time on such an input. In practice, persistence modules are usually induced by simplicial filtrations. With such an input consisting of $n$ simplices, our algorithm runs in $O(n^{2omega+1})$ time for $d=2$ and in $O(n^{d(2omega + 1)})$ time for $d>2$.