Sketching Persistence Diagrams


Abstract in English

Given a persistence diagram with $n$ points, we give an algorithm that produces a sequence of $n$ persistence diagrams converging in bottleneck distance to the input diagram, the $i$th of which has $i$ distinct (weighted) points and is a $2$-approximation to the closest persistence diagram with that many distinct points. For each approximation, we precompute the optimal matching between the $i$th and the $(i+1)$st. Perhaps surprisingly, the entire sequence of diagrams as well as the sequence of matchings can be represented in $O(n)$ space. The main approach is to use a variation of the greedy permutation of the persistence diagram to give good Hausdorff approximations and assign weights to these subsets. We give a new algorithm to efficiently compute this permutation, despite the high implicit dimension of points in a persistence diagram due to the effect of the diagonal. The sketches are also structured to permit fast (linear time) approximations to the Hausdorff distance between diagrams -- a lower bound on the bottleneck distance. For approximating the bottleneck distance, sketches can also be used to compute a linear-size neighborhood graph directly, obviating the need for geometric data structures used in state-of-the-art methods for bottleneck computation.

Download