Clustering under Perturbation Stability in Near-Linear Time


Abstract in English

We consider the problem of center-based clustering in low-dimensional Euclidean spaces under the perturbation stability assumption. An instance is $alpha$-stable if the underlying optimal clustering continues to remain optimal even when all pairwise distances are arbitrarily perturbed by a factor of at most $alpha$. Our main contribution is in presenting efficient exact algorithms for $alpha$-stable clustering instances whose running times depend near-linearly on the size of the data set when $alpha ge 2 + sqrt{3}$. For $k$-center and $k$-means problems, our algorithms also achieve polynomial dependence on the number of clusters, $k$, when $alpha geq 2 + sqrt{3} + epsilon$ for any constant $epsilon > 0$ in any fixed dimension. For $k$-median, our algorithms have polynomial dependence on $k$ for $alpha > 5$ in any fixed dimension; and for $alpha geq 2 + sqrt{3}$ in two dimensions. Our algorithms are simple, and only require applying techniques such as local search or dynamic programming to a suitably modified metric space, combined with careful choice of data structures.

Download