No Arabic abstract
We consider the problem of center-based clustering in low-dimensional Euclidean spaces under the perturbation stability assumption. An instance is $alpha$-stable if the underlying optimal clustering continues to remain optimal even when all pairwise distances are arbitrarily perturbed by a factor of at most $alpha$. Our main contribution is in presenting efficient exact algorithms for $alpha$-stable clustering instances whose running times depend near-linearly on the size of the data set when $alpha ge 2 + sqrt{3}$. For $k$-center and $k$-means problems, our algorithms also achieve polynomial dependence on the number of clusters, $k$, when $alpha geq 2 + sqrt{3} + epsilon$ for any constant $epsilon > 0$ in any fixed dimension. For $k$-median, our algorithms have polynomial dependence on $k$ for $alpha > 5$ in any fixed dimension; and for $alpha geq 2 + sqrt{3}$ in two dimensions. Our algorithms are simple, and only require applying techniques such as local search or dynamic programming to a suitably modified metric space, combined with careful choice of data structures.
We show that the edit distance between two strings of length $n$ can be computed within a factor of $f(epsilon)$ in $n^{1+epsilon}$ time as long as the edit distance is at least $n^{1-delta}$ for some $delta(epsilon) > 0$.
Let $P$ be a path graph of $n$ vertices embedded in a metric space. We consider the problem of adding a new edge to $P$ so that the radius of the resulting graph is minimized, where any center is constrained to be one of the vertices of $P$. Previously, the continuous version of the problem where a center may be a point in the interior of an edge of the graph was studied and a linear-time algorithm was known. Our discrete version of the problem has not been studied before. We present a linear-time algorithm for the problem.
Diffusion is a fundamental graph procedure and has been a basic building block in a wide range of theoretical and empirical applications such as graph partitioning and semi-supervised learning on graphs. In this paper, we study computationally efficient diffusion primitives beyond random walk. We design an $widetilde{O}(m)$-time randomized algorithm for the $ell_2$-norm flow diffusion problem, a recently proposed diffusion model based on network flow with demonstrated graph clustering related applications both in theory and in practice. Examples include finding locally-biased low conductance cuts. Using a known connection between the optimal dual solution of the flow diffusion problem and the local cut structure, our algorithm gives an alternative approach for finding such cuts in nearly linear time. From a technical point of view, our algorithm contributes a novel way of dealing with inequality constraints in graph optimization problems. It adapts the high-level algorithmic framework of nearly linear time Laplacian system solvers, but requires several new tools: vertex elimination under constraints, a new family of graph ultra-sparsifiers, and accelerated proximal gradient methods with inexact proximal mapping computation.
We consider the problem of efficiently scheduling jobs with precedence constraints on a set of identical machines in the presence of a uniform communication delay. In this setting, if two precedence-constrained jobs $u$ and $v$, with ($u prec v$), are scheduled on different machines, then $v$ must start at least $rho$ time units after $u$ completes. The scheduling objective is to minimize makespan, i.e. the total time between when the first job starts and the last job completes. The focus of this paper is to provide an efficient approximation algorithm with near-linear running time. We build on the algorithm of Lepere and Rapine [STACS 2002] for this problem to give an $Oleft(frac{ln rho}{ln ln rho} right)$-approximation algorithm that runs in $tilde{O}(|V| + |E|)$ time.
Given a graph $G=(V,E)$ and an integer $k ge 1$, a $k$-hop dominating set $D$ of $G$ is a subset of $V$, such that, for every vertex $v in V$, there exists a node $u in D$ whose hop-distance from $v$ is at most $k$. A $k$-hop dominating set of minimum cardinality is called a minimum $k$-hop dominating set. In this paper, we present linear-time algorithms that find a minimum $k$-hop dominating set in unicyclic and cactus graphs. To achieve this, we show that the $k$-dominating set problem on unicycle graph reduces to the piercing circular arcs problem, and show a linear-time algorithm for piercing sorted circular arcs, which improves the best known $O(nlog n)$-time algorithm.