We prove better lower bounds on additive spanners and emulators, which are lossy compression schemes for undirected graphs, as well as lower bounds on shortcut sets, which reduce the diameter of directed graphs. We show that any $O(n)$-size shortcut set cannot bring the diameter below $\Omega(n^{1/6})$, and that any $O(m)$-size shortcut set cannot bring it below $\Omega(n^{1/11})$. These improve Hesse's [Hesse03] lower bound of $\Omega(n^{1/17})$. By combining these constructions with Abboud and Bodwin's [AbboudB17] edge-splitting technique, we get additive stretch lower bounds of $+\Omega(n^{1/11})$ for $O(n)$-size spanners and $+\Omega(n^{1/18})$ for $O(n)$-size emulators. These improve Abboud and Bodwin's $+\Omega(n^{1/22})$ lower bounds.
It was recently found that there are very close connections between the existence of additive spanners (subgraphs where all distances are preserved up to an additive stretch), distance preservers (subgraphs in which demand pairs have their distance preserved exactly), and pairwise spanners (subgraphs in which demand pairs have their distance preserved up to a multiplicative or additive stretch) [Abboud-Bodwin SODA '16, Bodwin-Williams SODA '16]. We study these problems from an optimization point of view, where rather than studying the existence of extremal instances we are given an instance and are asked to find the sparsest possible spanner/preserver. We give an $O(n^{3/5 + \epsilon})$-approximation for distance preservers and pairwise spanners (for arbitrary constant $\epsilon > 0$). This is the first nontrivial upper bound for either problem, both of which are known to be as hard to approximate as Label Cover. We also prove Label Cover hardness for approximating additive spanners, even for the cases of additive 1 stretch (where one might expect a polylogarithmic approximation, since the related multiplicative 2-spanner problem admits an $O(\log n)$-approximation) and additive polylogarithmic stretch (where the related multiplicative spanner problem has an $O(1)$-approximation). Interestingly, the techniques we use in our approximation algorithm extend beyond distance-based problems to pure connectivity network design problems. In particular, our techniques allow us to give an $O(n^{3/5 + \epsilon})$-approximation for the Directed Steiner Forest problem (for arbitrary constant $\epsilon > 0$) when all edges have uniform costs, improving the previous best $O(n^{2/3 + \epsilon})$-approximation due to Berman et al.~[ICALP '11] (which holds for general edge costs).
We consider a range of simply stated dynamic data structure problems on strings. An update changes one symbol in the input and a query asks us to compute some function of the pattern of length $m$ and a substring of a longer text. We give both conditional and unconditional lower bounds for variants of exact matching with wildcards, inner product, and Hamming distance computation via a sequence of reductions. As an example, we show that there does not exist an $O(m^{1/2-\varepsilon})$ time algorithm for a large range of these problems unless the online Boolean matrix-vector multiplication conjecture is false. We also provide nearly matching upper bounds for most of the problems we consider.
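To make the update/query model concrete, here is a minimal sketch of the trivial baseline for one of the problems mentioned, Hamming distance between the pattern and a length-$m$ substring of the text. It is not an algorithm from the paper: the class name `NaiveDynamicHamming` and its method names are hypothetical, chosen only for illustration. Updates take constant time and each query takes $O(m)$ time by direct comparison.

```python
class NaiveDynamicHamming:
    """Trivial baseline: O(1) symbol updates, O(m) Hamming distance queries."""

    def __init__(self, pattern, text):
        self.pattern = list(pattern)
        self.text = list(text)

    def update_text(self, i, symbol):
        # An update changes a single symbol of the text.
        self.text[i] = symbol

    def update_pattern(self, j, symbol):
        # An update may also change a single symbol of the pattern.
        self.pattern[j] = symbol

    def query(self, i):
        """Hamming distance between the pattern and text[i : i + m]."""
        m = len(self.pattern)
        if i < 0 or i + m > len(self.text):
            raise IndexError("substring of length m does not fit at this offset")
        window = self.text[i:i + m]
        return sum(p != t for p, t in zip(self.pattern, window))


ds = NaiveDynamicHamming("abc", "xabcabd")
print(ds.query(1))       # compares "abc" with "abc"
ds.update_text(6, "c")   # text becomes "xabcabc"
print(ds.query(4))       # compares "abc" with "abc" after the update
```

The lower bounds in the abstract concern how far below this kind of direct recomputation a data structure can go.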
Among the most important graph parameters is the Diameter, the largest distance between any two vertices. There are no known very efficient algorithms for computing the Diameter exactly. Thus, much research has been devoted to how fast this parameter can be approximated. Chechik et al. showed that the diameter can be approximated within a multiplicative factor of $3/2$ in $\tilde{O}(m^{3/2})$ time. Furthermore, Roditty and Vassilevska W. showed that unless the Strong Exponential Time Hypothesis (SETH) fails, no $O(n^{2-\epsilon})$ time algorithm can achieve an approximation factor better than $3/2$ in sparse graphs. Thus the above algorithm is essentially optimal for sparse graphs for approximation factors less than $3/2$. It was, however, completely plausible that a $3/2$-approximation is possible in linear time. In this work we conditionally rule out such a possibility by showing that unless SETH fails no $O(m^{3/2-\epsilon})$ time algorithm can achieve an approximation factor better than $5/3$. Another fundamental set of graph parameters are the Eccentricities. The Eccentricity of a vertex $v$ is the distance between $v$ and the farthest vertex from $v$. Chechik et al. showed that the Eccentricities of all vertices can be approximated within a factor of $5/3$ in $\tilde{O}(m^{3/2})$ time and Abboud et al. showed that no $O(n^{2-\epsilon})$ algorithm can achieve better than $5/3$ approximation in sparse graphs. We show that the runtime of the $5/3$ approximation algorithm is also optimal under SETH. We also show that no near-linear time algorithm can achieve a better than $2$ approximation for the Eccentricities and that this is essentially tight: we give an algorithm that approximates Eccentricities within a $2+\delta$ factor in $\tilde{O}(m/\delta)$ time for any $0<\delta<1$. This beats all Eccentricity algorithms in Cairo et al.
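The definitions of Diameter and Eccentricity above can be illustrated with the textbook exact baseline: run one breadth-first search from every vertex, for $O(nm)$ total time on an unweighted graph. This is only a sketch of the definitions, not any of the approximation algorithms discussed in the abstract. The function names and the adjacency-dictionary input format are assumptions made for the example, and the graph is assumed to be connected and undirected.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def eccentricities(adj):
    """Exact eccentricity of every vertex (assumes a connected graph)."""
    return {v: max(bfs_distances(adj, v).values()) for v in adj}


def diameter(adj):
    """The diameter is the largest eccentricity."""
    return max(eccentricities(adj).values())


# Path on four vertices: 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(eccentricities(path))
print(diameter(path))
```

The results in the abstract concern how much of this $O(nm)$ cost can be saved when only an approximation is required.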
We consider the problem of testing graph cluster structure: given access to a graph $G=(V, E)$, can we quickly determine whether the graph can be partitioned into a few clusters with good inner conductance, or is far from any such graph? This is a generalization of the well-studied problem of testing graph expansion, where one wants to distinguish between the graph having good expansion (i.e. being a good single cluster) and the graph having a sparse cut (i.e. being a union of at least two clusters). A recent work of Czumaj, Peng, and Sohler (STOC'15) gave an ingenious sublinear time algorithm for testing $k$-clusterability in time $\tilde{O}(n^{1/2} \text{poly}(k))$: their algorithm implicitly embeds a random sample of vertices of the graph into Euclidean space, and then clusters the samples based on estimates of Euclidean distances between the points. This yields a very efficient testing algorithm, but only works if the cluster structure is very strong: it is necessary to assume that the gap between conductances of accepted and rejected graphs is at least logarithmic in the size of the graph $G$. In this paper we show how one can leverage more refined geometric information, namely angles as opposed to distances, to obtain a sublinear time tester that works even when the gap is a sufficiently large constant. Our tester is based on the singular value decomposition of a natural matrix derived from random walk transition probabilities from a small sample of seed nodes. We complement our algorithm with a matching lower bound on the query complexity of testing clusterability. Our lower bound is based on a novel property testing problem, which we analyze using Fourier analytic tools. As a byproduct of our techniques, we also achieve new lower bounds for the problem of approximating MAX-CUT value in sublinear time.
We give lower bounds on the performance of two of the most popular sampling methods in practice, the Metropolis-adjusted Langevin algorithm (MALA) and multi-step Hamiltonian Monte Carlo (HMC) with a leapfrog integrator, when applied to well-conditioned distributions. Our main result is a nearly-tight lower bound of $\widetilde{\Omega}(\kappa d)$ on the mixing time of MALA from an exponentially warm start, matching a line of algorithmic results up to logarithmic factors and answering an open question of Chewi et al. We also show that a polynomial dependence on dimension is necessary for the relaxation time of HMC under any number of leapfrog steps, and bound the gains achievable by changing the step count. Our HMC analysis draws upon a novel connection between leapfrog integration and Chebyshev polynomials, which may be of independent interest.
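For readers unfamiliar with MALA, a single transition in its standard textbook form can be sketched as follows: propose a Langevin step $y = x + h \nabla \log \pi(x) + \sqrt{2h}\,\xi$ with $\xi$ standard Gaussian, then accept or reject with the Metropolis-Hastings ratio. This is a generic illustration and not code from the paper. The function name `mala_step`, its signature, and the standard Gaussian target used in the usage lines are assumptions made for the example.

```python
import math
import random


def mala_step(x, log_density, grad_log_density, step, rng):
    """One MALA transition from x (a list of floats).

    Returns (new_x, accepted). `step` is the step size h.
    """
    def drift(z):
        # Mean of the Langevin proposal started at z.
        g = grad_log_density(z)
        return [zi + step * gi for zi, gi in zip(z, g)]

    def log_q(to, frm):
        # Log proposal density of moving frm -> to, up to an additive constant.
        mean = drift(frm)
        return -sum((t - m) ** 2 for t, m in zip(to, mean)) / (4.0 * step)

    noise_scale = math.sqrt(2.0 * step)
    proposal = [m + noise_scale * rng.gauss(0.0, 1.0) for m in drift(x)]

    # Metropolis-Hastings log acceptance ratio.
    log_ratio = (log_density(proposal) + log_q(x, proposal)
                 - log_density(x) - log_q(proposal, x))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return proposal, True
    return x, False


# Usage on a standard Gaussian target (an assumed toy example).
rng = random.Random(0)
log_pi = lambda z: -0.5 * sum(v * v for v in z)
grad_log_pi = lambda z: [-v for v in z]
state = [1.0, -2.0]
for _ in range(100):
    state, _ = mala_step(state, log_pi, grad_log_pi, 0.1, rng)
```

The mixing-time lower bound in the abstract is a statement about how many such transitions are needed as a function of the condition number $\kappa$ and the dimension $d$.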