No Arabic abstract
In this paper we give fast distributed graph algorithms for detecting and listing small subgraphs, and for computing or approximating the girth. Our algorithms improve upon the state of the art by polynomial factors, and for girth, we obtain an constant-time algorithm for additive +1 approximation in the Congested Clique, and the first parametrized algorithm for exact computation in CONGEST. In the Congested Clique, we develop a technique for learning small neighborhoods, and apply it to obtain an $O(1)$-round algorithm that computes the girth with only an additive +1 error. Next, we introduce a new technique (the partition tree technique) allowing for efficiently and deterministically listing all copies of any subgraph, improving upon the state-of the-art for non-dense graphs. We give two applications of this technique: First we show that for constant $k$, $C_{2k}$-detection can be solved in $O(1)$ rounds in the Congested Clique, improving on prior work which used matrix multiplication and had polynomial round complexity. Second, we show that in triangle-free graphs, the girth can be exactly computed in time polynomially faster than the best known bounds for general graphs. In CONGEST, we describe a new approach for finding cycles, and apply it in two ways: first we show a fast parametrized algorithm for girth with round complexity $tilde{O}(min(gcdot n^{1-1/Theta(g)},n))$ for any girth $g$; and second, we show how to find small even-length cycles $C_{2k}$ for $k = 3,4,5$ in $O(n^{1-1/k})$ rounds, which is a polynomial improvement upon the previous running times. Finally, using our improved $C_6$-freeness algorithm and the barrier on proving lower bounds on triangle-freeness of Eden et al., we show that improving the current $tildeOmega(sqrt{n})$ lower bound for $C_6$-freeness of Korhonen et al. by any polynomial factor would imply strong circuit complexity lower bounds.
The girth of a graph, i.e. the length of its shortest cycle, is a fundamental graph parameter. Unfortunately all known algorithms for computing, even approximately, the girth and girth-related structures in directed weighted $m$-edge and $n$-node graphs require $Omega(min{n^{omega}, mn})$ time (for $2leqomega<2.373$). In this paper, we drastically improve these runtimes as follows: * Multiplicative Approximations in Nearly Linear Time: We give an algorithm that in $widetilde{O}(m)$ time computes an $widetilde{O}(1)$-multiplicative approximation of the girth as well as an $widetilde{O}(1)$-multiplicative roundtrip spanner with $widetilde{O}(n)$ edges with high probability (w.h.p). * Nearly Tight Additive Approximations: For unweighted graphs and any $alpha in (0,1)$ we give an algorithm that in $widetilde{O}(mn^{1 - alpha})$ time computes an $O(n^alpha)$-additive approximation of the girth w.h.p, and partially derandomize it. We show that the runtime of our algorithm cannot be significantly improved without a breakthrough in combinatorial Boolean matrix multiplication. Our main technical contribution to achieve these results is the first nearly linear time algorithm for computing roundtrip covers, a directed graph decomposition concept key to previous roundtrip spanner constructions. Previously it was not known how to compute these significantly faster than $Omega(min{n^omega, mn})$ time. Given the traditional difficulty in efficiently processing directed graphs, we hope our techniques may find further applications.
In the minimum $k$-edge-connected spanning subgraph ($k$-ECSS) problem the goal is to find the minimum weight subgraph resistant to up to $k-1$ edge failures. This is a central problem in network design, and a natural generalization of the minimum spanning tree (MST) problem. While the MST problem has been studied extensively by the distributed computing community, for $k geq 2$ less is known in the distributed setting. In this paper, we present fast randomized distributed approximation algorithms for $k$-ECSS in the CONGEST model. Our first contribution is an $widetilde{O}(D + sqrt{n})$-round $O(log{n})$-approximation for 2-ECSS, for a graph with $n$ vertices and diameter $D$. The time complexity of our algorithm is almost tight and almost matches the time complexity of the MST problem. For larger constant values of $k$ we give an $widetilde{O}(n)$-round $O(log{n})$-approximation. Additionally, in the special case of unweighted 3-ECSS we show how to improve the time complexity to $O(D log^3{n})$ rounds. All our results significantly improve the time complexity of previous algorithms.
Subgraph counting is a fundamental problem in analyzing massive graphs, often studied in the context of social and complex networks. There is a rich literature on designing efficient, accurate, and scalable algorithms for this problem. In this work, we tackle this challenge and design several new algorithms for subgraph counting in the Massively Parallel Computation (MPC) model: Given a graph $G$ over $n$ vertices, $m$ edges and $T$ triangles, our first main result is an algorithm that, with high probability, outputs a $(1+varepsilon)$-approximation to $T$, with optimal round and space complexity provided any $S geq max{(sqrt m, n^2/m)}$ space per machine, assuming $T=Omega(sqrt{m/n})$. Our second main result is an $tilde{O}_{delta}(log log n)$-rounds algorithm for exactly counting the number of triangles, parametrized by the arboricity $alpha$ of the input graph. The space per machine is $O(n^{delta})$ for any constant $delta$, and the total space is $O(malpha)$, which matches the time complexity of (combinatorial) triangle counting in the sequential model. We also prove that this result can be extended to exactly counting $k$-cliques for any constant $k$, with the same round complexity and total space $O(malpha^{k-2})$. Alternatively, allowing $O(alpha^2)$ space per machine, the total space requirement reduces to $O(nalpha^2)$. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most $5$, can be implemented in the MPC model in $tilde{O}_{delta}(sqrt{log n})$ rounds, $O(n^{delta})$ space per machine and $O(malpha^3)$ total space. Therefore, this result also exhibits the phenomenon that a time bound in the sequential model translates to a space bound in the MPC model.
This paper gives poly-logarithmic-round, distributed D-approximation algorithms for covering problems with submodular cost and monotone covering constraints (Submodular-cost Covering). The approximation ratio D is the maximum number of variables in any constraint. Special cases include Covering Mixed Integer Linear Programs (CMIP), and Weighted Vertex Cover (with D=2). Via duality, the paper also gives poly-logarithmic-round, distributed D-approximation algorithms for Fractional Packing linear programs (where D is the maximum number of constraints in which any variable occurs), and for Max Weighted c-Matching in hypergraphs (where D is the maximum size of any of the hyperedges; for graphs D=2). The paper also gives parallel (RNC) 2-approximation algorithms for CMIP with two variables per constraint and Weighted Vertex Cover. The algorithms are randomized. All of the approximation ratios exactly match those of comparable centralized algorithms.
The minimum degree spanning tree (MDST) problem requires the construction of a spanning tree $T$ for graph $G=(V,E)$ with $n$ vertices, such that the maximum degree $d$ of $T$ is the smallest among all spanning trees of $G$. In this paper, we present two new distributed approximation algorithms for the MDST problem. Our first result is a randomized distributed algorithm that constructs a spanning tree of maximum degree $hat d = O(dlog{n})$. It requires $O((D + sqrt{n}) log^2 n)$ rounds (w.h.p.), where $D$ is the graph diameter, which matches (within log factors) the optimal round complexity for the related minimum spanning tree problem. Our second result refines this approximation factor by constructing a tree with maximum degree $hat d = O(d + log{n})$, though at the cost of additional polylogarithmic factors in the round complexity. Although efficient approximation algorithms for the MDST problem have been known in the sequential setting since the 1990s, our results are first efficient distributed solutions for this problem.