No Arabic abstract
Network decomposition is a central concept in the study of distributed graph algorithms. We present the first polylogarithmic-round deterministic distributed algorithm with small messages that constructs a strong-diameter network decomposition with polylogarithmic parameters. Concretely, a ($C$, $D$) strong-diameter network decomposition is a partitioning of the nodes of the graph into disjoint clusters, colored with $C$ colors, such that neighboring clusters have different colors and the subgraph induced by each cluster has a diameter at most $D$. In the weak-diameter variant, the requirement is relaxed by measuring the diameter of each cluster in the original graph, instead of the subgraph induced by the cluster. A recent breakthrough of Rozhov{n} and Ghaffari [STOC 2020] presented the first $text{poly}(log n)$-round deterministic algorithm for constructing a weak-diameter network decomposition where $C$ and $D$ are both in $text{poly}(log n)$. Their algorithm uses small $O(log n)$-bit messages. One can transform their algorithm to a strong-diameter network decomposition algorithm with similar parameters. However, that comes at the expense of requiring unbounded messages. The key remaining qualitative question in the study of network decompositions was whether one can achieve a similar result for strong-diameter network decompositions using small messages. We resolve this question by presenting a novel technique that can transform any black-box weak-diameter network decomposition algorithm to a strong-diameter one, using small messages and with only moderate loss in the parameters.
Network decomposition is a central tool in distributed graph algorithms. We present two improvements on the state of the art for network decomposition, which thus lead to improvements in the (deterministic and randomized) complexity of several well-studied graph problems. - We provide a deterministic distributed network decomposition algorithm with $O(log^5 n)$ round complexity, using $O(log n)$-bit messages. This improves on the $O(log^7 n)$-round algorithm of Rozhov{n} and Ghaffari [STOC20], which used large messages, and their $O(log^8 n)$-round algorithm with $O(log n)$-bit messages. This directly leads to similar improvements for a wide range of deterministic and randomized distributed algorithms, whose solution relies on network decomposition, including the general distributed derandomization of Ghaffari, Kuhn, and Harris [FOCS18]. - One drawback of the algorithm of Rozhov{n} and Ghaffari, in the $mathsf{CONGEST}$ model, was its dependence on the length of the identifiers. Because of this, for instance, the algorithm could not be used in the shattering framework in the $mathsf{CONGEST}$ model. Thus, the state of the art randomized complexity of several problems in this model remained with an additive $2^{O(sqrt{loglog n})}$ term, which was a clear leftover of the older network decomposition complexity [Panconesi and Srinivasan STOC92]. We present a modified version that remedies this, constructing a decomposition whose quality does not depend on the identifiers, and thus improves the randomized round complexity for various problems.
We present a simple deterministic distributed algorithm that computes a $(Delta+1)$-vertex coloring in $O(log^2 Delta cdot log n)$ rounds. The algorithm can be implemented with $O(log n)$-bit messages. The algorithm can also be extended to the more general $(degree+1)$-list coloring problem. Obtaining a polylogarithmic-time deterministic algorithm for $(Delta+1)$-vertex coloring had remained a central open question in the area of distributed graph algorithms since the 1980s, until a recent network decomposition algorithm of Rozhov{n} and Ghaffari [STOC20]. The current state of the art is based on an improved variant of their decomposition, which leads to an $O(log^5 n)$-round algorithm for $(Delta+1)$-vertex coloring. Our coloring algorithm is completely different and considerably simpler and faster. It solves the coloring problem in a direct way, without using network decomposition, by gradually rounding a certain fractional color assignment until reaching an integral color assignments. Moreover, via the approach of Chang, Li, and Pettie [STOC18], this improved deterministic algorithm also leads to an improvement in the complexity of randomized algorithms for $(Delta+1)$-coloring, now reaching the bound of $O(log^3log n)$ rounds. As a further application, we also provide faster deterministic distributed algorithms for the following variants of the vertex coloring problem. In graphs of arboricity $a$, we show that a $(2+epsilon)a$-vertex coloring can be computed in $O(log^3 acdotlog n)$ rounds. We also show that for $Deltageq 3$, a $Delta$-coloring of a $Delta$-colorable graph $G$ can be computed in $O(log^2 Deltacdotlog^2 n)$ rounds.
We study graph connectivity problem in MPC model. On an undirected graph with $n$ nodes and $m$ edges, $O(log n)$ round connectivity algorithms have been known for over 35 years. However, no algorithms with better complexity bounds were known. In this work, we give fully scalable, faster algorithms for the connectivity problem, by parameterizing the time complexity as a function of the diameter of the graph. Our main result is a $O(log D loglog_{m/n} n)$ time connectivity algorithm for diameter-$D$ graphs, using $Theta(m)$ total memory. If our algorithm can use more memory, it can terminate in fewer rounds, and there is no lower bound on the memory per processor. We extend our results to related graph problems such as spanning forest, finding a DFS sequence, exact/approximate minimum spanning forest, and bottleneck spanning forest. We also show that achieving similar bounds for reachability in directed graphs would imply faster boolean matrix multiplication algorithms. We introduce several new algorithmic ideas. We describe a general technique called double exponential speed problem size reduction which roughly means that if we can use total memory $N$ to reduce a problem from size $n$ to $n/k$, for $k=(N/n)^{Theta(1)}$ in one phase, then we can solve the problem in $O(loglog_{N/n} n)$ phases. In order to achieve this fast reduction for graph connectivity, we use a multistep algorithm. One key step is a carefully constructed truncated broadcasting scheme where each node broadcasts neighbor sets to its neighbors in a way that limits the size of the resulting neighbor sets. Another key step is random leader contraction, where we choose a smaller set of leaders than many previous works do.
We present improved distributed algorithms for triangle detection and its variants in the CONGEST model. We show that Triangle Detection, Counting, and Enumeration can be solved in $tilde{O}(n^{1/2})$ rounds. In contrast, the previous state-of-the-art bounds for Triangle Detection and Enumeration were $tilde{O}(n^{2/3})$ and $tilde{O}(n^{3/4})$, respectively, due to Izumi and LeGall (PODC 2017). The main technical novelty in this work is a distributed graph partitioning algorithm. We show that in $tilde{O}(n^{1-delta})$ rounds we can partition the edge set of the network $G=(V,E)$ into three parts $E=E_mcup E_scup E_r$ such that (a) Each connected component induced by $E_m$ has minimum degree $Omega(n^delta)$ and conductance $Omega(1/text{poly} log(n))$. As a consequence the mixing time of a random walk within the component is $O(text{poly} log(n))$. (b) The subgraph induced by $E_s$ has arboricity at most $n^{delta}$. (c) $|E_r| leq |E|/6$. All of our algorithms are based on the following generic framework, which we believe is of interest beyond this work. Roughly, we deal with the set $E_s$ by an algorithm that is efficient for low-arboricity graphs, and deal with the set $E_r$ using recursive calls. For each connected component induced by $E_m$, we are able to simulate congested clique algorithms with small overhead by applying a routing algorithm due to Ghaffari, Kuhn, and Su (PODC 2017) for high conductance graphs.
Maintaining a $k$-core decomposition quickly in a dynamic graph is an important problem in many applications, including social network analytics, graph visualization, centrality measure computations, and community detection algorithms. The main challenge for designing efficient $k$-core decomposition algorithms is that a single change to the graph can cause the decomposition to change significantly. We present the first parallel batch-dynamic algorithm for maintaining an approximate $k$-core decomposition that is efficient in both theory and practice. Given an initial graph with $m$ edges, and a batch of $B$ updates, our algorithm maintains a $(2 + delta)$-approximation of the coreness values for all vertices (for any constant $delta > 0$) in $O(Blog^2 m)$ amortized work and $O(log^2 m loglog m)$ depth (parallel time) with high probability. Our algorithm also maintains a low out-degree orientation of the graph in the same bounds. We implemented and experimentally evaluated our algorithm on a 30-core machine with two-way hyper-threading on $11$ graphs of varying densities and sizes. Compared to the state-of-the-art algorithms, our algorithm achieves up to a 114.52x speedup against the best multicore implementation and up to a 497.63x speedup against the best sequential algorithm, obtaining results for graphs that are orders-of-magnitude larger than those used in previous studies. In addition, we present the first approximate static $k$-core algorithm with linear work and polylogarithmic depth. We show that on a 30-core machine with two-way hyper-threading, our implementation achieves up to a 3.9x speedup in the static case over the previous state-of-the-art parallel algorithm.