No Arabic abstract
We present a near-tight analysis of the average query complexity -- `a la Nguyen and Onak [FOCS08] -- of the randomized greedy maximal matching algorithm, improving over the bound of Yoshida, Yamamoto and Ito [STOC09]. For any $n$-vertex graph of average degree $bar{d}$, this leads to the following sublinear-time algorithms for estimating the size of maximum matching and minimum vertex cover, all of which are provably time-optimal up to logarithmic factors: $bullet$ A multiplicative $(2+epsilon)$-approximation in $widetilde{O}(n/epsilon^2)$ time using adjacency list queries. This (nearly) matches an $Omega(n)$ time lower bound for any multiplicative approximation and is, notably, the first $O(1)$-approximation that runs in $o(n^{1.5})$ time. $bullet$ A $(2, epsilon n)$-approximation in $widetilde{O}((bar{d} + 1)/epsilon^2)$ time using adjacency list queries. This (nearly) matches an $Omega(bar{d}+1)$ lower bound of Parnas and Ron [TCS07] which holds for any $(O(1), epsilon n)$-approximation, and improves over the bounds of [Yoshida et al. STOC09; Onak et al. SODA12] and [Kapralov et al. SODA20]: The former two take at least quadratic time in the degree which can be as large as $Omega(n^2)$ and the latter obtains a much larger approximation. $bullet$ A $(2, epsilon n)$-approximation in $widetilde{O}(n/epsilon^3)$ time using adjacency matrix queries. This (nearly) matches an $Omega(n)$ time lower bound in this model and improves over the $widetilde{O}(nsqrt{n})$-time $(2, epsilon n)$-approximate algorithm of [Chen, Kannan, and Khanna ICALP20]. It also turns out that any non-trivial multiplicative approximation in the adjacency matrix model requires $Omega(n^2)$ time, so the additive $epsilon n$ error is necessary too. As immediate corollaries, we get improved sublinear time estimators for (variants of) TSP and an improved AMPC algorithm for maximal matching.
A common approach for designing scalable algorithms for massive data sets is to distribute the computation across, say $k$, machines and process the data using limited communication between them. A particularly appealing framework here is the simultaneous communication model whereby each machine constructs a small representative summary of its own data and one obtains an approximate/exact solution from the union of the representative summaries. If the representative summaries needed for a problem are small, then this results in a communication-efficient and round-optimal protocol. While many fundamental graph problems admit efficient solutions in this model, two prominent problems are notably absent from the list of successes, namely, the maximum matching problem and the minimum vertex cover problem. Indeed, it was shown recently that for both these problems, even achieving a polylog$(n)$ approximation requires essentially sending the entire input graph from each machine. The main insight of our work is that the intractability of matching and vertex cover in the simultaneous communication model is inherently connected to an adversarial partitioning of the underlying graph across machines. We show that when the underlying graph is randomly partitioned across machines, both these problems admit randomized composable coresets of size $widetilde{O}(n)$ that yield an $widetilde{O}(1)$-approximate solution. This results in an $widetilde{O}(1)$-approximation simultaneous protocol for these problems with $widetilde{O}(nk)$ total communication when the input is randomly partitioned across $k$ machines. We further prove the optimality of our results. Finally, by a standard application of composable coresets, our results also imply MapReduce algorithms with the same approximation guarantee in one or two rounds of communication
Reconfiguration schedules, i.e., sequences that gradually transform one solution of a problem to another while always maintaining feasibility, have been extensively studied. Most research has dealt with the decision problem of whether a reconfiguration schedule exists, and the complexity of finding one. A prime example is the reconfiguration of vertex covers. We initiate the study of batched vertex cover reconfiguration, which allows to reconfigure multiple vertices concurrently while requiring that any adversarial reconfiguration order within a batch maintains feasibility. The latter provides robustness, e.g., if the simultaneous reconfiguration of a batch cannot be guaranteed. The quality of a schedule is measured by the number of batches until all nodes are reconfigured, and its cost, i.e., the maximum size of an intermediate vertex cover. To set a baseline for batch reconfiguration, we show that for graphs belonging to one of the classes ${mathsf{cycles, trees, forests, chordal, cactus, eventext{-}holetext{-}free, clawtext{-}free}}$, there are schedules that use $O(varepsilon^{-1})$ batches and incur only a $1+varepsilon$ multiplicative increase in cost over the best sequential schedules. Our main contribution is to compute such batch schedules in $O(varepsilon^{-1}log^* n)$ distributed time, which we also show to be tight. Further, we show that once we step out of these graph classes we face a very different situation. There are graph classes on which no efficient distributed algorithm can obtain the best (or almost best) existing schedule. Moreover, there are classes of bounded degree graphs which do not admit any reconfiguration schedules without incurring a large multiplicative increase in the cost at all.
Maximal independent set (MIS), maximal matching (MM), and $(Delta+1)$-coloring in graphs of maximum degree $Delta$ are among the most prominent algorithmic graph theory problems. They are all solvable by a simple linear-time greedy algorithm and up until very recently this constituted the state-of-the-art. In SODA 2019, Assadi, Chen, and Khanna gave a randomized algorithm for $(Delta+1)$-coloring that runs in $widetilde{O}(nsqrt{n})$ time, which even for moderately dense graphs is sublinear in the input size. The work of Assadi et al. however contained a spoiler for MIS and MM: neither problems provably admits a sublinear-time algorithm in general graphs. In this work, we dig deeper into the possibility of achieving sublinear-time algorithms for MIS and MM. The neighborhood independence number of a graph $G$, denoted by $beta(G)$, is the size of the largest independent set in the neighborhood of any vertex. We identify $beta(G)$ as the ``right parameter to measure the runtime of MIS and MM algorithms: Although graphs of bounded neighborhood independence may be very dense (clique is one example), we prove that carefully chosen variants of greedy algorithms for MIS and MM run in $O(nbeta(G))$ and $O(nlog{n}cdotbeta(G))$ time respectively on any $n$-vertex graph $G$. We complement this positive result by observing that a simple extension of the lower bound of Assadi et.al. implies that $Omega(nbeta(G))$ time is also necessary for any algorithm to either problem for all values of $beta(G)$ from $1$ to $Theta(n)$. We note that our algorithm for MIS is deterministic while for MM we use randomization which we prove is unavoidable: any deterministic algorithm for MM requires $Omega(n^2)$ time even for $beta(G) = 2$.
We introduce and study two natural generalizations of the Connected VertexCover (VC) problem: the $p$-Edge-Connected and $p$-Vertex-Connected VC problem (where $p geq 2$ is a fixed integer). Like Connected VC, both new VC problems are FPT, but do not admit a polynomial kernel unless $NP subseteq coNP/poly$, which is highly unlikely. We prove however that both problems admit time efficient polynomial sized approximate kernelization schemes. We obtain an $O(2^{O(pk)}n^{O(1)})$-time algorithm for the $p$-Edge-Connected VC and an $O(2^{O(k^2)}n^{O(1)})$-time algorithm for the $p$-Vertex-Connected VC. Finally, we describe a $2(p+1)$-approximation algorithm for the $p$-Edge-Connected VC. The proofs for the new VC problems require more sophisticated arguments than for Connected VC. In particular, for the approximation algorithm we use Gomory-Hu trees and for the approximate kernels a result on small-size spanning $p$-vertex/edge-connected subgraph of a $p$-vertex/edge-connected graph obtained independently by Nishizeki and Poljak (1994) and Nagamochi and Ibaraki (1992).
We present a massively parallel algorithm, with near-linear memory per machine, that computes a $(2+varepsilon)$-approximation of minimum-weight vertex cover in $O(loglog d)$ rounds, where $d$ is the average degree of the input graph. Our result fills the key remaining gap in the state-of-the-art MPC algorithms for vertex cover and matching problems; two classic optimization problems, which are duals of each other. Concretely, a recent line of work---by Czumaj et al. [STOC18], Ghaffari et al. [PODC18], Assadi et al. [SODA19], and Gamlath et al. [PODC19]---provides $O(loglog n)$ time algorithms for $(1+varepsilon)$-approximate maximum weight matching as well as for $(2+varepsilon)$-approximate minimum cardinality vertex cover. However, the latter algorithm does not work for the general weighted case of vertex cover, for which the best known algorithm remained at $O(log n)$ time complexity.