We study the maximum matching problem in the random-order semi-streaming setting. In this problem, the edges of an arbitrary $n$-vertex graph $G=(V, E)$ arrive in a stream one by one and in a random order. The goal is to have a single pass over the stream, use $n \cdot \mathrm{poly}(\log n)$ space, and output a large matching of $G$. We prove that for an absolute constant $\epsilon_0 > 0$, one can find a $(2/3 + \epsilon_0)$-approximate maximum matching of $G$ using $O(n \log n)$ space with high probability. This breaks the natural boundary of $2/3$ for this problem prevalent in the prior work and resolves an open problem of Bernstein [ICALP20] on whether a $(2/3 + \Omega(1))$-approximation is achievable.
The maximum matching problem in dynamic graphs subject to edge updates (insertions and deletions) has received much attention over the last few years; a multitude of approximation/time tradeoffs were obtained, improving upon the folklore algorithm, which maintains a maximal (and hence $2$-approximate) matching in $O(n)$ worst-case update time in $n$-node graphs. We present the first deterministic algorithm which outperforms the folklore algorithm in terms of {\em both} approximation ratio and worst-case update time. Specifically, we give a $(2-\Omega(1))$-approximate algorithm with $O(\sqrt{n}\sqrt[8]{m})=O(n^{3/4})$ worst-case update time in $n$-node, $m$-edge graphs. For sufficiently small constant $\epsilon>0$, no deterministic $(2+\epsilon)$-approximate algorithm with worst-case update time $O(n^{0.99})$ was known. Our second result is the first deterministic $(2+\epsilon)$-approximate (weighted) matching algorithm with $O_\epsilon(1)\cdot O(\sqrt[4]{m}) = O_\epsilon(1)\cdot O(\sqrt{n})$ worst-case update time.
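The folklore baseline mentioned in the abstract above can be illustrated with a short Python sketch. This is not the paper's algorithm; it only shows one common way to maintain a maximal matching under edge insertions and deletions. The class name `DynamicMaximalMatching` and its methods are hypothetical names chosen for this illustration. The neighbor scan after a matched edge is deleted can touch up to $n-1$ vertices, which is where the $O(n)$ worst-case update time comes from.

```python
from collections import defaultdict


class DynamicMaximalMatching:
    """Illustrative folklore baseline (hypothetical names, not from the paper)."""

    def __init__(self):
        self.adj = defaultdict(set)  # adjacency sets of the current graph
        self.mate = {}               # mate[u] = v for every matched vertex u

    def _try_match(self, u):
        # Scan u's neighbors for a free vertex: up to deg(u) <= n - 1 steps.
        if u in self.mate:
            return
        for w in self.adj[u]:
            if w not in self.mate:
                self.mate[u] = w
                self.mate[w] = u
                return

    def insert(self, u, v):
        if u == v:
            return  # self-loops cannot be matched, so they are ignored
        self.adj[u].add(v)
        self.adj[v].add(u)
        # A new edge violates maximality only if both endpoints are free.
        if u not in self.mate and v not in self.mate:
            self.mate[u] = v
            self.mate[v] = u

    def delete(self, u, v):
        self.adj[u].discard(v)
        self.adj[v].discard(u)
        # Only deleting a matched edge can break maximality.
        if self.mate.get(u) == v:
            del self.mate[u]
            del self.mate[v]
            self._try_match(u)
            self._try_match(v)

    def matching(self):
        return {frozenset((u, v)) for u, v in self.mate.items()}


# Usage on the path 1-2-3-4.
dm = DynamicMaximalMatching()
for edge in [(1, 2), (2, 3), (3, 4)]:
    dm.insert(*edge)
print(dm.matching())  # edges {1,2} and {3,4}
dm.delete(3, 4)
print(dm.matching())  # edge {1,2} only: vertex 3 has no free neighbor left
```

Because every edge always has at least one matched endpoint, the maintained matching is maximal and therefore at least half the size of a maximum matching.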
We prove a lower bound on the space complexity of two-pass semi-streaming algorithms that approximate the maximum matching problem. The lower bound is parameterized by the density of Ruzsa-Szemerédi graphs: any two-pass semi-streaming algorithm for maximum matching has approximation ratio at least $(1- \Omega(\frac{\log{RS(n)}}{\log{n}}))$, where $RS(n)$ denotes the maximum number of induced matchings of size $\Theta(n)$ in any $n$-vertex graph, i.e., the largest density of a Ruzsa-Szemerédi graph. Currently, it is known that $n^{\Omega(1/\!\log\log{n})} \leq RS(n) \leq \frac{n}{2^{O(\log^*{\!(n)})}}$ and closing this (large) gap between upper and lower bounds has remained a notoriously difficult problem in combinatorics. Under the plausible hypothesis that $RS(n) = n^{\Omega(1)}$, our lower bound is the first to rule out small-constant approximation two-pass semi-streaming algorithms for the maximum matching problem, making progress on a longstanding open question in the graph streaming literature.
We study the problem of computing an approximate maximum cardinality matching in the semi-streaming model when edges arrive in a \emph{random} order. In the semi-streaming model, the edges of the input graph $G = (V,E)$ are given as a stream $e_1, \ldots, e_m$, and the algorithm is allowed to make a single pass over this stream while using $O(n \, \textrm{polylog}(n))$ space ($m = |E|$ and $n = |V|$). If the order of edges is adversarial, a simple single-pass greedy algorithm yields a $1/2$-approximation in $O(n)$ space; achieving a better approximation in adversarial streams remains an elusive open question. A line of recent work shows that one can improve upon the $1/2$-approximation if the edges of the stream arrive in a random order. The state of the art for this model is two-fold: Assadi et al. [SODA 2019] show how to compute a $2/3(\sim.66)$-approximate matching, but the space requirement is $O(n^{1.5} \, \textrm{polylog}(n))$. Very recently, Farhadi et al. [SODA 2020] presented an algorithm with the desired space usage of $O(n \, \textrm{polylog}(n))$, but a worse approximation ratio of $6/11(\sim.545)$, or $3/5(=.6)$ in bipartite graphs. In this paper, we present an algorithm that computes a $2/3(\sim.66)$-approximate matching using only $O(n \log(n))$ space, improving upon both results above. We also note that for adversarial streams, a lower bound of Kapralov [SODA 2013] shows that any algorithm that achieves a $1-1/e(\sim.63)$-approximation requires $(n^{1+\Omega(1/\log\log(n))})$ space. Our result for random-order streams is the first to go beyond the adversarial-order lower bound, thus establishing that computing a maximum matching is provably easier in random-order streams.
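The simple single-pass greedy baseline mentioned in the abstract above can be sketched in a few lines of Python. This is only the standard greedy maximal-matching routine, not the paper's $2/3$-approximation algorithm, and the function name `greedy_stream_matching` is a hypothetical name chosen for this illustration. It stores only the set of matched vertices and the matching itself, so the space is $O(n)$.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def greedy_stream_matching(stream: Iterable[Edge]) -> List[Edge]:
    """Single pass: keep an edge iff both of its endpoints are still free."""
    matched: Set[Hashable] = set()  # at most n vertices are ever stored
    matching: List[Edge] = []       # at most n / 2 edges are ever stored
    for u, v in stream:
        if u != v and u not in matched and v not in matched:
            matched.add(u)
            matched.add(v)
            matching.append((u, v))
    return matching


# The path 1-2-3-4 has a maximum matching of size 2.
# A bad arrival order leaves greedy with a single edge (ratio 1/2):
print(greedy_stream_matching([(2, 3), (1, 2), (3, 4)]))  # [(2, 3)]
# A lucky arrival order recovers the optimum:
print(greedy_stream_matching([(1, 2), (3, 4), (2, 3)]))  # [(1, 2), (3, 4)]
```

The output is always a maximal matching, which is why it is at least half the size of a maximum matching regardless of the arrival order.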
We introduce a weighted version of the ranking algorithm by Karp et al. (STOC 1990), and prove a competitive ratio of 0.6534 for the vertex-weighted online bipartite matching problem when online vertices arrive in random order. Our result shows that random arrivals help beat the $1-1/e$ barrier even in the vertex-weighted case. We build on the randomized primal-dual framework by Devanur et al. (SODA 2013) and design a two-dimensional gain sharing function, which depends not only on the rank of the offline vertex, but also on the arrival time of the online vertex. To our knowledge, this is the first competitive ratio strictly larger than $1-1/e$ for an online bipartite matching problem achieved under the randomized primal-dual framework. Our algorithm has a natural interpretation: offline vertices offer a larger portion of their weights to the online vertices as time goes by, and each online vertex matches the neighbor with the highest offer at its arrival.
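For orientation, here is a minimal Python sketch of the classic unweighted Ranking rule of Karp et al. that the abstract above builds on. It is not the weighted variant introduced in the paper: the two-dimensional gain sharing function is not specified in the abstract and is not reproduced here. The function name `ranking` and its parameters are hypothetical names chosen for this illustration.

```python
import random
from typing import Dict, Hashable, Iterable, Optional, Sequence, Tuple


def ranking(
    offline: Sequence[Hashable],
    arrivals: Iterable[Tuple[Hashable, Sequence[Hashable]]],
    seed: Optional[int] = None,
) -> Dict[Hashable, Hashable]:
    """Classic unweighted Ranking: returns a map offline vertex -> online vertex."""
    rng = random.Random(seed)
    order = list(offline)
    rng.shuffle(order)  # one uniformly random ranking, fixed before any arrival
    rank = {v: i for i, v in enumerate(order)}

    match_of: Dict[Hashable, Hashable] = {}
    for online, neighbors in arrivals:
        # Offline neighbors of the arriving vertex that are still unmatched.
        free = [v for v in neighbors if v in rank and v not in match_of]
        if free:
            best = min(free, key=rank.__getitem__)  # smallest rank wins
            match_of[best] = online
    return match_of


# Usage: online vertex "x" sees both offline vertices, "y" sees only "a".
result = ranking(["a", "b"], [("x", ["a", "b"]), ("y", ["a"])], seed=1)
print(result)  # size 1 or 2, depending on the random ranking
```

In the weighted algorithm described in the abstract, the fixed rank comparison is replaced by offers that depend on both the offline vertex's rank and the online vertex's arrival time; the sketch above only captures the unweighted special case.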
In this work, we study longest common substring, pattern matching, and wildcard pattern matching in the asymmetric streaming model. In this streaming model, we have random access to one string and streaming access to the other one. We present streaming algorithms with provable guarantees for these three fundamental problems. In particular, our algorithms for pattern matching improve the upper bound and beat the unconditional lower bounds on the memory of randomized and deterministic streaming algorithms. In addition to this, we present algorithms for wildcard pattern matching in the asymmetric streaming model that have optimal space and time.