We consider the following stochastic matching problem on both weighted and unweighted graphs: a graph $G(V, E)$ along with a parameter $p \in (0, 1)$ is given as input. Each edge of $G$ is realized independently with probability $p$. The goal is to select a degree-bounded subgraph $H$ of $G$ (with the degree bound depending only on $p$) such that the expected size/weight of a maximum realized matching of $H$ is close to that of $G$. This model of stochastic matching has attracted significant attention in recent years due to its various applications. The most fundamental open question is the best approximation factor achievable by such algorithms, which are referred to in the literature as non-adaptive algorithms. Prior work has identified breaking the (near) half-approximation barrier as a challenge for both weighted and unweighted graphs. Our main results are as follows: -- We analyze a simple and clean algorithm and show that for unweighted graphs, it finds an (almost) $4\sqrt{2}-5$ ($\approx 0.6568$) approximation by querying $O(\frac{\log (1/p)}{p})$ edges per vertex. This improves over the state-of-the-art $0.5001$-approximation algorithm of Assadi et al. [EC17]. -- We show that the same algorithm achieves a $0.501$ approximation for weighted graphs by querying $O(\frac{\log (1/p)}{p})$ edges per vertex. This is the first algorithm to break the $0.5$ approximation barrier for weighted graphs. It also improves the per-vertex query complexity of the state-of-the-art algorithms by Yamaguchi and Maehara [SODA18] and Behnezhad and Reyhani [EC18]. Our algorithms are fundamentally different from prior work, yet are very simple and natural. For the analysis, we introduce a number of procedures that construct heavy fractional matchings. We consider the new algorithms and our analytical tools to be the main contributions of this paper.
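To make the model concrete, the following Monte Carlo sketch (in Python, assuming the networkx package) realizes each edge independently with probability $p$ and compares the expected maximum matching of a degree-bounded subgraph against that of $G$. The subgraph built here, where each vertex keeps a few random incident edges subject to the degree bound, is only a hypothetical placeholder illustrating the per-vertex query budget, not the algorithm analyzed in the paper.

```python
# Monte Carlo sketch of the non-adaptive stochastic matching model described
# above (assumes networkx). The subgraph H built here is a placeholder, not
# the algorithm analyzed in the paper.
import math
import random
import networkx as nx

def realize(G, p, rng):
    """Keep each edge of G independently with probability p."""
    Gp = nx.Graph()
    Gp.add_nodes_from(G)
    Gp.add_edges_from(e for e in G.edges() if rng.random() < p)
    return Gp

def expected_matching(G, p, trials=200, seed=0):
    """Estimate E[size of a maximum matching of a realization of G]."""
    rng = random.Random(seed)
    return sum(len(nx.max_weight_matching(realize(G, p, rng), maxcardinality=True))
               for _ in range(trials)) / trials

def placeholder_subgraph(G, p, rng):
    """Degree-bounded placeholder: each vertex keeps at most
    t = ceil(log(1/p)/p) random incident edges."""
    t = math.ceil(math.log(1 / p) / p)
    H = nx.Graph()
    H.add_nodes_from(G)
    order = list(G)
    rng.shuffle(order)
    for v in order:
        nbrs = list(G.neighbors(v))
        rng.shuffle(nbrs)
        for u in nbrs:
            if H.degree(v) >= t:
                break
            if H.degree(u) < t:   # never exceed degree t at either endpoint
                H.add_edge(v, u)
    return H

if __name__ == "__main__":
    p = 0.25
    G = nx.gnp_random_graph(80, 0.15, seed=1)
    H = placeholder_subgraph(G, p, random.Random(2))
    print("E[matching in realized G] ~", expected_matching(G, p))
    print("E[matching in realized H] ~", expected_matching(H, p))
```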
Suppose that we are given an arbitrary graph $G=(V, E)$ and know that each edge in $E$ is going to be realized independently with some probability $p$. The goal in the stochastic matching problem is to pick a sparse subgraph $Q$ of $G$ such that the realized edges in $Q$, in expectation, include a matching that is approximately as large as the maximum matching among the realized edges of $G$. The maximum degree of $Q$ can depend on $p$, but not on the size of $G$. This problem has been the subject of extensive study over the years, and the approximation factor has been improved from $0.5$ to $0.5001$ to $0.6568$ and eventually to $2/3$. In this work, we analyze a natural sampling-based algorithm and show that it can obtain all the way up to a $(1-\epsilon)$ approximation, for any constant $\epsilon > 0$. A key component of our analysis, of possible independent interest, is an algorithm that constructs a matching on a stochastic graph which, among other important properties, guarantees that each vertex is matched independently of the vertices that are sufficiently far away. This allows us to bypass a previously known barrier towards achieving a $(1-\epsilon)$ approximation based on the existence of dense Ruzsa-Szemeredi graphs.
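One plausible reading of a "natural sampling-based algorithm" in this model, sketched below under that assumption and without reproducing the paper's choice of parameters or its $(1-\epsilon)$ analysis, is to take $Q$ to be the union of maximum matchings of $R$ independently sampled realizations of $G$; the maximum degree of $Q$ is then at most $R$ by construction.

```python
# Hedged sketch of one sampling-based construction of the subgraph Q (assumes
# networkx): the union of maximum matchings of R independent realizations of G.
# Each round adds at most one edge per vertex, so the maximum degree of Q is at
# most R. The choice of R and the (1 - eps) analysis are not reproduced here.
import random
import networkx as nx

def realize(G, p, rng):
    """Keep each edge of G independently with probability p."""
    Gp = nx.Graph()
    Gp.add_nodes_from(G)
    Gp.add_edges_from(e for e in G.edges() if rng.random() < p)
    return Gp

def sampled_union_subgraph(G, p, R, seed=0):
    """Union of maximum matchings of R sampled realizations of G."""
    rng = random.Random(seed)
    Q = nx.Graph()
    Q.add_nodes_from(G)
    for _ in range(R):
        M = nx.max_weight_matching(realize(G, p, rng), maxcardinality=True)
        Q.add_edges_from(M)
    return Q

if __name__ == "__main__":
    G = nx.gnp_random_graph(80, 0.15, seed=4)
    Q = sampled_union_subgraph(G, p=0.25, R=12)
    print("max degree of Q:", max(deg for _, deg in Q.degree()))
```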
Let $G$ be an $n$-vertex graph with $m$ edges. When queried with a subset $S$ of vertices, a cut query on $G$ returns the number of edges of $G$ that have exactly one endpoint in $S$. We show that there is a bounded-error quantum algorithm that determines all connected components of $G$ after making $O(\log(n)^6)$ many cut queries. In contrast, it follows from results in communication complexity that any randomized algorithm, even one that merely decides whether the graph is connected or not, must make at least $\Omega(n/\log(n))$ many cut queries. We further show that with $O(\log(n)^8)$ many cut queries a quantum algorithm can with high probability output a spanning forest for $G$. En route to proving these results, we design quantum algorithms for learning a graph using cut queries. We show that a quantum algorithm can learn a graph with maximum degree $d$ after $O(d \log(n)^2)$ many cut queries, and can learn a general graph with $O(\sqrt{m} \log(n)^{3/2})$ many cut queries. These two upper bounds are tight up to the poly-logarithmic factors, and compare to $\Omega(dn)$ and $\Omega(m/\log(n))$ lower bounds on the number of cut queries needed by a randomized algorithm for the same problems, respectively. The key ingredients in our results are the Bernstein-Vazirani algorithm, approximate counting with OR queries, and learning sparse vectors from inner products as in compressed sensing.
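The cut-query oracle itself is easy to simulate classically. The sketch below (Python, networkx assumed) also demonstrates the standard identity $e(S,T) = (\mathrm{cut}(S) + \mathrm{cut}(T) - \mathrm{cut}(S \cup T))/2$ for disjoint $S$ and $T$, which turns three cut queries into a count of the edges crossing between two vertex sets; the quantum algorithms from the abstract are not reproduced here.

```python
# Classical simulation of the cut-query oracle (assumes networkx), plus the
# identity e(S,T) = (cut(S) + cut(T) - cut(S union T)) / 2 for disjoint S, T.
import networkx as nx

def cut_query(G, S):
    """Number of edges of G with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for u, v in G.edges() if (u in S) != (v in S))

def cross_edges(G, S, T):
    """Number of edges between disjoint sets S and T, via three cut queries."""
    S, T = set(S), set(T)
    assert not (S & T), "S and T must be disjoint"
    return (cut_query(G, S) + cut_query(G, T) - cut_query(G, S | T)) // 2

if __name__ == "__main__":
    G = nx.gnp_random_graph(50, 0.1, seed=3)
    S, T = set(range(10)), set(range(10, 25))
    naive = sum(1 for u, v in G.edges()
                if (u in S and v in T) or (u in T and v in S))
    assert cross_edges(G, S, T) == naive
    print("edges between S and T:", naive)
```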
The area of computing with uncertainty considers problems where some information about the input elements is uncertain, but can be obtained using queries. For example, instead of the weight of an element, we may be given an interval that is guaranteed to contain the weight, and a query can be performed to reveal the weight. While previous work has considered models where queries are asked either sequentially (adaptive model) or all at once (non-adaptive model), and the goal is to minimize the number of queries that are needed to solve the given problem, we propose and study a new model where $k$ queries can be made in parallel in each round, and the goal is to minimize the number of query rounds. We use competitive analysis and present upper and lower bounds on the number of query rounds required by any algorithm in comparison with the optimal number of query rounds. Given a set of uncertain elements and a family of $m$ subsets of that set, we present an algorithm for determining the value of the minimum of each of the subsets that requires at most $(2+\varepsilon) \cdot \mathrm{opt}_k + \mathrm{O}\left(\frac{1}{\varepsilon} \cdot \lg m\right)$ rounds for every $0<\varepsilon<1$, where $\mathrm{opt}_k$ is the optimal number of rounds, as well as nearly matching lower bounds. For the problem of determining the $i$-th smallest value and identifying all elements with that value in a set of uncertain elements, we give a $2$-round-competitive algorithm. We also show that the problem of sorting a family of sets of uncertain elements admits a $2$-round-competitive algorithm and that this is the best possible.
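The round-based query model can be illustrated with a small simulation for a single set: each element carries an interval containing its hidden value, at most $k$ values are revealed per round, and the process stops once the minimum is certain. The greedy rule below (reveal the $k$ smallest lower bounds each round) is a simple heuristic for illustration only, not the algorithm behind the guarantees stated above.

```python
# Illustrative sketch of the parallel-query round model for the minimum of a
# single set of uncertain elements. Each element has an interval containing
# its hidden true value; up to k values are revealed per round. The greedy
# strategy below is a heuristic, not the paper's round-competitive algorithm.
import math

def min_in_rounds(intervals, true_values, k):
    """intervals: list of (low, high); true_values[i] lies in intervals[i]."""
    revealed = {}                        # index -> revealed true value
    rounds = 0
    while True:
        best = min(revealed.values(), default=math.inf)
        open_idx = [i for i in range(len(intervals)) if i not in revealed]
        # The minimum is certain once no unrevealed interval can go below `best`.
        if all(intervals[i][0] >= best for i in open_idx):
            return best, rounds
        open_idx.sort(key=lambda i: intervals[i][0])   # smallest lower bounds first
        for i in open_idx[:k]:
            revealed[i] = true_values[i]
        rounds += 1

if __name__ == "__main__":
    intervals   = [(1, 9), (2, 3), (4, 8), (0, 7), (5, 6)]
    true_values = [6, 2.5, 5, 4, 5.5]
    print(min_in_rounds(intervals, true_values, k=2))   # -> (2.5, 2)
```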
Online bipartite matching with edge arrivals remained a major open question for a long time until a recent negative result by [Gamlath et al. FOCS 2019], who showed that no online policy is better than the straightforward greedy algorithm, i.e., no online algorithm has a worst-case competitive ratio better than $0.5$. In this work, we consider the bipartite matching problem with edge arrivals in a natural stochastic framework, i.e., the Bayesian setting where each edge of the graph is independently realized according to a known probability distribution. We focus on a natural class of prune & greedy online policies motivated by practical considerations from a multitude of online matching platforms. Any prune & greedy algorithm consists of two stages: first, it decreases the probabilities of some edges in the stochastic instance, and then it runs the greedy algorithm on the pruned graph. We propose prune & greedy algorithms that are $0.552$-competitive on instances that can be pruned to a $2$-regular stochastic bipartite graph, and $0.503$-competitive on arbitrary bipartite graphs. The algorithms and our analysis significantly deviate from prior work. We first obtain an analytically manageable lower bound on the size of the matching, which leads to a nonlinear optimization problem. We further reduce this problem to a continuous optimization with a constant number of parameters that can be solved using standard software tools.
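A minimal sketch of the prune & greedy template, under the assumption of a uniform pruning factor (a placeholder; the paper's pruning rules and the stated competitive ratios are more involved): scale down the edge probabilities, then process edges in arrival order and greedily match any realized edge whose endpoints are both free.

```python
# Sketch of a prune & greedy policy in the Bayesian edge-arrival model: every
# edge e has a known realization probability probs[e]; pruning scales these
# probabilities down by a uniform factor (placeholder choice), and the greedy
# stage matches any realized arriving edge whose endpoints are still free.
import random

def prune_and_greedy(edges, probs, prune=1.0, seed=0):
    """edges: list of (u, v) in arrival order; probs[e]: realization probability."""
    rng = random.Random(seed)
    matched = set()
    matching = []
    for u, v in edges:
        # Realizing with probability prune * probs[e] simulates "realize with
        # probs[e], then keep with probability prune" (the pruning stage).
        if rng.random() >= prune * probs[(u, v)]:
            continue
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

if __name__ == "__main__":
    edges = [("a", "x"), ("b", "x"), ("a", "y"), ("c", "y")]
    probs = {e: 0.6 for e in edges}
    print(prune_and_greedy(edges, probs, prune=0.8))
```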
We consider several types of internal queries: questions about subwords of a text. As the main tool, we develop an optimal data structure for the problem called here internal pattern matching. This data structure provides constant-time answers to queries about occurrences of one subword $x$ in another subword $y$ of a given text, assuming that $|y|=\mathcal{O}(|x|)$, which allows for a constant-space representation of all occurrences. This problem can be viewed as a natural extension of the well-studied pattern matching problem. The data structure has linear size and admits a linear-time construction algorithm. Using the solution to the internal pattern matching problem, we obtain very efficient data structures answering queries about: primitivity of subwords, periods of subwords, general substring compression, and cyclic equivalence of two subwords. All these results improve upon the best previously known counterparts. The linear construction time of our data structure also allows us to improve the algorithm for finding $\delta$-subrepetitions in a text (a more general version of maximal repetitions, also called runs). For any fixed $\delta$ we obtain the first linear-time algorithm, which matches the linear time complexity of the algorithm computing runs. Our data structure has already been used as a part of the efficient solutions for subword suffix rank & selection, as well as substring compression using the Burrows-Wheeler transform composed with run-length encoding.
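To make the query interface concrete, here is a naive baseline (for illustration only) for the internal pattern matching query: given a text $T$ and two subwords $x = T[a..b)$ and $y = T[c..d)$, report the starting positions of all occurrences of $x$ in $y$. When $|y| < 2|x|$, a classical periodicity argument shows that these occurrences form a single arithmetic progression, and for $|y|=\mathcal{O}(|x|)$ a constant number of such progressions, which underlies the constant-space representation mentioned above; the paper's data structure answers such queries in constant time after linear-time preprocessing, unlike this brute-force version.

```python
# Naive baseline for the internal pattern matching query: list all occurrences
# of x = T[a:b] inside y = T[c:d], reported as starting positions in T.
def internal_pattern_matching(T, a, b, c, d):
    x, y = T[a:b], T[c:d]
    if not x or len(y) < len(x):
        return []
    return [c + i for i in range(len(y) - len(x) + 1) if y[i:i + len(x)] == x]

if __name__ == "__main__":
    T = "abababab"
    # x = "abab", y = "bababab": occurrences start at positions 2 and 4 of T,
    # an arithmetic progression with difference equal to the period of x.
    print(internal_pattern_matching(T, 0, 4, 1, 8))   # -> [2, 4]
```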