No Arabic abstract
We consider the problem of finding textit{semi-matching} in bipartite graphs which is also extensively studied under various names in the scheduling literature. We give faster algorithms for both weighted and unweighted case. For the weighted case, we give an $O(nmlog n)$-time algorithm, where $n$ is the number of vertices and $m$ is the number of edges, by exploiting the geometric structure of the problem. This improves the classical $O(n^3)$ algorithms by Horn [Operations Research 1973] and Bruno, Coffman and Sethi [Communications of the ACM 1974]. For the unweighted case, the bound could be improved even further. We give a simple divide-and-conquer algorithm which runs in $O(sqrt{n}mlog n)$ time, improving two previous $O(nm)$-time algorithms by Abraham [MSc thesis, University of Glasgow 2003] and Harvey, Ladner, Lovasz and Tamir [WADS 2003 and Journal of Algorithms 2006]. We also extend this algorithm to solve the textit{Balance Edge Cover} problem in $O(sqrt{n}mlog n)$ time, improving the previous $O(nm)$-time algorithm by Harada, Ono, Sadakane and Yamashita [ISAAC 2008].
Consider a planar graph $G=(V,E)$ with polynomially bounded edge weight function $w:Eto [0, poly(n)]$. The main results of this paper are NC algorithms for the following problems: - minimum weight perfect matching in $G$, - maximum cardinality and maximum weight matching in $G$ when $G$ is bipartite, - maximum multiple-source multiple-sink flow in $G$ where $c:Eto [1, poly(n)]$ is a polynomially bounded edge capacity function, - minimum weight $f$-factor in $G$ where $f:Vto [1, poly(n)]$, - min-cost flow in $G$ where $c:Eto [1, poly(n)]$ is a polynomially bounded edge capacity function and $b:Vto [1, poly(n)]$ is a polynomially bounded vertex demand function. There have been no known NC algorithms for any of these problems previously (Before this and independent paper by Anari and Vazirani). In order to solve these problems we develop a new relatively simple but versatile framework that is combinatorial in spirit. It handles the combinatorial structure of matchings directly and needs to only know weights of appropriately defined matchings from algebraic subroutines.
The study of approximate matching in the Massively Parallel Computations (MPC) model has recently seen a burst of breakthroughs. Despite this progress, however, we still have a far more limited understanding of maximal matching which is one of the central problems of parallel and distributed computing. All known MPC algorithms for maximal matching either take polylogarithmic time which is considered inefficient, or require a strictly super-linear space of $n^{1+Omega(1)}$ per machine. In this work, we close this gap by providing a novel analysis of an extremely simple algorithm a variant of which was conjectured to work by Czumaj et al. [STOC18]. The algorithm edge-samples the graph, randomly partitions the vertices, and finds a random greedy maximal matching within each partition. We show that this algorithm drastically reduces the vertex degrees. This, among some other results, leads to an $O(log log Delta)$ round algorithm for maximal matching with $O(n)$ space (or even mildly sublinear in $n$ using standard techniques). As an immediate corollary, we get a $2$ approximate minimum vertex cover in essentially the same rounds and space. This is the best possible approximation factor under standard assumptions, culminating a long line of research. It also leads to an improved $O(loglog Delta)$ round algorithm for $1 + varepsilon$ approximate matching. All these results can also be implemented in the congested clique model within the same number of rounds.
This paper presents universal algorithms for clustering problems, including the widely studied $k$-median, $k$-means, and $k$-center objectives. The input is a metric space containing all potential client locations. The algorithm must select $k$ cluster centers such that they are a good solution for any subset of clients that actually realize. Specifically, we aim for low regret, defined as the maximum over all subsets of the difference between the cost of the algorithms solution and that of an optimal solution. A universal algorithms solution $SOL$ for a clustering problem is said to be an $(alpha, beta)$-approximation if for all subsets of clients $C$, it satisfies $SOL(C) leq alpha cdot OPT(C) + beta cdot MR$, where $OPT(C)$ is the cost of the optimal solution for clients $C$ and $MR$ is the minimum regret achievable by any solution. Our main results are universal algorithms for the standard clustering objectives of $k$-median, $k$-means, and $k$-center that achieve $(O(1), O(1))$-approximations. These results are obtained via a novel framework for universal algorithms using linear programming (LP) relaxations. These results generalize to other $ell_p$-objectives and the setting where some subset of the clients are fixed. We also give hardness results showing that $(alpha, beta)$-approximation is NP-hard if $alpha$ or $beta$ is at most a certain constant, even for the widely studied special case of Euclidean metric spaces. This shows that in some sense, $(O(1), O(1))$-approximation is the strongest type of guarantee obtainable for universal clustering.
Let $A$ and $B$ be two point sets in the plane of sizes $r$ and $n$ respectively (assume $r leq n$), and let $k$ be a parameter. A matching between $A$ and $B$ is a family of pairs in $A times B$ so that any point of $A cup B$ appears in at most one pair. Given two positive integers $p$ and $q$, we define the cost of matching $M$ to be $c(M) = sum_{(a, b) in M}|{a-b}|_p^q$ where $|{cdot}|_p$ is the $L_p$-norm. The geometric partial matching problem asks to find the minimum-cost size-$k$ matching between $A$ and $B$. We present efficient algorithms for geometric partial matching problem that work for any powers of $L_p$-norm matching objective: An exact algorithm that runs in $O((n + k^2) {mathop{mathrm{polylog}}} n)$ time, and a $(1 + varepsilon)$-approximation algorithm that runs in $O((n + ksqrt{k}) {mathop{mathrm{polylog}}} n cdot logvarepsilon^{-1})$ time. Both algorithms are based on the primal-dual flow augmentation scheme; the main improvements involve using dynamic data structures to achieve efficient flow augmentations. With similar techniques, we give an exact algorithm for the planar transportation problem running in $O(min{n^2, rn^{3/2}} {mathop{mathrm{polylog}}} n)$ time.
For over a decade now we have been witnessing the success of {em massive parallel computation} (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or Spark. One of the reasons for their success is the fact that these frameworks are able to accurately capture the nature of large-scale computation. In particular, compared to the classic distributed algorithms or PRAM models, these frameworks allow for much more local computation. The fundamental question that arises in this context is though: can we leverage this additional power to obtain even faster parallel algorithms? A prominent example here is the {em maximum matching} problem---one of the most classic graph problems. It is well known that in the PRAM model one can compute a 2-approximate maximum matching in $O(log{n})$ rounds. However, the exact complexity of this problem in the MPC framework is still far from understood. Lattanzi et al. showed that if each machine has $n^{1+Omega(1)}$ memory, this problem can also be solved $2$-approximately in a constant number of rounds. These techniques, as well as the approaches developed in the follow up work, seem though to get stuck in a fundamental way at roughly $O(log{n})$ rounds once we enter the near-linear memory regime. It is thus entirely possible that in this regime, which captures in particular the case of sparse graph computations, the best MPC round complexity matches what one can already get in the PRAM model, without the need to take advantage of the extra local computation power. In this paper, we finally refute that perplexing possibility. That is, we break the above $O(log n)$ round complexity bound even in the case of {em slightly sublinear} memory per machine. In fact, our improvement here is {em almost exponential}: we are able to deliver a $(2+epsilon)$-approximation to maximum matching, for any fixed constant $epsilon>0$, in $O((log log n)^2)$ rounds.