No Arabic abstract
We give a quasipolynomial time algorithm for the graph matching problem (also known as noisy or robust graph isomorphism) on correlated random graphs. Specifically, for every $gamma>0$, we give a $n^{O(log n)}$ time algorithm that given a pair of $gamma$-correlated $G(n,p)$ graphs $G_0,G_1$ with average degree between $n^{varepsilon}$ and $n^{1/153}$ for $varepsilon = o(1)$, recovers the ground truth permutation $piin S_n$ that matches the vertices of $G_0$ to the vertices of $G_n$ in the way that minimizes the number of mismatched edges. We also give a recovery algorithm for a denser regime, and a polynomial-time algorithm for distinguishing between correlated and uncorrelated graphs. Prior work showed that recovery is information-theoretically possible in this model as long the average degree was at least $log n$, but sub-exponential time algorithms were only known in the dense case (i.e., for $p > n^{-o(1)}$). Moreover, Percolation Graph Matching, which is the most common heuristic for this problem, has been shown to require knowledge of $n^{Omega(1)}$ seeds (i.e., input/output pairs of the permutation $pi$) to succeed in this regime. In contrast our algorithms require no seed and succeed for $p$ which is as low as $n^{o(1)-1}$.
Let $A$ and $B$ be two point sets in the plane of sizes $r$ and $n$ respectively (assume $r leq n$), and let $k$ be a parameter. A matching between $A$ and $B$ is a family of pairs in $A times B$ so that any point of $A cup B$ appears in at most one pair. Given two positive integers $p$ and $q$, we define the cost of matching $M$ to be $c(M) = sum_{(a, b) in M}|{a-b}|_p^q$ where $|{cdot}|_p$ is the $L_p$-norm. The geometric partial matching problem asks to find the minimum-cost size-$k$ matching between $A$ and $B$. We present efficient algorithms for geometric partial matching problem that work for any powers of $L_p$-norm matching objective: An exact algorithm that runs in $O((n + k^2) {mathop{mathrm{polylog}}} n)$ time, and a $(1 + varepsilon)$-approximation algorithm that runs in $O((n + ksqrt{k}) {mathop{mathrm{polylog}}} n cdot logvarepsilon^{-1})$ time. Both algorithms are based on the primal-dual flow augmentation scheme; the main improvements involve using dynamic data structures to achieve efficient flow augmentations. With similar techniques, we give an exact algorithm for the planar transportation problem running in $O(min{n^2, rn^{3/2}} {mathop{mathrm{polylog}}} n)$ time.
We present an $tilde O(m+n^{1.5})$-time randomized algorithm for maximum cardinality bipartite matching and related problems (e.g. transshipment, negative-weight shortest paths, and optimal transport) on $m$-edge, $n$-node graphs. For maximum cardinality bipartite matching on moderately dense graphs, i.e. $m = Omega(n^{1.5})$, our algorithm runs in time nearly linear in the input size and constitutes the first improvement over the classic $O(msqrt{n})$-time [Dinic 1970; Hopcroft-Karp 1971; Karzanov 1973] and $tilde O(n^omega)$-time algorithms [Ibarra-Moran 1981] (where currently $omegaapprox 2.373$). On sparser graphs, i.e. when $m = n^{9/8 + delta}$ for any constant $delta>0$, our result improves upon the recent advances of [Madry 2013] and [Liu-Sidford 2020b, 2020a] which achieve an $tilde O(m^{4/3+o(1)})$ runtime. We obtain these results by combining and advancing recent lines of research in interior point methods (IPMs) and dynamic graph algorithms. First, we simplify and improve the IPM of [v.d.Brand-Lee-Sidford-Song 2020], providing a general primal-dual IPM framework and new sampling-based techniques for handling infeasibility induced by approximate linear system solvers. Second, we provide a simple sublinear-time algorithm for detecting and sampling high-energy edges in electric flows on expanders and show that when combined with recent advances in dynamic expander decompositions, this yields efficient data structures for maintaining the iterates of both [v.d.Brand et al.] and our new IPMs. Combining this general machinery yields a simpler $tilde O(n sqrt{m})$ time algorithm for matching based on the logarithmic barrier function, and our state-of-the-art $tilde O(m+n^{1.5})$ time algorithm for matching based on the [Lee-Sidford 2014] barrier (as regularized in [v.d.Brand et al.]).
We study the problem of learning the causal relationships between a set of observed variables in the presence of latents, while minimizing the cost of interventions on the observed variables. We assume access to an undirected graph $G$ on the observed variables whose edges represent either all direct causal relationships or, less restrictively, a superset of causal relationships (identified, e.g., via conditional independence tests or a domain expert). Our goal is to recover the directions of all causal or ancestral relations in $G$, via a minimum cost set of interventions. It is known that constructing an exact minimum cost intervention set for an arbitrary graph $G$ is NP-hard. We further argue that, conditioned on the hardness of approximate graph coloring, no polynomial time algorithm can achieve an approximation factor better than $Theta(log n)$, where $n$ is the number of observed variables in $G$. To overcome this limitation, we introduce a bi-criteria approximation goal that lets us recover the directions of all but $epsilon n^2$ edges in $G$, for some specified error parameter $epsilon > 0$. Under this relaxed goal, we give polynomial time algorithms that achieve intervention cost within a small constant factor of the optimal. Our algorithms combine work on efficient intervention design and the design of low-cost separating set systems, with ideas from the literature on graph property testing.
This paper studies the problem of recovering the hidden vertex correspondence between two edge-correlated random graphs. We focus on the Gaussian model where the two graphs are complete graphs with correlated Gaussian weights and the ErdH{o}s-Renyi model where the two graphs are subsampled from a common parent ErdH{o}s-Renyi graph $mathcal{G}(n,p)$. For dense graphs with $p=n^{-o(1)}$, we prove that there exists a sharp threshold, above which one can correctly match all but a vanishing fraction of vertices and below which correctly matching any positive fraction is impossible, a phenomenon known as the all-or-nothing phase transition. Even more strikingly, in the Gaussian setting, above the threshold all vertices can be exactly matched with high probability. In contrast, for sparse ErdH{o}s-Renyi graphs with $p=n^{-Theta(1)}$, we show that the all-or-nothing phenomenon no longer holds and we determine the thresholds up to a constant factor. Along the way, we also derive the sharp threshold for exact recovery, sharpening the existing results in ErdH{o}s-Renyi graphs. The proof of the negative results builds upon a tight characterization of the mutual information based on the truncated second-moment computation and an area theorem that relates the mutual information to the integral of the reconstruction error. The positive results follows from a tight analysis of the maximum likelihood estimator that takes into account the cycle structure of the induced permutation on the edges.
Given a graph $G=(V,E)$ with two distinguished vertices $s,tin V$ and an integer parameter $L>0$, an {em $L$-bounded cut} is a subset $F$ of edges (vertices) such that the every path between $s$ and $t$ in $Gsetminus F$ has length more than $L$. The task is to find an $L$-bounded cut of minimum cardinality. Though the problem is very simple to state and has been studied since the beginning of the 70s, it is not much understood yet. The problem is known to be $cal{NP}$-hard to approximate within a small constant factor even for $Lgeq 4$ (for $Lgeq 5$ for the vertex cuts). On the other hand, the best known approximation algorithm for general graphs has approximation ratio only $mathcal{O}({n^{2/3}})$ in the edge case, and $mathcal{O}({sqrt{n}})$ in the vertex case, where $n$ denotes the number of vertices. We show that for planar graphs, it is possible to solve both the edge- and the vertex-version of the problem optimally in time $mathcal{O}(L^{3L}n)$. That is, the problem is fixed parameter tractable (FPT) with respect to $L$ on planar graphs. Furthermore, we show that the problem remains FPT even for bounded genus graphs, a super class of planar graphs. Our second contribution deals with approximations of the vertex version of the problem. We describe an algorithm that for a given a graph $G$, its tree decomposition of treewidth $tau$ and vertices $s$ and $t$ computes a $tau$-approximation of the minimum $L$-bounded $s-t$ vertex cut; if the decomposition is not given, then the approximation ratio is $mathcal{O}(tau sqrt{log tau})$. For graphs with treewidth bounded by $mathcal{O}(n^{1/2-epsilon})$ for any $epsilon>0$, but not by a constant, this is the best approximation in terms of~$n$ that we are aware of.