No Arabic abstract
The problem of solving linear systems is one of the most fundamental problems in computer science, where given a satisfiable linear system $(A,b)$, for $A in mathbb{R}^{n times n}$ and $b in mathbb{R}^n$, we wish to find a vector $x in mathbb{R}^n$ such that $Ax = b$. The current best algorithms for solving dense linear systems reduce the problem to matrix multiplication, and run in time $O(n^{omega})$. We consider the problem of finding $varepsilon$-approximate solutions to linear systems with respect to the $L_2$-norm, that is, given a satisfiable linear system $(A in mathbb{R}^{n times n}, b in mathbb{R}^n)$, find an $x in mathbb{R}^n$ such that $||Ax - b||_2 leq varepsilon||b||_2$. Our main result is a fine-grained reduction from computing the rank of a matrix to finding $varepsilon$-approximate solutions to linear systems. In particular, if the best known $O(n^omega)$ time algorithm for computing the rank of $n times O(n)$ matrices is optimal (which we conjecture is true), then finding an $varepsilon$-approximate solution to a dense linear system also requires $tilde{Omega}(n^{omega})$ time, even for $varepsilon$ as large as $(1 - 1/text{poly}(n))$. We also prove (under some modified conjectures for the rank-finding problem) optimal hardness of approximation for sparse linear systems, linear systems over positive semidefinite matrices, well-conditioned linear systems, and approximately solving linear systems with respect to the $L_p$-norm, for $p geq 1$. At the heart of our results is a novel reduction from the rank problem to a decision version of the approximate linear systems problem. This reduction preserves properties such as matrix sparsity and bit complexity.
The minimum linear ordering problem (MLOP) seeks to minimize an aggregated cost $f(cdot)$ due to an ordering $sigma$ of the items (say $[n]$), i.e., $min_{sigma} sum_{iin [n]} f(E_{i,sigma})$, where $E_{i,sigma}$ is the set of items that are mapped by $sigma$ to indices at most $i$. This problem has been studied in the literature for various special cases of the cost function $f$, and in a general setting for a submodular or supermodular cost $f$ [ITT2012]. Though MLOP was known to be NP-hard for general submodular functions, it was unknown whether the special case of graphic matroid MLOP (with $f$ being the matroid rank function of a graph) was polynomial-time solvable. Following this motivation, we explore related classes of linear ordering problems, including symmetric submodular MLOP, minimum latency vertex cover, and minimum sum vertex cover. We show that the most special cases of our problem, graphic matroid MLOP and minimum latency vertex cover, are both NP-hard. We further expand the toolkit for approximating MLOP variants: using the theory of principal partitions, we show a $2-frac{1+ell_{f}}{1+|E|}$ approximation to monotone submodular MLOP, where $ell_{f}=frac{f(E)}{max_{xin E}f({x})}$ satisfies $1 leq ell_f leq |E|$. Thus our result improves upon the best known bound of $2-frac{2}{1+|E|}$ by Iwata, Tetali, and Tripathi [ITT2012]. This leads to a $2-frac{1+r(E)}{1+|E|}$ approximation for the matroid MLOP, corresponding to the case when $r$ is the rank function of a given matroid. Finally, we show that MLVC can be $4/3$ approximated, matching the integrality gap of its vanilla LP relaxation.
We initiate the study of fine-grained completeness theorems for exact and approximate optimization in the polynomial-time regime. Inspired by the first completeness results for decision problems in P (Gao, Impagliazzo, Kolokolova, Williams, TALG 2019) as well as the classic class MaxSNP and MaxSNP-completeness for NP optimization problems (Papadimitriou, Yannakakis, JCSS 1991), we define polynomial-time analogues MaxSP and MinSP, which contain a number of natural optimization problems in P, including Maximum Inner Product, general forms of nearest neighbor search and optimization variants of the $k$-XOR problem. Specifically, we define MaxSP as the class of problems definable as $max_{x_1,dots,x_k} #{ (y_1,dots,y_ell) : phi(x_1,dots,x_k, y_1,dots,y_ell) }$, where $phi$ is a quantifier-free first-order property over a given relational structure (with MinSP defined analogously). On $m$-sized structures, we can solve each such problem in time $O(m^{k+ell-1})$. Our results are: - We determine (a sparse variant of) the Maximum/Minimum Inner Product problem as complete under *deterministic* fine-grained reductions: A strongly subquadratic algorithm for Maximum/Minimum Inner Product would beat the baseline running time of $O(m^{k+ell-1})$ for *all* problems in MaxSP/MinSP by a polynomial factor. - This completeness transfers to approximation: Maximum/Minimum Inner Product is also complete in the sense that a strongly subquadratic $c$-approximation would give a $(c+varepsilon)$-approximation for all MaxSP/MinSP problems in time $O(m^{k+ell-1-delta})$, where $varepsilon > 0$ can be chosen arbitrarily small. Combining our completeness with~(Chen, Williams, SODA 2019), we obtain the perhaps surprising consequence that refuting the OV Hypothesis is *equivalent* to giving a $O(1)$-approximation for all MinSP problems in faster-than-$O(m^{k+ell-1})$ time.
Maintaining and updating shortest paths information in a graph is a fundamental problem with many applications. As computations on dense graphs can be prohibitively expensive, and it is preferable to perform the computations on a sparse skeleton of the given graph that roughly preserves the shortest paths information. Spanners and emulators serve this purpose. This paper develops fast dynamic algorithms for sparse spanner and emulator maintenance and provides evidence from fine-grained complexity that these algorithms are tight. Under the popular OMv conjecture, we show that there can be no decremental or incremental algorithm that maintains an $n^{1+o(1)}$ edge (purely additive) $+n^{delta}$-emulator for any $delta<1/2$ with arbitrary polynomial preprocessing time and total update time $m^{1+o(1)}$. Also, under the Combinatorial $k$-Clique hypothesis, any fully dynamic combinatorial algorithm that maintains an $n^{1+o(1)}$ edge $(1+epsilon,n^{o(1)})$-spanner or emulator must either have preprocessing time $mn^{1-o(1)}$ or amortized update time $m^{1-o(1)}$. Both of our conditional lower bounds are tight. As the above fully dynamic lower bound only applies to combinatorial algorithms, we also develop an algebraic spanner algorithm that improves over the $m^{1-o(1)}$ update time for dense graphs. For any constant $epsilonin (0,1]$, there is a fully dynamic algorithm with worst-case update time $O(n^{1.529})$ that whp maintains an $n^{1+o(1)}$ edge $(1+epsilon,n^{o(1)})$-spanner. Our new algebraic techniques and spanner algorithms allow us to also obtain (1) a new fully dynamic algorithm for All-Pairs Shortest Paths (APSP) with update and path query time $O(n^{1.9})$; (2) a fully dynamic $(1+epsilon)$-approximate APSP algorithm with update time $O(n^{1.529})$; (3) a fully dynamic algorithm for near-$2$-approximate Steiner tree maintenance.
Given a graph where every vertex has exactly one labeled token, how can we most quickly execute a given permutation on the tokens? In (sequential) token swapping, the goal is to use the shortest possible sequence of swaps, each of which exchanges the tokens at the two endpoints of an edge of the graph. In parallel token swapping, the goal is to use the fewest rounds, each of which consists of one or more swaps on the edges of a matching. We prove that both of these problems remain NP-hard when the graph is restricted to be a tree. These token swapping problems have been studied by disparate groups of researchers in discrete mathematics, theoretical computer science, robot motion planning, game theory, and engineering. Previous work establishes NP-completeness on general graphs (for both problems); polynomial-time algorithms for simple graph classes such as cliques, stars, paths, and cycles; and constant-factor approximation algorithms in some cases. The two natural cases of sequential and parallel token swapping in trees were first studied over thirty years ago (as sorting with a transposition tree) and over twenty-five years ago (as routing permutations via matchings), yet their complexities were previously unknown. We also show limitations on approximation of sequential token swapping on trees: we identify a broad class of algorithms that encompass all three known polynomial-time algorithms that achieve the best known approximation factor (which is $2$) and show that no such algorithm can achieve an approximation factor less than $2$.
This paper initiates the study of I/O algorithms (minimizing cache misses) from the perspective of fine-grained complexity (conditional polynomial lower bounds). Specifically, we aim to answer why sparse graph problems are so hard, and why the Longest Common Subsequence problem gets a savings of a factor of the size of cache times the length of a cache line, but no more. We take the reductions and techniques from complexity and fine-grained complexity and apply them to the I/O model to generate new (conditional) lower bounds as well as faster algorithms. We also prove the existence of a time hierarchy for the I/O model, which motivates the fine-grained reductions. Using fine-grained reductions, we give an algorithm for distinguishing 2 vs. 3 diameter and radius that runs in $O(|E|^2/(MB))$ cache misses, which for sparse graphs improves over the previous $O(|V|^2/B)$ running time. We give new reductions from radius and diameter to Wiener index and median. We show meaningful reductions between problems that have linear-time solutions in the RAM model. The reductions use low I/O complexity (typically $O(n/B)$), and thus help to finely capture the relationship between I/O linear time $Theta(n/B)$ and RAM linear time $Theta(n)$. We generate new I/O assumptions based on the difficulty of improving sparse graph problem running times in the I/O model. We create conjectures that the current best known algorithms for Single Source Shortest Paths (SSSP), diameter, and radius are optimal. From these I/O-model assumptions, we show that many of the known reductions in the word-RAM model can naturally extend to hold in the I/O model as well (e.g., a lower bound on the I/O complexity of Longest Common Subsequence that matches the best known running time). Finally, we prove an analog of the Time Hierarchy Theorem in the I/O model.