No Arabic abstract
Given two strings $S$ and $P$, the Episode Matching problem is to compute the length of the shortest substring of $S$ that contains $P$ as a subsequence. The best known upper bound for this problem is $tilde O(nm)$ by Das et al. (1997), where $n,m$ are the lengths of $S$ and $P$, respectively. Although the problem is well studied and has many applications in data mining, this bound has never been improved. In this paper we show why this is the case by proving that an $O((nm)^{1-epsilon})$ algorithm (even for binary strings) would refute the popular Strong Exponential Time Hypothesis (SETH). The proof is based on a simple reduction from Orthogonal Vectors.
We prove a lower bound on the space complexity of two-pass semi-streaming algorithms that approximate the maximum matching problem. The lower bound is parameterized by the density of Ruzsa-Szemeredi graphs: * Any two-pass semi-streaming algorithm for maximum matching has approximation ratio at least $(1- Omega(frac{log{RS(n)}}{log{n}}))$, where $RS(n)$ denotes the maximum number of induced matchings of size $Theta(n)$ in any $n$-vertex graph, i.e., the largest density of a Ruzsa-Szemeredi graph. Currently, it is known that $n^{Omega(1/!loglog{n})} leq RS(n) leq frac{n}{2^{O(log^*{!(n)})}}$ and closing this (large) gap between upper and lower bounds has remained a notoriously difficult problem in combinatorics. Under the plausible hypothesis that $RS(n) = n^{Omega(1)}$, our lower bound is the first to rule out small-constant approximation two-pass semi-streaming algorithms for the maximum matching problem, making progress on a longstanding open question in the graph streaming literature.
For even $k$, the matchings connectivity matrix $mathbf{M}_k$ encodes which pairs of perfect matchings on $k$ vertices form a single cycle. Cygan et al. (STOC 2013) showed that the rank of $mathbf{M}_k$ over $mathbb{Z}_2$ is $Theta(sqrt 2^k)$ and used this to give an $O^*((2+sqrt{2})^{mathsf{pw}})$ time algorithm for counting Hamiltonian cycles modulo $2$ on graphs of pathwidth $mathsf{pw}$. The same authors complemented their algorithm by an essentially tight lower bound under the Strong Exponential Time Hypothesis (SETH). This bound crucially relied on a large permutation submatrix within $mathbf{M}_k$, which enabled a pattern propagation commonly used in previous related lower bounds, as initiated by Lokshtanov et al. (SODA 2011). We present a new technique for a similar pattern propagation when only a black-box lower bound on the asymptotic rank of $mathbf{M}_k$ is given; no stronger structural insights such as the existence of large permutation submatrices in $mathbf{M}_k$ are needed. Given appropriate rank bounds, our technique yields lower bounds for counting Hamiltonian cycles (also modulo fixed primes $p$) parameterized by pathwidth. To apply this technique, we prove that the rank of $mathbf{M}_k$ over the rationals is $4^k / mathrm{poly}(k)$. We also show that the rank of $mathbf{M}_k$ over $mathbb{Z}_p$ is $Omega(1.97^k)$ for any prime $p eq 2$ and even $Omega(2.15^k)$ for some primes. As a consequence, we obtain that Hamiltonian cycles cannot be counted in time $O^*((6-epsilon)^{mathsf{pw}})$ for any $epsilon>0$ unless SETH fails. This bound is tight due to a $O^*(6^{mathsf{pw}})$ time algorithm by Bodlaender et al. (ICALP 2013). Under SETH, we also obtain that Hamiltonian cycles cannot be counted modulo primes $p eq 2$ in time $O^*(3.97^mathsf{pw})$, indicating that the modulus can affect the complexity in intricate ways.
The minimum degree algorithm is one of the most widely-used heuristics for reducing the cost of solving large sparse systems of linear equations. It has been studied for nearly half a century and has a rich history of bridging techniques from data structures, graph algorithms, and scientific computing. In this paper, we present a simple but novel combinatorial algorithm for computing an exact minimum degree elimination ordering in $O(nm)$ time, which improves on the best known time complexity of $O(n^3)$ and offers practical improvements for sparse systems with small values of $m$. Our approach leverages a careful amortized analysis, which also allows us to derive output-sensitive bounds for the running time of $O(min{msqrt{m^+}, Delta m^+} log n)$, where $m^+$ is the number of unique fill edges and original edges that the algorithm encounters and $Delta$ is the maximum degree of the input graph. Furthermore, we show there cannot exist an exact minimum degree algorithm that runs in $O(nm^{1-varepsilon})$ time, for any $varepsilon > 0$, assuming the strong exponential time hypothesis. This fine-grained reduction goes through the orthogonal vectors problem and uses a new low-degree graph construction called $U$-fillers, which act as pathological inputs and cause any minimum degree algorithm to exhibit nearly worst-case performance. With these two results, we nearly characterize the time complexity of computing an exact minimum degree ordering.
For online matching with the line metric, we present a lower bound of $Omega(log n)$ on the approximation ratio of any online (possibly randomized) algorithm. This beats the previous best lower bound of $Omega(sqrt{log n})$ and matches the known upper bound of $O(log n)$.
We show that an improvement to the best known quantum lower bound for GRAPH-COLLISION problem implies an improvement to the best known lower bound for TRIANGLE problem in the quantum query complexity model. In GRAPH-COLLISION we are given free access to a graph $(V,E)$ and access to a function $f:Vrightarrow {0,1}$ as a black box. We are asked to determine if there exist $(u,v) in E$, such that $f(u)=f(v)=1$. In TRIANGLE we have a black box access to an adjacency matrix of a graph and we have to determine if the graph contains a triangle. For both of these problems the known lower bounds are trivial ($Omega(sqrt{n})$ and $Omega(n)$, respectively) and there is no known matching upper bound.