We study the classic set cover problem from the perspective of sub-linear algorithms. Given access to a collection of $m$ sets over $n$ elements in the query model, we show that sub-linear algorithms derived from existing techniques have almost tight query complexities. On the one hand, we first show an adaptation of the streaming algorithm presented in Har-Peled et al. [2016] to the sub-linear query model that returns an $\alpha$-approximate cover using $\tilde{O}(m(n/k)^{1/(\alpha-1)} + nk)$ queries to the input, where $k$ denotes the value of a minimum set cover. We then complement this upper bound by proving that for lower values of $k$, the required number of queries is $\tilde{\Omega}(m(n/k)^{1/(2\alpha)})$, even for estimating the optimal cover size. Moreover, we prove that even checking whether a given collection of sets covers all the elements would require $\Omega(nk)$ queries. These two lower bounds provide strong evidence that the upper bound is almost tight for certain values of the parameter $k$. On the other hand, we show that this bound is not optimal for larger values of the parameter $k$, as there exists a $(1+\varepsilon)$-approximation algorithm with $\tilde{O}(mn/(k\varepsilon^2))$ queries. We show that this bound is essentially tight for sufficiently small constant $\varepsilon$, by establishing a lower bound of $\tilde{\Omega}(mn/k)$ on the query complexity.
We study the generalized min sum set cover (GMSSC) problem, wherein given a collection of hyperedges $E$ with arbitrary covering requirements $k_e$, the goal is to find an ordering of the vertices to minimize the total cover time of the hyperedges; a hyperedge $e$ is considered covered at the first time at which $k_e$ of its vertices appear in the ordering. We give a $4.642$ approximation algorithm for GMSSC, coming close to the best possible bound of $4$, already for the classical special case (with all $k_e=1$) of min sum set cover (MSSC) studied by Feige, Lov\'{a}sz and Tetali, and improving upon the previous best known bound of $12.4$ due to Im, Sviridenko and van der Zwaan. Our algorithm is based on transforming the LP solution by a suitable kernel and applying randomized rounding. This also gives an LP-based $4$ approximation for MSSC. As part of the analysis of our algorithm, we also derive an inequality on the lower tail of a sum of independent Bernoulli random variables, which might be of independent interest and broader utility. Another well-known special case is the min sum vertex cover (MSVC) problem, in which the input hypergraph is a graph and $k_e = 1$ for every edge. We give a $16/9$ approximation for MSVC, and show a matching integrality gap for the natural LP relaxation. This improves upon the previous best $1.999946$ approximation of Barenholz, Feige and Peleg. (The claimed $1.79$ approximation result of Iwata, Tetali and Tripathi for MSVC turned out to have an unfortunate, seemingly unfixable, mistake in it.) Finally, we revisit MSSC and consider the $\ell_p$ norm of the cover times of the hyperedges. Using a dual fitting argument, we show that the natural greedy algorithm achieves tight, up to NP-hardness, approximation guarantees of $(p+1)^{1+1/p}$, for all $p \ge 1$. For $p=1$, this gives yet another proof of the $4$ approximation for MSSC.
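The natural greedy algorithm for MSSC mentioned above can be illustrated with a minimal Python sketch. It assumes the usual reading of "greedy" (repeatedly pick the vertex contained in the most uncovered hyperedges); the function names `greedy_mssc_order`, `cover_times`, and `lp_norm_cost` are hypothetical, ties are broken by sorted vertex order as an arbitrary choice, and nothing here is taken from the paper's own presentation.

```python
from typing import Hashable, Iterable, List, Sequence


def greedy_mssc_order(hyperedges: Sequence[Iterable[Hashable]]) -> List[Hashable]:
    """Order vertices greedily: each step picks the vertex that lies in
    the largest number of still-uncovered hyperedges (all k_e = 1)."""
    edges = [set(e) for e in hyperedges]
    if any(not e for e in edges):
        raise ValueError("every hyperedge must contain at least one vertex")
    uncovered = set(range(len(edges)))
    # Sorted list gives deterministic tie-breaking (an assumption of this sketch).
    remaining = sorted(set().union(*edges))
    order: List[Hashable] = []
    while uncovered:
        best = max(
            remaining,
            key=lambda v: sum(1 for i in uncovered if v in edges[i]),
        )
        order.append(best)
        remaining.remove(best)
        uncovered = {i for i in uncovered if best not in edges[i]}
    return order


def cover_times(order: Sequence[Hashable],
                hyperedges: Sequence[Iterable[Hashable]]) -> List[int]:
    """Cover time of a hyperedge = 1-based position of its first vertex in the order."""
    pos = {v: t for t, v in enumerate(order, start=1)}
    return [min(pos[v] for v in e if v in pos) for e in hyperedges]


def lp_norm_cost(order: Sequence[Hashable],
                 hyperedges: Sequence[Iterable[Hashable]],
                 p: float = 1.0) -> float:
    """l_p norm of the cover-time vector; p = 1 is the MSSC objective."""
    return sum(t ** p for t in cover_times(order, hyperedges)) ** (1.0 / p)


if __name__ == "__main__":
    H = [{1, 2}, {2, 3}, {3, 4}, {2}]
    order = greedy_mssc_order(H)
    print(order, cover_times(order, H), lp_norm_cost(order, H, p=1))
```

The sketch stops as soon as every hyperedge is covered, so the returned ordering may be a prefix of a full vertex ordering; appending the remaining vertices in any order does not change the cost.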
The known linear-time kernelizations for $d$-Hitting Set guarantee linear worst-case running times using a quadratic-size data structure (that is not fully initialized). Getting rid of this data structure, we show that problem kernels of asymptotically optimal size $O(k^d)$ for $d$-Hitting Set are computable in linear time and space. Additionally, we experimentally compare the linear-time kernelizations for $d$-Hitting Set to each other and to a classical data reduction algorithm due to Weihe.
We give a characterization result for the integrality gap of the natural linear programming relaxation for the vertex cover problem. We show that the integrality gap of the standard linear programming relaxation for any graph $G$ equals $\left(2-\frac{2}{\chi^f(G)}\right)$, where $\chi^f(G)$ denotes the fractional chromatic number of $G$.
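As a sanity check of this formula, consider two small cases. This is a worked example that is not taken from the source, and it assumes unit vertex weights. For the complete graph $K_n$ with $n \ge 2$, the LP optimum is $n/2$ and the smallest vertex cover has $n-1$ vertices; for the $5$-cycle $C_5$, the LP optimum is $5/2$ and the smallest vertex cover has $3$ vertices. Using the standard values $\chi^f(K_n) = n$ and $\chi^f(C_5) = 5/2$:

```latex
\[
2 - \frac{2}{\chi^f(K_n)} = 2 - \frac{2}{n} = \frac{n-1}{n/2},
\qquad
2 - \frac{2}{\chi^f(C_5)} = 2 - \frac{4}{5} = \frac{6}{5} = \frac{3}{5/2}.
\]
```

In both cases the ratio of the integral optimum to the LP optimum agrees with the stated expression.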
In the dynamic minimum set cover problem, a challenge is to minimize the update time while guaranteeing close to the optimal $\min(O(\log n), f)$ approximation factor. (Throughout, $m$, $n$, $f$, and $C$ are parameters denoting the maximum number of sets, number of elements, frequency, and the cost range.) In the high-frequency range, when $f=\Omega(\log n)$, this was achieved by a deterministic $O(\log n)$-approximation algorithm with $O(f \log n)$ amortized update time [Gupta et al. STOC17]. In the low-frequency range, the line of work by Gupta et al. [STOC17], Abboud et al. [STOC19], and Bhattacharya et al. [ICALP15, IPCO17, FOCS19] led to a deterministic $(1+\epsilon)f$-approximation algorithm with $O(f \log (Cn)/\epsilon^2)$ amortized update time. In this paper we improve the latter update time and provide the first bounds that subsume (and sometimes improve) the state-of-the-art dynamic vertex cover algorithms. We obtain: 1. $(1+\epsilon)f$-approximation ratio in $O(f\log^2 (Cn)/\epsilon^3)$ worst-case update time: No non-trivial worst-case update time was previously known for dynamic set cover. Our bound subsumes and improves by a logarithmic factor the $O(\log^3 n/\text{poly}(\epsilon))$ worst-case update time for unweighted dynamic vertex cover (i.e., when $f=2$ and $C=1$) by Bhattacharya et al. [SODA17]. 2. $(1+\epsilon)f$-approximation ratio in $O\left((f^2/\epsilon^3)+(f/\epsilon^2) \log C\right)$ amortized update time: This result improves the previous $O(f \log (Cn)/\epsilon^2)$ update time bound for most values of $f$ in the low-frequency range, i.e. whenever $f=o(\log n)$. It is the first that is independent of $m$ and $n$. It subsumes the constant amortized update time of Bhattacharya and Kulkarni [SODA19] for unweighted dynamic vertex cover (i.e., when $f = 2$ and $C = 1$).
The optimization version of the Unique Label Cover problem is at the heart of the Unique Games Conjecture, which has played an important role in the proof of several tight inapproximability results. In recent years, this problem has also been studied extensively from the point of view of parameterized complexity. Cygan et al. [FOCS 2012] proved that this problem is fixed-parameter tractable (FPT), and Wahlström [SODA 2014] gave an FPT algorithm with an improved parameter dependence. Subsequently, Iwata, Wahlström and Yoshida [2014] proved that the edge version of Unique Label Cover can be solved in linear FPT-time; that is, there is an FPT algorithm whose dependence on the input size is linear. However, such an algorithm for the node version of the problem was left as an open problem. In this paper, we resolve this question by presenting the first linear-time FPT algorithm for Node Unique Label Cover.