In the submodular cover problem, we are given a non-negative monotone submodular function $f$ over a ground set $E$ of items, and the goal is to choose a smallest subset $S \subseteq E$ such that $f(S) = Q$, where $Q = f(E)$. In the stochastic version of the problem, we are given $m$ stochastic items, which are different random variables that independently realize to some item in $E$, and the goal is to find a smallest set of stochastic items whose realization $R$ satisfies $f(R) = Q$. The problem captures as special cases the stochastic set cover problem and, more generally, stochastic covering integer programs. We define an $r$-round adaptive algorithm to be an algorithm that, in each round $k \in [r]$, chooses a permutation of all available items and a threshold $\tau_k$, and realizes items in the order specified by the permutation until the function value is at least $\tau_k$. The permutation for each round $k$ is chosen adaptively based on the realizations in the previous rounds, but the ordering inside each round remains fixed regardless of the realizations seen inside the round. Our main result is that for any integer $r$, there exists a poly-time $r$-round adaptive algorithm for stochastic submodular cover whose expected cost is $\tilde{O}(Q^{1/r})$ times the expected cost of a fully adaptive algorithm. Prior to our work, such a result was not known even for the case of $r=1$ and when $f$ is the coverage function. On the other hand, we show that for any $r$, there exist instances of the stochastic submodular cover problem where no $r$-round adaptive algorithm can achieve better than an $\Omega(Q^{1/r})$ approximation to the expected cost of a fully adaptive algorithm. Our lower bound holds even for the coverage function and for algorithms with unbounded computational power.
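To make the round structure concrete, here is a minimal Python sketch of the $r$-round model described above; the planner `plan` and the sampler `sample` are hypothetical placeholders for the adaptive policy, not the paper's algorithm:

```python
def r_round_adaptive_cover(items, f, Q, r, plan, sample):
    """Skeleton of the r-round adaptive model described above.

    plan(k, realized) -> (ordering of unused stochastic items, tau_k),
    chosen adaptively from earlier rounds' realizations only; the
    ordering is then followed non-adaptively within round k.
    """
    realized, cost = set(), 0
    for k in range(1, r + 1):
        ordering, tau_k = plan(k, frozenset(realized))
        for item in ordering:
            if f(realized) >= tau_k:
                break                   # round k's threshold reached
            realized.add(sample(item))  # pay for realizing this item
            cost += 1
        if f(realized) >= Q:            # full cover; typically tau_r = Q
            return realized, cost
    return realized, cost
```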
Submodular optimization generalizes many classic problems in combinatorial optimization and has recently found a wide range of applications in machine learning (e.g., feature engineering and active learning). For many large-scale optimization problems, we are often concerned with the adaptivity complexity of an algorithm, which quantifies the number of sequential rounds in which polynomially-many independent function evaluations can be executed in parallel. While low adaptivity is ideal, it is not sufficient for a distributed algorithm to be efficient, since in many practical applications of submodular optimization the number of function evaluations becomes prohibitively expensive. Motivated by these applications, we study the adaptivity and query complexity of submodular optimization. Our main result is a distributed algorithm for maximizing a monotone submodular function subject to a cardinality constraint $k$ that achieves a $(1-1/e-\varepsilon)$-approximation in expectation. This algorithm runs in $O(\log(n))$ adaptive rounds and makes $O(n)$ calls to the function evaluation oracle in expectation. The approximation guarantee and query complexity are optimal, and the adaptivity is nearly optimal. Moreover, the number of queries is substantially less than in previous works. Finally, we extend our results to the submodular cover problem to demonstrate the generality of our algorithm and techniques.
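As a schematic illustration of how adaptivity is counted (this sketches only the model, not the paper's algorithm): every query issued within one round depends only on state from earlier rounds, so a round's entire batch of oracle calls can run in parallel. A minimal Python sketch:

```python
from concurrent.futures import ThreadPoolExecutor

def one_adaptive_round(f, S, candidates, threshold):
    """One round of adaptivity: each query in this batch depends only on
    the state S from previous rounds, so all |candidates| oracle calls
    can run in parallel and together count as a single round."""
    base = f(S)
    with ThreadPoolExecutor() as pool:
        gains = list(pool.map(lambda e: f(S | {e}) - base, candidates))
    # survivors: elements whose marginal gain still clears the threshold
    return [e for e, g in zip(candidates, gains) if g >= threshold]
```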
Submodular maximization is a general optimization problem with a wide range of applications in machine learning (e.g., active learning, clustering, and feature selection). In large-scale optimization, the parallel running time of an algorithm is governed by its adaptivity, which measures the number of sequential rounds needed if the algorithm can execute polynomially-many independent oracle queries in parallel. While low adaptivity is ideal, it is not sufficient for an algorithm to be efficient in practice; in many applications of distributed submodular optimization, the number of function evaluations becomes prohibitively expensive. Motivated by these applications, we study the adaptivity and query complexity of submodular maximization. In this paper, we give the first constant-factor approximation algorithm for maximizing a non-monotone submodular function subject to a cardinality constraint $k$ that runs in $O(\log(n))$ adaptive rounds and makes $O(n \log(k))$ oracle queries in expectation. In our empirical study, we use three real-world applications to compare our algorithm with several benchmarks for non-monotone submodular maximization. The results demonstrate that our algorithm finds competitive solutions using significantly fewer rounds and queries.
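For contrast with the low-adaptivity algorithm above, a classical sequential baseline for this problem is random greedy (Buchbinder et al.), which needs $k$ adaptive rounds and $O(nk)$ queries. A minimal sketch (the non-positive-gain skip is a simplification of the original dummy-element padding):

```python
import random

def random_greedy(f, ground_set, k):
    """Sequential random-greedy baseline for non-monotone submodular
    maximization under a cardinality constraint: at each of k steps,
    pick uniformly among the k elements of largest marginal gain.
    Uses k adaptive rounds and O(nk) oracle queries."""
    S = set()
    for _ in range(k):
        top = sorted(ground_set - S, key=lambda e: f(S | {e}) - f(S),
                     reverse=True)[:k]
        e = random.choice(top)
        if f(S | {e}) - f(S) > 0:  # simplification: skip non-positive gains
            S.add(e)
    return S
```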
In this paper, we study the tradeoff between the approximation guarantee and adaptivity for the problem of maximizing a monotone submodular function subject to a cardinality constraint. The adaptivity of an algorithm is the number of sequential rounds of queries it makes to the evaluation oracle of the function, where in every round the algorithm is allowed to make polynomially-many parallel queries. Adaptivity is an important consideration in settings where the objective function is estimated using samples and in applications where adaptivity is the main running time bottleneck. Previous algorithms achieving a nearly-optimal $1 - 1/e - \epsilon$ approximation require $\Omega(n)$ rounds of adaptivity. In this work, we give the first algorithm that achieves a $1 - 1/e - \epsilon$ approximation using $O(\ln{n} / \epsilon^2)$ rounds of adaptivity. The number of function evaluations and additional running time of the algorithm are $O(n\, \mathrm{poly}(\log{n}, 1/\epsilon))$.
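For context, the $1 - 1/e$ benchmark is the guarantee of the classical sequential greedy, whose $k$ rounds of adaptivity the algorithm above compresses to $O(\ln{n}/\epsilon^2)$. The standard derivation:

```latex
% Greedy adds the element of largest marginal gain; by monotonicity and
% submodularity, some element closes a 1/k fraction of the remaining gap:
%   f(S_{i+1}) - f(S_i) \ge (OPT - f(S_i)) / k.
\begin{align*}
  \mathrm{OPT} - f(S_{i+1})
    &\le \left(1 - \tfrac{1}{k}\right)\left(\mathrm{OPT} - f(S_i)\right)\\
  \Longrightarrow\quad
  \mathrm{OPT} - f(S_k)
    &\le \left(1 - \tfrac{1}{k}\right)^{k} \mathrm{OPT}
     \le e^{-1}\,\mathrm{OPT}\\
  \Longrightarrow\quad
  f(S_k) &\ge \left(1 - 1/e\right)\mathrm{OPT}.
\end{align*}
```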
In the minimum cost submodular cover problem (MinSMC), we are given a monotone non-decreasing submodular function $f\colon 2^V \rightarrow \mathbb{Z}^+$, a cost function $c\colon V \rightarrow \mathbb{R}^{+}$, and an integer $k \leq f(V)$, and the goal is to find a subset $A \subseteq V$ of minimum cost such that $f(A) \geq k$. MinSMC has many applications in machine learning and data mining. In this paper, we design a parallel algorithm for MinSMC that obtains a solution with approximation ratio at most $\frac{H(\min\{\Delta,k\})}{1-5\varepsilon}$ with probability $1-3\varepsilon$ in $O(\frac{\log m \log n \log^2 mn}{\varepsilon^4})$ rounds, where $\Delta=\max_{v\in V}f(v)$, $H(\cdot)$ is the harmonic number, $n=f(V)$, $m=|V|$, and $\varepsilon$ is a constant in $(0,\frac{1}{5})$. This is the first paper to obtain a parallel algorithm for the weighted version of the MinSMC problem with an approximation ratio arbitrarily close to $H(\min\{\Delta,k\})$.
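For reference, the $H(\min\{\Delta,k\})$ benchmark is achieved sequentially by the classical cost-effectiveness greedy (Wolsey). A minimal Python sketch, assuming $c$ is a dict of positive costs and $V$ a set (`rate` is an illustrative helper, not from the paper):

```python
def greedy_min_smc(f, c, V, k):
    """Classical sequential greedy for MinSMC: repeatedly add the element
    with the best truncated marginal gain per unit cost. For integer-valued
    monotone submodular f, this achieves an H(min(Delta, k)) ratio."""
    A = set()
    while f(A) < k:
        def rate(v):  # truncate coverage at the target k
            return (min(f(A | {v}), k) - f(A)) / c[v]
        v = max(V - A, key=rate)
        if rate(v) <= 0:
            break     # cannot happen when k <= f(V) and f is monotone
        A.add(v)
    return A
```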
Given a metric $(V,d)$ and a $\textsf{root} \in V$, the classic $\textsf{$k$-TSP}$ problem is to find a tour originating at the $\textsf{root}$ of minimum length that visits at least $k$ nodes in $V$. In this work, motivated by applications where the input to an optimization problem is uncertain, we study two stochastic versions of $\textsf{$k$-TSP}$.