No Arabic abstract
The growing need to deal with massive instances motivates the design of algorithms balancing the quality of the solution with applicability. For the latter, an important measure is the emph{adaptive complexity}, capturing the number of sequential rounds of parallel computation needed. In this work we obtain the first emph{constant factor} approximation algorithm for non-monotone submodular maximization subject to a knapsack constraint with emph{near-optimal} $O(log n)$ adaptive complexity. Low adaptivity by itself, however, is not enough: one needs to account for the total number of function evaluations (or value queries) as well. Our algorithm asks $tilde{O}(n^2)$ value queries, but can be modified to run with only $tilde{O}(n)$ instead, while retaining a low adaptive complexity of $O(log^2n)$. Besides the above improvement in adaptivity, this is also the first emph{combinatorial} approach with sublinear adaptive complexity for the problem and yields algorithms comparable to the state-of-the-art even for the special cases of cardinality constraints or monotone objectives. Finally, we showcase our algorithms applicability on real-world datasets.
Constrained submodular maximization problems encompass a wide variety of applications, including personalized recommendation, team formation, and revenue maximization via viral marketing. The massive instances occurring in modern day applications can render existing algorithms prohibitively slow, while frequently, those instances are also inherently stochastic. Focusing on these challenges, we revisit the classic problem of maximizing a (possibly non-monotone) submodular function subject to a knapsack constraint. We present a simple randomized greedy algorithm that achieves a $5.83$ approximation and runs in $O(n log n)$ time, i.e., at least a factor $n$ faster than other state-of-the-art algorithms. The robustness of our approach allows us to further transfer it to a stochastic version of the problem. There, we obtain a $9$-approximation to the best adaptive policy, which is the first constant approximation for non-monotone objectives. Experimental evaluation of our algorithms showcases their improved performance on real and synthetic data.
We consider the problem of maximizing a monotone submodular function subject to a knapsack constraint. Our main contribution is an algorithm that achieves a nearly-optimal, $1 - 1/e - epsilon$ approximation, using $(1/epsilon)^{O(1/epsilon^4)} n log^2{n}$ function evaluations and arithmetic operations. Our algorithm is impractical but theoretically interesting, since it overcomes a fundamental running time bottleneck of the multilinear extension relaxation framework. This is the main approach for obtaining nearly-optimal approximation guarantees for important classes of constraints but it leads to $Omega(n^2)$ running times, since evaluating the multilinear extension is expensive. Our algorithm maintains a fractional solution with only a constant number of entries that are strictly fractional, which allows us to overcome this obstacle.
We study the problem of maximizing a non-monotone submodular function subject to a cardinality constraint in the streaming model. Our main contribution is a single-pass (semi-)streaming algorithm that uses roughly $O(k / varepsilon^2)$ memory, where $k$ is the size constraint. At the end of the stream, our algorithm post-processes its data structure using any offline algorithm for submodular maximization, and obtains a solution whose approximation guarantee is $frac{alpha}{1+alpha}-varepsilon$, where $alpha$ is the approximation of the offline algorithm. If we use an exact (exponential time) post-processing algorithm, this leads to $frac{1}{2}-varepsilon$ approximation (which is nearly optimal). If we post-process with the algorithm of Buchbinder and Feldman (Math of OR 2019), that achieves the state-of-the-art offline approximation guarantee of $alpha=0.385$, we obtain $0.2779$-approximation in polynomial time, improving over the previously best polynomial-time approximation of $0.1715$ due to Feldman et al. (NeurIPS 2018). It is also worth mentioning that our algorithm is combinatorial and deterministic, which is rare for an algorithm for non-monotone submodular maximization, and enjoys a fast update time of $O(frac{log k + log (1/alpha)}{varepsilon^2})$ per element.
We consider fast algorithms for monotone submodular maximization subject to a matroid constraint. We assume that the matroid is given as input in an explicit form, and the goal is to obtain the best possible running times for important matroids. We develop a new algorithm for a emph{general matroid constraint} with a $1 - 1/e - epsilon$ approximation that achieves a fast running time provided we have a fast data structure for maintaining a maximum weight base in the matroid through a sequence of decrease weight operations. We construct such data structures for graphic matroids and partition matroids, and we obtain the emph{first algorithms} for these classes of matroids that achieve a nearly-optimal, $1 - 1/e - epsilon$ approximation, using a nearly-linear number of function evaluations and arithmetic operations.
Submodular optimization generalizes many classic problems in combinatorial optimization and has recently found a wide range of applications in machine learning (e.g., feature engineering and active learning). For many large-scale optimization problems, we are often concerned with the adaptivity complexity of an algorithm, which quantifies the number of sequential rounds where polynomially-many independent function evaluations can be executed in parallel. While low adaptivity is ideal, it is not sufficient for a distributed algorithm to be efficient, since in many practical applications of submodular optimization the number of function evaluations becomes prohibitively expensive. Motivated by these applications, we study the adaptivity and query complexity of adaptive submodular optimization. Our main result is a distributed algorithm for maximizing a monotone submodular function with cardinality constraint $k$ that achieves a $(1-1/e-varepsilon)$-approximation in expectation. This algorithm runs in $O(log(n))$ adaptive rounds and makes $O(n)$ calls to the function evaluation oracle in expectation. The approximation guarantee and query complexity are optimal, and the adaptivity is nearly optimal. Moreover, the number of queries is substantially less than in previous works. Last, we extend our results to the submodular cover problem to demonstrate the generality of our algorithm and techniques.