No Arabic abstract
As the scales of data sets expand rapidly in some application scenarios, increasing efforts have been made to develop fast submodular maximization algorithms. This paper presents a currently the most efficient algorithm for maximizing general non-negative submodular objective functions subject to $k$-extendible system constraints. Combining the sampling process and the decreasing threshold strategy, our algorithm Sample Decreasing Threshold Greedy Algorithm (SDTGA) obtains an expected approximation guarantee of ($p-epsilon$) for monotone submodular functions and of ($p(1-p)-epsilon$) for non-monotone cases with expected computational complexity of only $O(frac{pn}{epsilon}lnfrac{r}{epsilon})$, where $r$ is the largest size of the feasible solutions, $0<p leq frac{1}{1+k}$ is the sampling probability and $0< epsilon < p$. If we fix the sampling probability $p$ as $frac{1}{1+k}$, we get the best approximation ratios for both monotone and non-monotone submodular functions which are $(frac{1}{1+k}-epsilon)$ and $(frac{k}{(1+k)^2}-epsilon)$ respectively. While the parameter $epsilon$ exists for the trade-off between the approximation ratio and the time complexity. Therefore, our algorithm can handle larger scale of submodular maximization problems than existing algorithms.
Constrained submodular maximization problems encompass a wide variety of applications, including personalized recommendation, team formation, and revenue maximization via viral marketing. The massive instances occurring in modern day applications can render existing algorithms prohibitively slow, while frequently, those instances are also inherently stochastic. Focusing on these challenges, we revisit the classic problem of maximizing a (possibly non-monotone) submodular function subject to a knapsack constraint. We present a simple randomized greedy algorithm that achieves a $5.83$ approximation and runs in $O(n log n)$ time, i.e., at least a factor $n$ faster than other state-of-the-art algorithms. The robustness of our approach allows us to further transfer it to a stochastic version of the problem. There, we obtain a $9$-approximation to the best adaptive policy, which is the first constant approximation for non-monotone objectives. Experimental evaluation of our algorithms showcases their improved performance on real and synthetic data.
The growing need to deal with massive instances motivates the design of algorithms balancing the quality of the solution with applicability. For the latter, an important measure is the emph{adaptive complexity}, capturing the number of sequential rounds of parallel computation needed. In this work we obtain the first emph{constant factor} approximation algorithm for non-monotone submodular maximization subject to a knapsack constraint with emph{near-optimal} $O(log n)$ adaptive complexity. Low adaptivity by itself, however, is not enough: one needs to account for the total number of function evaluations (or value queries) as well. Our algorithm asks $tilde{O}(n^2)$ value queries, but can be modified to run with only $tilde{O}(n)$ instead, while retaining a low adaptive complexity of $O(log^2n)$. Besides the above improvement in adaptivity, this is also the first emph{combinatorial} approach with sublinear adaptive complexity for the problem and yields algorithms comparable to the state-of-the-art even for the special cases of cardinality constraints or monotone objectives. Finally, we showcase our algorithms applicability on real-world datasets.
We study the problem of maximizing a non-monotone submodular function subject to a cardinality constraint in the streaming model. Our main contribution is a single-pass (semi-)streaming algorithm that uses roughly $O(k / varepsilon^2)$ memory, where $k$ is the size constraint. At the end of the stream, our algorithm post-processes its data structure using any offline algorithm for submodular maximization, and obtains a solution whose approximation guarantee is $frac{alpha}{1+alpha}-varepsilon$, where $alpha$ is the approximation of the offline algorithm. If we use an exact (exponential time) post-processing algorithm, this leads to $frac{1}{2}-varepsilon$ approximation (which is nearly optimal). If we post-process with the algorithm of Buchbinder and Feldman (Math of OR 2019), that achieves the state-of-the-art offline approximation guarantee of $alpha=0.385$, we obtain $0.2779$-approximation in polynomial time, improving over the previously best polynomial-time approximation of $0.1715$ due to Feldman et al. (NeurIPS 2018). It is also worth mentioning that our algorithm is combinatorial and deterministic, which is rare for an algorithm for non-monotone submodular maximization, and enjoys a fast update time of $O(frac{log k + log (1/alpha)}{varepsilon^2})$ per element.
We consider the problem of maximizing the multilinear extension of a submodular function subject a single matroid constraint or multiple packing constraints with a small number of adaptive rounds of evaluation queries. We obtain the first algorithms with low adaptivity for submodular maximization with a matroid constraint. Our algorithms achieve a $1-1/e-epsilon$ approximation for monotone functions and a $1/e-epsilon$ approximation for non-monotone functions, which nearly matches the best guarantees known in the fully adaptive setting. The number of rounds of adaptivity is $O(log^2{n}/epsilon^3)$, which is an exponential speedup over the existing algorithms. We obtain the first parallel algorithm for non-monotone submodular maximization subject to packing constraints. Our algorithm achieves a $1/e-epsilon$ approximation using $O(log(n/epsilon) log(1/epsilon) log(n+m)/ epsilon^2)$ parallel rounds, which is again an exponential speedup in parallel time over the existing algorithms. For monotone functions, we obtain a $1-1/e-epsilon$ approximation in $O(log(n/epsilon)log(m)/epsilon^2)$ parallel rounds. The number of parallel rounds of our algorithm matches that of the state of the art algorithm for solving packing LPs with a linear objective. Our results apply more generally to the problem of maximizing a diminishing returns submodular (DR-submodular) function.
We present a polynomial-time online algorithm for maximizing the conditional value at risk (CVaR) of a monotone stochastic submodular function. Given $T$ i.i.d. samples from an underlying distribution arriving online, our algorithm produces a sequence of solutions that converges to a ($1-1/e$)-approximate solution with a convergence rate of $O(T^{-1/4})$ for monotone continuous DR-submodular functions. Compared with previous offline algorithms, which require $Omega(T)$ space, our online algorithm only requires $O(sqrt{T})$ space. We extend our online algorithm to portfolio optimization for monotone submodular set functions under a matroid constraint. Experiments conducted on real-world datasets demonstrate that our algorithm can rapidly achieve CVaRs that are comparable to those obtained by existing offline algorithms.