Streaming Algorithms for News and Scientific Literature Recommendation: Submodular Maximization with a d-Knapsack Constraint

88 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Qilian Yu

تاريخ النشر 2016

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Qilian Yu - Easton Li Xu - Shuguang Cui

التعلم الآلي بنى وهياكل البيانات والخوارزميات

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Submodular maximization problems belong to the family of combinatorial optimization problems and enjoy wide applications. In this paper, we focus on the problem of maximizing a monotone submodular function subject to a $d$-knapsack constraint, for which we propose a streaming algorithm that achieves a $left(frac{1}{1+2d}-epsilonright)$-approximation of the optimal value, while it only needs one single pass through the dataset without storing all the data in the memory. In our experiments, we extensively evaluate the effectiveness of our proposed algorithm via two applications: news recommendation and scientific literature recommendation. It is observed that the proposed streaming algorithm achieves both execution speedup and memory saving by several orders of magnitude, compared with existing approaches.

قيم البحث

238 - Georgios Amanatidis , Federico Fusco , Philip Lazos 2021

The growing need to deal with massive instances motivates the design of algorithms balancing the quality of the solution with applicability. For the latter, an important measure is the emph{adaptive complexity}, capturing the number of sequential rou nds of parallel computation needed. In this work we obtain the first emph{constant factor} approximation algorithm for non-monotone submodular maximization subject to a knapsack constraint with emph{near-optimal} $O(log n)$ adaptive complexity. Low adaptivity by itself, however, is not enough: one needs to account for the total number of function evaluations (or value queries) as well. Our algorithm asks $tilde{O}(n^2)$ value queries, but can be modified to run with only $tilde{O}(n)$ instead, while retaining a low adaptive complexity of $O(log^2n)$. Besides the above improvement in adaptivity, this is also the first emph{combinatorial} approach with sublinear adaptive complexity for the problem and yields algorithms comparable to the state-of-the-art even for the special cases of cardinality constraints or monotone objectives. Finally, we showcase our algorithms applicability on real-world datasets.

بنى وهياكل البيانات والخوارزميات التعلم الآلي

A Nearly-linear Time Algorithm for Submodular Maximization with a Knapsack Constraint

122 - Alina Ene , Huy L. Nguyen 2017

We consider the problem of maximizing a monotone submodular function subject to a knapsack constraint. Our main contribution is an algorithm that achieves a nearly-optimal, $1 - 1/e - epsilon$ approximation, using $(1/epsilon)^{O(1/epsilon^4)} n log^ 2{n}$ function evaluations and arithmetic operations. Our algorithm is impractical but theoretically interesting, since it overcomes a fundamental running time bottleneck of the multilinear extension relaxation framework. This is the main approach for obtaining nearly-optimal approximation guarantees for important classes of constraints but it leads to $Omega(n^2)$ running times, since evaluating the multilinear extension is expensive. Our algorithm maintains a fractional solution with only a constant number of entries that are strictly fractional, which allows us to overcome this obstacle.

بنى وهياكل البيانات والخوارزميات

Optimal Streaming Algorithms for Submodular Maximization with Cardinality Constraints

100 - Naor Alaluf , Alina Ene , Moran Feldman 2019

We study the problem of maximizing a non-monotone submodular function subject to a cardinality constraint in the streaming model. Our main contribution is a single-pass (semi-)streaming algorithm that uses roughly $O(k / varepsilon^2)$ memory, where $k$ is the size constraint. At the end of the stream, our algorithm post-processes its data structure using any offline algorithm for submodular maximization, and obtains a solution whose approximation guarantee is $frac{alpha}{1+alpha}-varepsilon$, where $alpha$ is the approximation of the offline algorithm. If we use an exact (exponential time) post-processing algorithm, this leads to $frac{1}{2}-varepsilon$ approximation (which is nearly optimal). If we post-process with the algorithm of Buchbinder and Feldman (Math of OR 2019), that achieves the state-of-the-art offline approximation guarantee of $alpha=0.385$, we obtain $0.2779$-approximation in polynomial time, improving over the previously best polynomial-time approximation of $0.1715$ due to Feldman et al. (NeurIPS 2018). It is also worth mentioning that our algorithm is combinatorial and deterministic, which is rare for an algorithm for non-monotone submodular maximization, and enjoys a fast update time of $O(frac{log k + log (1/alpha)}{varepsilon^2})$ per element.

بنى وهياكل البيانات والخوارزميات

Fast Adaptive Non-Monotone Submodular Maximization Subject to a Knapsack Constraint

102 - Georgios Amanatidis , Federico Fusco , Philip Lazos 2020

Constrained submodular maximization problems encompass a wide variety of applications, including personalized recommendation, team formation, and revenue maximization via viral marketing. The massive instances occurring in modern day applications can render existing algorithms prohibitively slow, while frequently, those instances are also inherently stochastic. Focusing on these challenges, we revisit the classic problem of maximizing a (possibly non-monotone) submodular function subject to a knapsack constraint. We present a simple randomized greedy algorithm that achieves a $5.83$ approximation and runs in $O(n log n)$ time, i.e., at least a factor $n$ faster than other state-of-the-art algorithms. The robustness of our approach allows us to further transfer it to a stochastic version of the problem. There, we obtain a $9$-approximation to the best adaptive policy, which is the first constant approximation for non-monotone objectives. Experimental evaluation of our algorithms showcases their improved performance on real and synthetic data.

بنى وهياكل البيانات والخوارزميات التعلم الآلي

Towards Nearly-linear Time Algorithms for Submodular Maximization with a Matroid Constraint

84 - Alina Ene , Huy L. Nguyen 2018

We consider fast algorithms for monotone submodular maximization subject to a matroid constraint. We assume that the matroid is given as input in an explicit form, and the goal is to obtain the best possible running times for important matroids. We d evelop a new algorithm for a emph{general matroid constraint} with a $1 - 1/e - epsilon$ approximation that achieves a fast running time provided we have a fast data structure for maintaining a maximum weight base in the matroid through a sequence of decrease weight operations. We construct such data structures for graphic matroids and partition matroids, and we obtain the emph{first algorithms} for these classes of matroids that achieve a nearly-optimal, $1 - 1/e - epsilon$ approximation, using a nearly-linear number of function evaluations and arithmetic operations.

بنى وهياكل البيانات والخوارزميات