Universal Algorithms for Clustering Problems


Published by: Arun Ganesh

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Arun Ganesh - Bruce M. Maggs - Debmalya Panigrahi

Data Structures and Algorithms



This paper presents universal algorithms for clustering problems, including the widely studied $k$-median, $k$-means, and $k$-center objectives. The input is a metric space containing all potential client locations. The algorithm must select $k$ cluster centers such that they are a good solution for any subset of clients that is actually realized. Specifically, we aim for low regret, defined as the maximum over all subsets of the difference between the cost of the algorithm's solution and that of an optimal solution. A universal algorithm's solution $SOL$ for a clustering problem is said to be an $(\alpha, \beta)$-approximation if for all subsets of clients $C$, it satisfies $SOL(C) \leq \alpha \cdot OPT(C) + \beta \cdot MR$, where $OPT(C)$ is the cost of the optimal solution for clients $C$ and $MR$ is the minimum regret achievable by any solution. Our main results are universal algorithms for the standard clustering objectives of $k$-median, $k$-means, and $k$-center that achieve $(O(1), O(1))$-approximations. These results are obtained via a novel framework for universal algorithms using linear programming (LP) relaxations. These results generalize to other $\ell_p$-objectives and the setting where some subset of the clients are fixed. We also give hardness results showing that $(\alpha, \beta)$-approximation is NP-hard if $\alpha$ or $\beta$ is at most a certain constant, even for the widely studied special case of Euclidean metric spaces. This shows that in some sense, $(O(1), O(1))$-approximation is the strongest type of guarantee obtainable for universal clustering.
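To make the definitions of regret and approximation concrete, the following is a minimal brute-force sketch for the $k$-median objective on a toy instance. It is not the paper's LP-based algorithm: it enumerates every client subset and every choice of centers, so it only works for a handful of points. The function names, the three-point line metric, and the assumption that centers are chosen among the client locations are all illustrative choices, not taken from the paper.

```python
from itertools import combinations

def cost(centers, clients, dist):
    """k-median cost: each realized client pays the distance to its nearest center."""
    return sum(min(dist[c][f] for f in centers) for c in clients)

def nonempty_subsets(points):
    """Yield every non-empty subset of the potential client locations."""
    for r in range(1, len(points) + 1):
        yield from combinations(points, r)

def opt(clients, points, k, dist):
    """OPT(C): cost of the best k centers for the realized clients C (brute force)."""
    return min(cost(S, clients, dist) for S in combinations(points, k))

def regret(centers, points, k, dist):
    """Maximum over all client subsets C of SOL(C) - OPT(C)."""
    return max(cost(centers, C, dist) - opt(C, points, k, dist)
               for C in nonempty_subsets(points))

def minimum_regret(points, k, dist):
    """MR: the smallest regret achievable by any set of k centers."""
    return min(regret(S, points, k, dist) for S in combinations(points, k))

def is_approximation(centers, points, k, dist, alpha, beta):
    """Check SOL(C) <= alpha * OPT(C) + beta * MR for every client subset C."""
    mr = minimum_regret(points, k, dist)
    return all(cost(centers, C, dist) <= alpha * opt(C, points, k, dist) + beta * mr
               for C in nonempty_subsets(points))

# Toy instance (an assumption for illustration): three points on a line.
positions = [0, 2, 3]
points = list(range(len(positions)))
dist = [[abs(a - b) for b in positions] for a in positions]

for center in points:
    print(center, regret((center,), points, 1, dist))
print("MR =", minimum_regret(points, 1, dist))
```

On this instance with $k = 1$, the middle point attains the minimum regret, so it is a $(1, 1)$-approximation, while the leftmost point is not.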


Stefanie Jegelka, Suvrit Sra, Arindam Banerjee, 2009

In the past few years powerful generalizations to the Euclidean k-means problem have been made, such as Bregman clustering [7], co-clustering (i.e., simultaneous clustering of rows and columns of an input matrix) [9,18], and tensor clustering [8,34]. Like k-means, these more general problems also suffer from the NP-hardness of the associated optimization. Researchers have developed approximation algorithms of varying degrees of sophistication for k-means, k-medians, and more recently also for Bregman clustering [2]. However, there seem to be no approximation algorithms for Bregman co- and tensor clustering. In this paper we derive the first (to our knowledge) guaranteed methods for these increasingly important clustering settings. Going beyond Bregman divergences, we also prove an approximation factor for tensor clustering with arbitrary separable metrics. Through extensive experiments we evaluate the characteristics of our method, and show that it also has practical impact.

Data Structures and Algorithms, Machine Learning

Approximation Algorithms for Socially Fair Clustering

Yury Makarychev, Ali Vakilian, 2021

We present an $(e^{O(p)} \frac{\log \ell}{\log\log\ell})$-approximation algorithm for socially fair clustering with the $\ell_p$-objective. In this problem, we are given a set of points in a metric space. Each point belongs to one (or several) of $\ell$ groups. The goal is to find a $k$-medians, $k$-means, or, more generally, $\ell_p$-clustering that is simultaneously good for all of the groups. More precisely, we need to find a set of $k$ centers $C$ so as to minimize the maximum over all groups $j$ of $\sum_{u \text{ in group } j} d(u,C)^p$. The socially fair clustering problem was independently proposed by Ghadiri, Samadi, and Vempala [2021] and Abbasi, Bhaskara, and Venkatasubramanian [2021]. Our algorithm improves and generalizes their $O(\ell)$-approximation algorithms for the problem. The natural LP relaxation for the problem has an integrality gap of $\Omega(\ell)$. In order to obtain our result, we introduce a strengthened LP relaxation and show that it has an integrality gap of $\Theta(\frac{\log \ell}{\log\log\ell})$ for a fixed $p$. Additionally, we present a bicriteria approximation algorithm, which generalizes the bicriteria approximation of Abbasi et al. [2021].
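The socially fair objective stated in this abstract can be evaluated directly for a given set of centers. The sketch below only computes that objective (the maximum over groups of the summed $p$-th powers of distances to the nearest center); it does not implement the approximation algorithm or the strengthened LP. The function name, the points on a line, and the two example groups are assumptions made for illustration.

```python
def socially_fair_cost(centers, groups, dist, p):
    """Return max over groups j of sum_{u in group j} d(u, C)^p,
    where d(u, C) is the distance from u to its nearest center in C."""
    return max(
        sum(min(dist(u, c) for c in centers) ** p for u in members)
        for members in groups
    )

# Toy instance (an assumption for illustration): points on the real line.
def line_dist(u, v):
    return abs(u - v)

groups = [[0, 1], [10]]  # group 1 has two points, group 2 has one

print(socially_fair_cost([0], groups, line_dist, p=1))
print(socially_fair_cost([5], groups, line_dist, p=1))
print(socially_fair_cost([4], groups, line_dist, p=2))
```

Placing the single center at 0 serves the first group well but leaves the second group with cost 10, whereas a center at 5 balances the two groups at a worst-case cost of 9 for $p = 1$.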

Data Structures and Algorithms, Machine Learning

Faster Algorithms for Semi-Matching Problems

Jittat Fakcharoenphol, Bundit Laekhanukit, Danupon Nanongkai, 2010

We consider the problem of finding a semi-matching in bipartite graphs, which is also extensively studied under various names in the scheduling literature. We give faster algorithms for both the weighted and unweighted cases. For the weighted case, we give an $O(nm\log n)$-time algorithm, where $n$ is the number of vertices and $m$ is the number of edges, by exploiting the geometric structure of the problem. This improves the classical $O(n^3)$ algorithms by Horn [Operations Research 1973] and Bruno, Coffman and Sethi [Communications of the ACM 1974]. For the unweighted case, the bound can be improved even further. We give a simple divide-and-conquer algorithm which runs in $O(\sqrt{n}m\log n)$ time, improving two previous $O(nm)$-time algorithms by Abraham [MSc thesis, University of Glasgow 2003] and Harvey, Ladner, Lovász and Tamir [WADS 2003 and Journal of Algorithms 2006]. We also extend this algorithm to solve the Balanced Edge Cover problem in $O(\sqrt{n}m\log n)$ time, improving the previous $O(nm)$-time algorithm by Harada, Ono, Sadakane and Yamashita [ISAAC 2008].

Data Structures and Algorithms

Parameterized Algorithms for Partial Cover Problems

Omid Amini, Fedor V. Fomin, Saket Saurabh, 2008

Covering problems are fundamental classical problems in optimization, computer science and complexity theory. Typically an input to these problems is a family of sets over a finite universe and the goal is to cover the elements of the universe with as few sets of the family as possible. The variations of covering problems include well known problems like Set Cover, Vertex Cover, Dominating Set and Facility Location, to name a few. Recently there has been a lot of study on partial covering problems, a natural generalization of covering problems. Here, the goal is not to cover all the elements but to cover a specified number of elements with the minimum number of sets. In this paper we study partial covering problems in graphs in the realm of parameterized complexity. Classical (non-partial) versions of all these problems have been intensively studied in planar graphs and in graphs excluding a fixed graph $H$ as a minor. However, the techniques developed for parameterized versions of non-partial covering problems cannot be applied directly to their partial counterparts. The approach we use to show that various partial covering problems are fixed parameter tractable on planar graphs, graphs of bounded local treewidth and graphs excluding some graph as a minor is quite different from previously known techniques. The main idea behind our approach is the concept of implicit branching. We find the implicit branching technique to be interesting on its own and believe that it can be used for some other problems.
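The difference between full and partial covering can be illustrated with a small exhaustive search. This sketch only demonstrates the problem statement (cover at least a specified number of elements with as few sets as possible); it runs in exponential time and is unrelated to the implicit branching technique of the paper. The function name and the example family of sets are assumptions made for illustration.

```python
from itertools import combinations

def min_partial_cover(sets, t):
    """Smallest number of sets whose union contains at least t elements.

    Returns None when even the union of all sets has fewer than t elements.
    Exhaustive search over sub-families, smallest size first.
    """
    for size in range(len(sets) + 1):
        for chosen in combinations(sets, size):
            if len(set().union(*chosen)) >= t:
                return size
    return None

# Toy family over the universe {1, ..., 6} (an assumption for illustration).
family = [{1, 2, 3}, {3, 4}, {5}, {4, 5, 6}]

print(min_partial_cover(family, 6))  # full cover of the universe
print(min_partial_cover(family, 4))  # partial cover of any four elements
print(min_partial_cover(family, 3))  # partial cover of any three elements
```

Setting `t` to the size of the universe recovers the classical Set Cover question, while smaller values of `t` give the partial variant.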

Data Structures and Algorithms

Approximation Algorithms for Restless Bandit Problems

Sudipto Guha, Kamesh Munagala, Peng Shi, 2009

The restless bandit problem is one of the most well-studied generalizations of the celebrated stochastic multi-armed bandit problem in decision theory. In its ultimate generality, the restless bandit problem is known to be PSPACE-hard to approximate to any non-trivial factor, and little progress has been made despite its importance in modeling activity allocation under uncertainty. We consider a special case that we call Feedback MAB, where the reward obtained by playing each of $n$ independent arms varies according to an underlying on/off Markov process whose exact state is only revealed when the arm is played. The goal is to design a policy for playing the arms in order to maximize the infinite horizon time average expected reward. This problem is also an instance of a Partially Observable Markov Decision Process (POMDP), and is widely studied in wireless scheduling and unmanned aerial vehicle (UAV) routing. Unlike the stochastic MAB problem, the Feedback MAB problem does not admit greedy index-based optimal policies. We develop a novel and general duality-based algorithmic technique that yields a surprisingly simple and intuitive $(2+\epsilon)$-approximate greedy policy for this problem. We then define a general sub-class of restless bandit problems that we term Monotone bandits, for which our policy is a 2-approximation. Our technique is robust enough to handle generalizations of these problems to incorporate various side-constraints such as blocking plays and switching costs. This technique is also of independent interest for other restless bandit problems. By presenting the first (and efficient) $O(1)$-approximations for non-trivial instances of restless bandits as well as of POMDPs, our work initiates the study of approximation algorithms in both these contexts.

Data Structures and Algorithms
