Improved Approximation Algorithms for Individually Fair Clustering


Published by Mustafa Yalçıner

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Authors: Ali Vakilian, Mustafa Yalçıner

Data Structures and Algorithms, Artificial Intelligence, Computers and Society

We consider the $k$-clustering problem with $\ell_p$-norm cost, which includes the $k$-median, $k$-means and $k$-center cost functions, under an individual notion of fairness proposed by Jung et al. [2020]: given a set of points $P$ of size $n$, a set of $k$ centers induces a fair clustering if every point $v \in P$ can find a center among its $n/k$ closest neighbors. Recently, Mahabadi and Vakilian [2020] showed how to get a $(p^{O(p)}, 7)$-bicriteria approximation for the problem of fair $k$-clustering with $\ell_p$-norm cost: every point finds a center within distance at most $7$ times its distance to its $(n/k)$-th closest neighbor, and the $\ell_p$-norm cost of the solution is at most $p^{O(p)}$ times the cost of an optimal fair solution. In this work, for any $\varepsilon > 0$, we present an improved $(16^p + \varepsilon, 3)$-bicriteria approximation for fair $k$-clustering with $\ell_p$-norm cost. To achieve our guarantees, we extend the framework of [Charikar et al., 2002, Swamy, 2016] and devise a $16^p$-approximation algorithm for facility location with $\ell_p$-norm cost under a matroid constraint, which may be of independent interest. In addition, our approach suggests a reduction from our individually fair clustering to a clustering with a group fairness requirement proposed by Kleindessner et al. [2019], which is essentially the median matroid problem [Krishnaswamy et al., 2011].
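The fairness condition in the abstract can be checked directly for a candidate set of centers. The following is a minimal sketch, not the authors' algorithm. It assumes Euclidean distances and assumes that a point counts itself among its $n/k$ closest points, a convention the abstract does not settle. The function names `fairness_radii`, `is_alpha_fair`, and `power_cost` are hypothetical. `power_cost` returns the sum of $p$-th powers of distances; whether to take a $p$-th root afterwards is likewise left open here.

```python
import math

def fairness_radii(points, k):
    """For each point v, the distance to its ceil(n/k)-th closest point.

    Assumption: v itself is counted as its own closest point (distance 0).
    """
    n = len(points)
    m = math.ceil(n / k)
    radii = []
    for v in points:
        dists = sorted(math.dist(v, u) for u in points)
        radii.append(dists[m - 1])
    return radii

def is_alpha_fair(points, centers, k, alpha=1.0):
    """True if every point has a center within alpha times its fairness radius."""
    radii = fairness_radii(points, k)
    return all(
        min(math.dist(v, c) for c in centers) <= alpha * r + 1e-12
        for v, r in zip(points, radii)
    )

def power_cost(points, centers, p):
    """Sum over points of (distance to nearest center) ** p."""
    return sum(min(math.dist(v, c) for c in centers) ** p for v in points)

# Two well-separated groups of three points on a line, k = 2.
pts = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
good = [(1.0,), (11.0,)]   # one center per group
bad = [(0.0,), (1.0,)]     # both centers in the left group

print(is_alpha_fair(pts, good, k=2, alpha=1.0))
print(is_alpha_fair(pts, bad, k=2, alpha=3.0))
print(power_cost(pts, good, p=1))
```

In bicriteria terms, a $(\beta, \alpha)$ guarantee means `is_alpha_fair(..., alpha)` holds while the cost is at most $\beta$ times that of the best solution that is fair with `alpha=1`.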


Yury Makarychev, Ali Vakilian, 2021

We present an $(e^{O(p)} \frac{\log \ell}{\log\log \ell})$-approximation algorithm for socially fair clustering with the $\ell_p$-objective. In this problem, we are given a set of points in a metric space. Each point belongs to one (or several) of $\ell$ groups. The goal is to find a $k$-medians, $k$-means, or, more generally, $\ell_p$-clustering that is simultaneously good for all of the groups. More precisely, we need to find a set of $k$ centers $C$ so as to minimize the maximum over all groups $j$ of $\sum_{u \text{ in group } j} d(u,C)^p$. The socially fair clustering problem was independently proposed by Ghadiri, Samadi, and Vempala [2021] and Abbasi, Bhaskara, and Venkatasubramanian [2021]. Our algorithm improves and generalizes their $O(\ell)$-approximation algorithms for the problem. The natural LP relaxation for the problem has an integrality gap of $\Omega(\ell)$. In order to obtain our result, we introduce a strengthened LP relaxation and show that it has an integrality gap of $\Theta(\frac{\log \ell}{\log\log \ell})$ for a fixed $p$. Additionally, we present a bicriteria approximation algorithm, which generalizes the bicriteria approximation of Abbasi et al. [2021].
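The socially fair objective described above, the maximum over groups $j$ of $\sum_{u \text{ in group } j} d(u,C)^p$, can be evaluated for a fixed set of centers as in the following minimal sketch. It only computes the objective and is not the approximation algorithm from the abstract. Euclidean distance, the function name `socially_fair_cost`, and the representation of groups as lists of point indices are assumptions made for illustration.

```python
import math

def socially_fair_cost(points, groups, centers, p):
    """Max over groups of the sum of d(u, C)**p for points u in the group.

    groups: list of lists of indices into `points`; a point may appear
    in several groups (hypothetical representation).
    """
    def dist_to_centers(u):
        return min(math.dist(u, c) for c in centers)

    return max(
        sum(dist_to_centers(points[i]) ** p for i in group)
        for group in groups
    )

pts = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
groups = [[0, 1], [1, 2]]   # the middle point belongs to both groups

# A center at the origin serves the first group far better than the second.
print(socially_fair_cost(pts, groups, [(0.0, 0.0)], p=1))
# A center at the middle point balances the two groups.
print(socially_fair_cost(pts, groups, [(3.0, 4.0)], p=1))
```

Because the objective takes a maximum rather than a sum over groups, moving the center toward the worse-off group lowers the cost even though it raises the cost for the other group.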

Data Structures and Algorithms, Machine Learning

Fair Coresets and Streaming Algorithms for Fair k-Means Clustering

Melanie Schmidt, Chris Schwiegelshohn, Christian Sohler, 2018

We study fair clustering problems as proposed by Chierichetti et al. (NIPS 2017). Here, points have a sensitive attribute and all clusters in the solution are required to be balanced with respect to it (to counteract any form of data-inherent bias). Previous algorithms for fair clustering do not scale well. We show how to model and compute so-called coresets for fair clustering problems, which can be used to significantly reduce the input data size. We prove that the coresets are composable and show how to compute them in a streaming setting. Furthermore, we propose a variant of Lloyd's algorithm that computes fair clusterings and extend it to a fair k-means++ clustering algorithm. We implement these algorithms and provide empirical evidence that the combination of our approximation algorithms and the coreset construction yields a scalable algorithm for fair k-means clustering.

Data Structures and Algorithms

Improved Approximation Algorithms for Inventory Problems

Thomas Bosman, Neil Olver, 2019

We give new approximation algorithms for the submodular joint replenishment problem and the inventory routing problem, using an iterative rounding approach. In both problems, we are given a set of $N$ items and a discrete time horizon of $T$ days in which given demands for the items must be satisfied. Ordering a set of items incurs a cost according to a set function, with properties depending on the problem under consideration. Demand for an item at time $t$ can be satisfied by an order on any day prior to $t$, but a holding cost is charged for storing the items during the intermediate period; the goal is to minimize the sum of the ordering and holding cost. Our approximation factor for both problems is $O(\log \log \min(N,T))$; this improves exponentially on the previous best results.

Data Structures and Algorithms

Improved Approximation Algorithms for Relay Placement

Alon Efrat, Sandor P. Fekete, Joseph S. B. Mitchell, 2015

In the relay placement problem the input is a set of sensors and a number $r \ge 1$, the communication range of a relay. In the one-tier version of the problem the objective is to place a minimum number of relays so that between every pair of sensors there is a path through sensors and/or relays such that the consecutive vertices of the path are within distance $r$ if both vertices are relays and within distance 1 otherwise. The two-tier version adds the restriction that the path must go through relays, and not through sensors. We present a 3.11-approximation algorithm for the one-tier version and a PTAS for the two-tier version. We also show that the one-tier version admits no PTAS, assuming P $\ne$ NP.

Data Structures and Algorithms

Approximation Algorithms for Bregman Co-clustering and Tensor Clustering

Stefanie Jegelka, Suvrit Sra, Arindam Banerjee, 2009

In the past few years powerful generalizations to the Euclidean k-means problem have been made, such as Bregman clustering [7], co-clustering (i.e., simultaneous clustering of rows and columns of an input matrix) [9,18], and tensor clustering [8,34]. Like k-means, these more general problems also suffer from the NP-hardness of the associated optimization. Researchers have developed approximation algorithms of varying degrees of sophistication for k-means, k-medians, and more recently also for Bregman clustering [2]. However, there seem to be no approximation algorithms for Bregman co- and tensor clustering. In this paper we derive the first (to our knowledge) guaranteed methods for these increasingly important clustering settings. Going beyond Bregman divergences, we also prove an approximation factor for tensor clustering with arbitrary separable metrics. Through extensive experiments we evaluate the characteristics of our method, and show that it also has practical impact.

Data Structures and Algorithms, Machine Learning