A Center in Your Neighborhood: Fairness in Facility Location


Published by: Neil Lutz

Publication date: 2019

Research field: Informatics Engineering

Paper language: English

Authors: Christopher Jung, Sampath Kannan, Neil Lutz

Data Structures and Algorithms, Machine Learning


Abstract (English)

When selecting locations for a set of facilities, standard clustering algorithms may place unfair burden on some individuals and neighborhoods. We formulate a fairness concept that takes local population densities into account. In particular, given $k$ facilities to locate and a population of size $n$, we define the neighborhood radius of an individual $i$ as the minimum radius of a ball centered at $i$ that contains at least $n/k$ individuals. Our objective is to ensure that each individual has a facility within at most a small constant factor of her neighborhood radius. We present several theoretical results: We show that optimizing this factor is NP-hard; we give an approximation algorithm that guarantees a factor of at most 2 in all metric spaces; and we prove matching lower bounds in some metric spaces. We apply a variant of this algorithm to real-world address data, showing that it behaves quite differently from standard clustering algorithms, outperforms them on our objective function, and balances the load between facilities more evenly.
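To make the neighborhood-radius definition concrete, here is a minimal Python sketch. It assumes Euclidean points, assumes that the ball around an individual counts that individual, and uses hypothetical function names. The greedy selection rule is one simple way to reach a factor of 2 under these assumptions and is not claimed to be the authors' exact algorithm: a point becomes a center only if its neighborhood ball is disjoint from the balls of all earlier centers. Each such ball holds at least $n/k$ points, so at most $k$ centers are chosen, and every skipped point lies within twice its own radius of an earlier center.

```python
import math

def neighborhood_radii(points, k):
    """Smallest radius around each point whose closed ball holds >= n/k points.

    Assumption: the point itself counts toward the n/k points in its ball.
    """
    n = len(points)
    m = math.ceil(n / k)  # fewest points a neighborhood ball must contain
    radii = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points)
        radii.append(dists[m - 1])  # distance to the m-th nearest point, self included
    return radii

def greedy_fair_centers(points, k):
    """Hypothetical greedy sketch: scan points by increasing neighborhood radius
    and keep a point as a center if its ball is disjoint from every chosen ball."""
    radii = neighborhood_radii(points, k)
    order = sorted(range(len(points)), key=lambda i: radii[i])
    centers = []
    for i in order:
        # d(x, c) > r(x) + r(c) guarantees the two neighborhood balls are disjoint.
        if all(math.dist(points[i], points[c]) > radii[i] + radii[c] for c in centers):
            centers.append(i)
    return centers, radii

# Illustrative data: a dense block of four points plus four spread-out points.
pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 0), (20, 0), (30, 0), (40, 0)]
centers, radii = greedy_fair_centers(pts, k=4)
print(centers, radii)
```

Points in the dense block get small radii and those in the sparse tail get large ones, so the same absolute distance to a facility counts as a heavier burden in the dense area than in the sparse one.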


Read also

Online Facility Location with Deletions

Marek Cygan, Artur Czumaj, Marcin Mucha (2018)

In this paper we study three previously unstudied variants of the online Facility Location problem, considering an intrinsic scenario in which clients and facilities are not only allowed to arrive in the system but can also depart at any moment. We begin with the study of a natural fully-dynamic online uncapacitated model where clients can be both added and removed. When a client arrives, it has to be assigned either to an existing facility or to a new facility opened at the client's location. However, when a client who is also one of the open facilities is to be removed, our model has to allow reconnecting all clients that were connected to that removed facility. In this model, we present an optimal O(log n_act / log log n_act)-competitive algorithm, where n_act is the number of active clients at the end of the input sequence. Next, we turn our attention to the capacitated Facility Location problem. We first note that if no deletions are allowed, then one can achieve an optimal competitive ratio of O(log n / log log n), where n is the length of the sequence. However, when deletions are allowed, the capacitated version of the problem is significantly more challenging than the uncapacitated one. We show that, using a more sophisticated algorithmic approach, one can still obtain an online O(log m + log c log n)-competitive algorithm for the capacitated Facility Location problem in the fully dynamic model, where m is the number of points in the input metric and c is the capacity of any open facility.

Data Structures and Algorithms

Online Facility Location on Semi-Random Streams

Harry Lang (2017)

In the streaming model, the order of the stream can significantly affect the difficulty of a problem. A $t$-semirandom stream was introduced as an interpolation between random-order ($t=1$) and adversarial-order ($t=n$) streams where an adversary intercepts a random-order stream and can delay up to $t$ elements at a time. IITK Sublinear Open Problem #15 asks to find algorithms whose performance degrades smoothly as $t$ increases. We show that the celebrated online facility location algorithm achieves an expected competitive ratio of $O(\frac{\log t}{\log \log t})$. We present a matching lower bound that any randomized algorithm has an expected competitive ratio of $\Omega(\frac{\log t}{\log \log t})$. We use this result to construct an $O(1)$-approximate streaming algorithm for $k$-median clustering that stores $O(k \log t)$ points and has $O(k \log t)$ worst-case update time. Our technique generalizes to any dissimilarity measure that satisfies a weak triangle inequality, including $k$-means, $M$-estimators, and $\ell_p$ norms. The special case $t=1$ yields an optimal $O(k)$ space algorithm for random-order streams as well as an optimal $O(nk)$ time algorithm in the RAM model, closing a long line of research on this problem.

Data Structures and Algorithms

Individual Fairness for $k$-Clustering

Sepideh Mahabadi, Ali Vakilian (2020)

We give a local search based algorithm for $k$-median and $k$-means (and more generally for any $k$-clustering with $\ell_p$ norm cost function) from the perspective of individual fairness. More precisely, for a point $x$ in a point set $P$ of size $n$, let $r(x)$ be the minimum radius such that the ball of radius $r(x)$ centered at $x$ has at least $n/k$ points from $P$. Intuitively, if a set of $k$ random points are chosen from $P$ as centers, every point $x \in P$ expects to have a center within radius $r(x)$. An individually fair clustering provides such a guarantee for every point $x \in P$. This notion of fairness was introduced in [Jung et al., 2019] where they showed how to get an approximately feasible $k$-clustering with respect to this fairness condition. In this work, we show how to get a bicriteria approximation for fair $k$-clustering: The $k$-median ($k$-means) cost of our solution is within a constant factor of the cost of an optimal fair $k$-clustering, and our solution approximately satisfies the fairness condition (also within a constant factor). Further, we complement our theoretical bounds with empirical evaluation.
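The fairness condition in this abstract can be checked for any proposed set of centers. The sketch below is illustrative only: it assumes Euclidean points, and the function name `fairness_ratio` is hypothetical. It reports the largest ratio between a point's distance to its nearest center and its radius $r(x)$, so a value of at most 1 means every point has a center within $r(x)$, and a value of at most $\alpha$ means the condition holds up to a factor $\alpha$.

```python
import math

def fairness_ratio(points, centers, k):
    """Largest value of d(x, nearest center) / r(x) over all points x.

    r(x) is the smallest radius whose closed ball around x holds >= n/k points
    (x itself is in the ball). `centers` is a list of coordinates.
    """
    n = len(points)
    m = math.ceil(n / k)  # fewest points the ball around x must contain
    worst = 0.0
    for x in points:
        r = sorted(math.dist(x, q) for q in points)[m - 1]
        d = min(math.dist(x, c) for c in centers)
        if r == 0:
            # Degenerate case (e.g. n <= k): only a center at x itself is fair.
            ratio = 0.0 if d == 0 else math.inf
        else:
            ratio = d / r
        worst = max(worst, ratio)
    return worst

# Illustrative data: two groups of three collinear points, k = 2.
pts = [(0, 0), (1, 0), (2, 0), (10, 0), (11, 0), (12, 0)]
print(fairness_ratio(pts, [pts[1], pts[4]], k=2))  # one center per group
print(fairness_ratio(pts, [pts[0], pts[1]], k=2))  # both centers in one group
```

Placing one center in each group keeps the ratio below 1, while putting both centers in the same group leaves the other group far outside its radii.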

Data Structures and Algorithms, Machine Learning

Approximating Soft-Capacitated Facility Location Problem With Uncertainty

Shuxin Cai, Wenguo Yang, Yaohua Tang (2012)

We first show that a better analysis of the algorithm for the Two-Stage Stochastic Facility Location Problem from Srinivasan \cite{sri07} and the algorithm for the Robust Fault Tolerant Facility Location Problem from Byrka et al. \cite{bgs10} yields improved approximation factors of 2.206 and $\alpha+4$, respectively, where $\alpha$ is the maximum number of facilities an adversary can close; these are the best ratios so far. We then present new models for the soft-capacitated facility location problem with uncertainty and design constant factor approximation algorithms to solve them. We devise stochastic and robust approaches to handle the uncertainty incorporated into the original model. Explicitly, in this paper we propose two new problems, named the 2-Stage Soft-Capacitated Facility Location Problem and the Robust Soft-Capacitated Facility Location Problem, and present constant factor approximation algorithms for both. Our method uses reductions between facility location problems and linear-cost models, the randomized thresholding technique of Srinivasan \cite{sri07}, and the filtering and clustering technique of Byrka et al. \cite{bgs10}.

Data Structures and Algorithms

Your fairness may vary: Group fairness of pretrained language models in toxic text classification

Ioana Baldini, Dennis Wei, Karthikeyan Natesan Ramamurthy (2021)

We study the performance-fairness trade-off in more than a dozen fine-tuned LMs for toxic text classification. We empirically show that no blanket statement can be made with respect to the bias of large versus regular versus compressed models. Moreover, we find that focusing on fairness-agnostic performance metrics can lead to models with varied fairness characteristics.

Computation and Language, Machine Learning