No Arabic abstract
In the streaming model, the order of the stream can significantly affect the difficulty of a problem. A $t$-semirandom stream was introduced as an interpolation between random-order ($t=1$) and adversarial-order ($t=n$) streams where an adversary intercepts a random-order stream and can delay up to $t$ elements at a time. IITK Sublinear Open Problem #15 asks to find algorithms whose performance degrades smoothly as $t$ increases. We show that the celebrated online facility location algorithm achieves an expected competitive ratio of $O(frac{log t}{log log t})$. We present a matching lower bound that any randomized algorithm has an expected competitive ratio of $Omega(frac{log t}{log log t})$. We use this result to construct an $O(1)$-approximate streaming algorithm for $k$-median clustering that stores $O(k log t)$ points and has $O(k log t)$ worst-case update time. Our technique generalizes to any dissimilarity measure that satisfies a weak triangle inequality, including $k$-means, $M$-estimators, and $ell_p$ norms. The special case $t=1$ yields an optimal $O(k)$ space algorithm for random-order streams as well as an optimal $O(nk)$ time algorithm in the RAM model, closing a long line of research on this problem.
In this paper we study three previously unstudied variants of the online Facility Location problem, considering an intrinsic scenario when the clients and facilities are not only allowed to arrive to the system, but they can also depart at any moment. We begin with the study of a natural fully-dynamic online uncapacitated model where clients can be both added and removed. When a client arrives, then it has to be assigned either to an existing facility or to a new facility opened at the clients location. However, when a client who has been also one of the open facilities is to be removed, then our model has to allow to reconnect all clients that have been connected to that removed facility. In this model, we present an optimal O(log n_act / log log n_act)-competitive algorithm, where n_act is the number of active clients at the end of the input sequence. Next, we turn our attention to the capacitated Facility Location problem. We first note that if no deletions are allowed, then one can achieve an optimal competitive ratio of O(log n/ log log n), where n is the length of the sequence. However, when deletions are allowed, the capacitated version of the problem is significantly more challenging than the uncapacitated one. We show that still, using a more sophisticated algorithmic approach, one can obtain an online O(log m + log c log n)-competitive algorithm for the capacitated Facility Location problem in the fully dynamic model, where m is number of points in the input metric and c is the capacity of any open facility.
In this paper we study the facility location problem in the online with recourse and dynamic algorithm models. In the online with recourse model, clients arrive one by one and our algorithm needs to maintain good solutions at all time steps with only a few changes to the previously made decisions (called recourse). We show that the classic local search technique can lead to a $(1+sqrt{2}+epsilon)$-competitive online algorithm for facility location with only $Oleft(frac{log n}{epsilon}logfrac1epsilonright)$ amortized facility and client recourse. We then turn to the dynamic algorithm model for the problem, where the main goal is to design fast algorithms that maintain good solutions at all time steps. We show that the result for online facility location, combined with the randomized local search technique of Charikar and Guha [10], leads to an $O(1+sqrt{2}+epsilon)$ approximation dynamic algorithm with amortized update time of $tilde O(n)$ in the incremental setting. Notice that the running time is almost optimal, since in general metric space it takes $Omega(n)$ time to specify a new clients position. The approximation factor of our algorithm also matches the best offline analysis of the classic local search algorithm. Finally, we study the fully dynamic model for facility location, where clients can both arrive and depart. Our main result is an $O(1)$-approximation algorithm in this model with $O(|F|)$ preprocessing time and $O(log^3 D)$ amortized update time for the HST metric spaces. Using the seminal results of Bartal [4] and Fakcharoenphol, Rao and Talwar [17], which show that any arbitrary $N$-point metric space can be embedded into a distribution over HSTs such that the expected distortion is at most $O(log N)$, we obtain a $O(log |F|)$ approximation with preprocessing time of $O(|F|^2log |F|)$ and $O(log^3 D)$ amortized update time.
Consider an online facility assignment problem where a set of facilities $F = { f_1, f_2, f_3, cdots, f_{|F|} }$ of equal capacity $l$ is situated on a metric space and customers arrive one by one in an online manner on that space. We assign a customer $c_i$ to a facility $f_j$ before a new customer $c_{i+1}$ arrives. The cost of this assignment is the distance between $c_i$ and $f_j$. The objective of this problem is to minimize the sum of all assignment costs. Recently Ahmed et al. (TCS, 806, pp. 455-467, 2020) studied the problem where the facilities are situated on a line and computed competitive ratio of Algorithm Greedy which assigns the customer to the nearest available facility. They computed competitive ratio of algorithm named Algorithm Optimal-Fill which assigns the new customer considering optimal assignment of all previous customers. They also studied the problem where the facilities are situated on a connected unweighted graph. In this paper we first consider that $F$ is situated on the vertices of a connected unweighted grid graph $G$ of size $r times c$ and customers arrive one by one having positions on the vertices of $G$. We show that Algorithm Greedy has competitive ratio $r times c + r + c$ and Algorithm Optimal-Fill has competitive ratio $O(r times c)$. We later show that the competitive ratio of Algorithm Optimal-Fill is $2|F|$ for any arbitrary graph. Our bound is tight and better than the previous result. We also consider the facilities are distributed arbitrarily on a plane and provide an algorithm for the scenario. We also provide an algorithm that has competitive ratio $(2n-1)$. Finally, we consider a straight line metric space and show that no algorithm for the online facility assignment problem has competitive ratio less than $9.001$.
We first show that a better analysis of the algorithm for The Two-Sage Stochastic Facility Location Problem from Srinivasan cite{sri07} and the algorithm for The Robust Fault Tolerant Facility Location Problem from Byrka et al cite{bgs10} can render improved approximation factors of 2.206 and alpha+4 where alpha is the maximum number an adversary can close, respectively, and which are the best ratios so far. We then present new models for the soft-capacitated facility location problem with uncertainty and design constant factor approximation algorithms to solve them. We devise the stochastic and robust approaches to handle the uncertainty incorporated into the original model. Explicitly, in this paper we propose two new problem, named The 2-Stage Soft-Capacitated Facility Location Problem and The Robust Soft-Capacitated Facility Location Problem respectively, and present constant factor approximation algorithms for them both. Our method uses reductions between facility location problems and linear-cost models, the randomized thresholding technique of Srinivasan cite{sri07} and the filtering and clustering technique of Byrka et al cite{bgs10}.
When selecting locations for a set of facilities, standard clustering algorithms may place unfair burden on some individuals and neighborhoods. We formulate a fairness concept that takes local population densities into account. In particular, given $k$ facilities to locate and a population of size $n$, we define the neighborhood radius of an individual $i$ as the minimum radius of a ball centered at $i$ that contains at least $n/k$ individuals. Our objective is to ensure that each individual has a facility within at most a small constant factor of her neighborhood radius. We present several theoretical results: We show that optimizing this factor is NP-hard; we give an approximation algorithm that guarantees a factor of at most 2 in all metric spaces; and we prove matching lower bounds in some metric spaces. We apply a variant of this algorithm to real-world address data, showing that it is quite different from standard clustering algorithms and outperforms them on our objective function and balances the load between facilities more evenly.