Online Facility Location on Semi-Random Streams


Abstract in English

In the streaming model, the order of the stream can significantly affect the difficulty of a problem. A $t$-semirandom stream was introduced as an interpolation between random-order ($t=1$) and adversarial-order ($t=n$) streams: an adversary intercepts a random-order stream and can delay up to $t$ elements at a time. IITK Sublinear Open Problem #15 asks for algorithms whose performance degrades smoothly as $t$ increases. We show that the celebrated online facility location algorithm achieves an expected competitive ratio of $O(\frac{\log t}{\log \log t})$. We present a matching lower bound: any randomized algorithm has an expected competitive ratio of $\Omega(\frac{\log t}{\log \log t})$. We use this result to construct an $O(1)$-approximate streaming algorithm for $k$-median clustering that stores $O(k \log t)$ points and has $O(k \log t)$ worst-case update time. Our technique generalizes to any dissimilarity measure that satisfies a weak triangle inequality, including $k$-means, $M$-estimators, and $\ell_p$ norms. The special case $t=1$ yields an optimal $O(k)$ space algorithm for random-order streams as well as an optimal $O(nk)$ time algorithm in the RAM model, closing a long line of research on this problem.
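For orientation, the online facility location rule the abstract refers to can be sketched as follows. This is a minimal illustration, assuming the algorithm meant is the standard Meyerson-style randomized rule (open a facility at an arriving point with probability proportional to its distance from the nearest open facility), Euclidean distance, and a uniform facility cost. The names `online_facility_location`, `facility_cost`, and `rng` are hypothetical and do not come from the paper, and the sketch does not model the $t$-semirandom adversary or the $k$-median reduction.

```python
import math
import random


def online_facility_location(stream, facility_cost, rng=None):
    """Meyerson-style online facility location (illustrative sketch).

    Each arriving point opens a new facility at its own location with
    probability min(d / facility_cost, 1), where d is its distance to the
    nearest open facility; otherwise it is assigned to that facility and
    pays d as service cost.
    """
    rng = rng or random.Random()
    facilities = []      # locations of facilities opened so far
    service_cost = 0.0   # sum of assignment distances paid so far
    for point in stream:
        if facilities:
            d = min(math.dist(point, f) for f in facilities)
        else:
            d = math.inf  # no facility yet, so the first point always opens one
        # rng.random() lies in [0, 1), so probability 1 always opens
        # and probability 0 never opens.
        if rng.random() < min(d / facility_cost, 1.0):
            facilities.append(point)
        else:
            service_cost += d
    total_cost = facility_cost * len(facilities) + service_cost
    return facilities, total_cost
```

Under these assumptions, the competitive ratio in the abstract compares the expected `total_cost` with the cost of an optimal offline solution on the same points. Only the arrival order of `stream` changes between the random-order, semirandom, and adversarial settings.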
