This paper studies load balancing for many-server ($N$ servers) systems. Each server has a buffer of size $b-1$, so it can hold at most one job in service and $b-1$ jobs in the buffer. The service time of a job follows the Coxian-2 distribution. We focus on the steady-state performance of load balancing policies in the heavy-traffic regime such that the normalized load of the system is $\lambda = 1 - N^{-\alpha}$ for $0<\alpha<0.5$. We identify a set of policies that achieve asymptotic zero waiting. The set of policies includes several classical policies such as join-the-shortest-queue (JSQ), join-the-idle-queue (JIQ), idle-one-first (I1F) and power-of-$d$-choices (Po$d$) with $d=O(N^\alpha\log N)$. The proof of the main result is based on Stein's method and state space collapse. A key technical contribution of this paper is the iterative state space collapse approach, which leads to a simple generator approximation when applying Stein's method.
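As a rough illustration of three of the policies named above, the following Python sketch routes a single arriving job. It is not taken from the paper: the function name `dispatch`, the choice to sample without replacement in Po$d$, the fallback of JIQ to a uniformly random server when no server is idle, and the rule that a job sent to a full server is dropped are all assumptions made for the example. I1F is omitted.

```python
import random


def dispatch(queues, b, policy, d=2, rng=random):
    """Route one arriving job; return the chosen server index, or None if dropped.

    queues[i] is the number of jobs at server i (one in service plus those
    waiting), so 0 <= queues[i] <= b. The list is updated in place.
    """
    n = len(queues)
    if policy == "JSQ":
        # Join-the-shortest-queue: compare all N servers (ties go to the lowest index).
        target = min(range(n), key=lambda i: queues[i])
    elif policy == "JIQ":
        # Join-the-idle-queue: use an idle server if one exists,
        # otherwise fall back to a uniformly random server (assumption).
        idle = [i for i in range(n) if queues[i] == 0]
        target = rng.choice(idle) if idle else rng.randrange(n)
    elif policy == "Pod":
        # Power-of-d-choices: sample d distinct servers, join the shortest of them.
        sampled = rng.sample(range(n), d)
        target = min(sampled, key=lambda i: queues[i])
    else:
        raise ValueError(f"unknown policy: {policy}")

    if queues[target] >= b:
        return None  # assumption: a job routed to a full server is dropped
    queues[target] += 1
    return target
```

With `d` equal to the number of servers, the `"Pod"` branch inspects every queue and therefore makes the same choice as `"JSQ"` up to tie-breaking.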
We introduce a general framework for the mean-field analysis of large-scale load-balancing networks with general service distributions. Specifically, we consider a parallel server network that consists of $N$ queues and operates under the $SQ(d)$ load balancing policy, wherein jobs have independent and identical service requirements and each incoming job is routed on arrival to the shortest of $d$ queues that are sampled uniformly at random from the $N$ queues. We introduce a novel state representation and, for a large class of arrival processes, including renewal and time-inhomogeneous Poisson arrivals, and under mild assumptions on the service distribution, show that the mean-field limit of the state, as $N \rightarrow \infty$, can be characterized as the unique solution of a sequence of coupled partial integro-differential equations, which we refer to as the hydrodynamic PDE. We use a numerical scheme to solve the PDE to obtain approximations to the dynamics of large networks and demonstrate the efficacy of these approximations using Monte Carlo simulations. We also illustrate how the PDE can be used to gain insight into network performance.
This paper considers the steady-state performance of load balancing algorithms in a many-server system with distributed queues. The system has $N$ servers, and each server maintains a local queue with buffer size $b-1$, i.e., a server can hold at most one job in service and $b-1$ jobs in the queue. Jobs in the same queue are served in first-in-first-out (FIFO) order. The system is operated in a heavy-traffic regime such that the workload per server is $\lambda = 1 - N^{-\alpha}$ for $0.5\leq \alpha<1$. We identify a set of algorithms such that the steady-state queues have the following universal scaling, where {\em universal} means that it holds for any $\alpha\in[0.5,1)$: (i) the number of busy servers is $\lambda N-o(1)$; (ii) the number of servers with two jobs (one in service and one in queue) is $O(N^{\alpha}\log N)$; and (iii) the number of servers with more than two jobs is $O\left(\frac{1}{N^{r(1-\alpha)-1}}\right)$, where $r$ can be any positive integer independent of $N$. The set of load balancing algorithms that satisfy the sufficient condition includes join-the-shortest-queue (JSQ), idle-one-first (I1F), and power-of-$d$-choices (Po$d$) with $d\geq N^\alpha\log^2 N$. We further argue that the waiting time of such an algorithm is near optimal order-wise.
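To get a feel for the scale of these quantities, the short Python sketch below evaluates the per-server load $\lambda = 1 - N^{-\alpha}$ and the Po$d$ sampling threshold $N^\alpha\log^2 N$ for a concrete system size. The helper name `heavy_traffic_parameters` is hypothetical, and reading $\log$ as the natural logarithm is an assumption; the abstract does not state the base.

```python
import math


def heavy_traffic_parameters(n, alpha):
    """Return (load per server, Pod sampling threshold) for n servers.

    load      = 1 - n^(-alpha)
    threshold = n^alpha * (ln n)^2   (natural log is an assumption)
    """
    load = 1.0 - n ** (-alpha)
    threshold = n ** alpha * math.log(n) ** 2
    return load, threshold


if __name__ == "__main__":
    for n in (10**4, 10**6, 10**8):
        load, threshold = heavy_traffic_parameters(n, 0.5)
        print(f"N={n}: load={load:.6f}, d >= {threshold:.0f}")
```

For $N = 10^4$ and $\alpha = 0.5$ this gives $\lambda = 0.99$ and a threshold of roughly 8483 sampled servers. Under the natural-log assumption, the threshold is therefore close to $N$ itself for systems of this size.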
In this paper we consider neighborhood load balancing in the context of selfish clients. We assume that a network of n processors and m tasks is given. The processors may have different speeds and the tasks may have different weights. Every task is controlled by a selfish user. The objective of each user is to allocate his/her task to a processor with minimum load. We revisit the concurrent probabilistic protocol introduced in [6], which works in sequential rounds. In each round, every task is allowed to query the load of one randomly chosen neighboring processor. If that load is smaller, the task migrates to that processor with a suitably chosen probability. Using techniques from spectral graph theory, we obtain upper bounds on the expected convergence time towards approximate and exact Nash equilibria that are significantly better than the previous results in [6]. We show results for uniform tasks on non-uniform processors and for the general case where the tasks have different weights and the machines have different speeds. To the best of our knowledge, these are the first results for this general setting.
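One round of such a protocol might look like the Python sketch below, written for uniform (unit-weight) tasks on processors with different speeds. The migration probability used here, $(\ell_p - \ell_q)/(2\ell_p)$ for a task on processor $p$ that samples neighbor $q$, is only a placeholder: the abstract says the probability is "suitably chosen" without giving it, so this rule, like the names `run_round`, `assignment`, and `neighbors`, is an assumption of the example and not the rule from [6].

```python
import random


def run_round(assignment, speeds, neighbors, rng):
    """Run one concurrent round and return the new task-to-processor assignment.

    assignment[t] is the processor currently holding task t (unit weights).
    speeds[p] is the speed of processor p; load of p is (#tasks on p) / speeds[p].
    neighbors[p] lists the processors adjacent to p in the network.
    """
    counts = [0] * len(speeds)
    for p in assignment:
        counts[p] += 1
    # All tasks decide based on the loads observed at the start of the round.
    load = [counts[p] / speeds[p] for p in range(len(speeds))]

    new_assignment = list(assignment)
    for t, p in enumerate(assignment):
        q = rng.choice(neighbors[p])  # query one random neighboring processor
        if load[q] < load[p]:
            # Placeholder damping rule (assumption, not from the source):
            # move with probability proportional to the relative load gap.
            prob = (load[p] - load[q]) / (2 * load[p])
            if rng.random() < prob:
                new_assignment[t] = q
    return new_assignment


if __name__ == "__main__":
    ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    tasks = [0] * 20  # all tasks start on processor 0
    rng = random.Random(0)
    for _ in range(30):
        tasks = run_round(tasks, [1, 1, 1, 1], ring, rng)
    print([tasks.count(p) for p in range(4)])
```

Migrating with a probability below one is what keeps concurrent moves from overshooting: if every task that saw a lighter neighbor moved at once, the load could simply oscillate between processors.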
We introduce a new graph problem, the token dropping game, and we show how to solve it efficiently in a distributed setting. We use the token dropping game as a tool to design an efficient distributed algorithm for stable orientations and more generally for locally optimal semi-matchings. The prior work by Czygrinow et al. (DISC 2012) finds a stable orientation in $O(\Delta^5)$ rounds in graphs of maximum degree $\Delta$, while we improve it to $O(\Delta^4)$ and also prove a lower bound of $\Omega(\Delta)$.
We consider the problem of deterministic load balancing of tokens in the discrete model. A set of $n$ processors is connected into a $d$-regular undirected network. In every time step, each processor exchanges some of its tokens with each of its neighbors in the network. The goal is to minimize the discrepancy between the number of tokens on the most-loaded and the least-loaded processor as quickly as possible. Rabani et al. (1998) present a general technique for the analysis of a wide class of discrete load balancing algorithms. Their approach is to characterize the deviation between the actual loads of a discrete balancing algorithm and the distribution generated by a related Markov chain. The Markov chain can also be regarded as the underlying model of a continuous diffusion algorithm. Rabani et al. showed that after time $T = O(\log (Kn)/\mu)$, any algorithm of their class achieves a discrepancy of $O(d\log n/\mu)$, where $\mu$ is the spectral gap of the transition matrix of the graph, and $K$ is the initial load discrepancy in the system. In this work we identify some natural additional conditions on deterministic balancing algorithms, resulting in a class of algorithms reaching a smaller discrepancy. This class contains well-known algorithms, e.g., the Rotor-Router. Specifically, we introduce the notion of cumulatively fair load-balancing algorithms where in any interval of consecutive time steps, the total number of tokens sent out over an edge by a node is the same (up to constants) for all adjacent edges. We prove that algorithms which are cumulatively fair and where every node retains a sufficient part of its load in each step achieve a discrepancy of $O(\min\{d\sqrt{\log n/\mu},d\sqrt{n}\})$ in time $O(T)$. We also show that in general neither of these assumptions may be omitted without increasing discrepancy. 
We then show by a combinatorial potential reduction argument that any cumulatively fair scheme satisfying some additional assumptions achieves a discrepancy of $O(d)$ almost as quickly as the continuous diffusion process. This positive result applies to some of the simplest and most natural discrete load balancing schemes.
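As an illustration of a cumulatively fair scheme, the Python sketch below implements one synchronous step of a rotor-router style balancer. Modeling the retained load through self-loop ports, the particular port order on the ring, and the names `rotor_router_step`, `ports`, and `rotors` are assumptions of this example and are not taken from the paper.

```python
def rotor_router_step(loads, ports, rotors):
    """Perform one synchronous rotor-router step and return the new loads.

    loads[v]  : number of tokens currently on node v.
    ports[v]  : cyclic list of destinations for node v; an entry equal to v
                is a self-loop, i.e. the token stays on v (assumption).
    rotors[v] : index of the next port node v will use; updated in place.
    """
    new_loads = [0] * len(loads)
    for v, tokens in enumerate(loads):
        k = len(ports[v])
        for _ in range(tokens):
            # Send the next token over the port the rotor points to,
            # then advance the rotor in round-robin order.
            new_loads[ports[v][rotors[v]]] += 1
            rotors[v] = (rotors[v] + 1) % k
    return new_loads


def ring_ports(n):
    """Ports for a ring (d = 2) with two self-loops per node: left, self, right, self."""
    return [[(v - 1) % n, v, (v + 1) % n, v] for v in range(n)]


if __name__ == "__main__":
    n = 4
    loads = [16, 0, 0, 0]
    ports = ring_ports(n)
    rotors = [0] * n
    for step in range(1, 6):
        loads = rotor_router_step(loads, ports, rotors)
        print(step, loads, "discrepancy:", max(loads) - min(loads))
```

Because each node hands out tokens in strict round-robin order, the numbers of tokens sent over any two of its ports differ by at most one over any interval of steps. This is the kind of per-edge fairness the definition of cumulative fairness asks for.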