On system-wide safety staffing of large-scale parallel server networks


Abstract in English

We introduce a system-wide safety staffing (SWSS) parameter for multiclass multi-pool networks of any tree topology, Markovian or non-Markovian, in the Halfin-Whitt regime. This parameter can be regarded as the optimal reallocation of the capacity fluctuations (positive or negative) of order $\sqrt{n}$ when each server pool employs a square-root staffing rule. We provide an explicit form of the SWSS as a function of the system parameters, which is derived using a graph-theoretic approach based on Gaussian elimination. For Markovian networks, we give an equivalent characterization of the SWSS parameter via the drift parameters of the limiting diffusion. We show that if the SWSS parameter is negative, the limiting diffusion and the diffusion-scaled queueing processes are transient under any Markov control, and that they cannot have a stationary distribution when this parameter is zero. If it is positive, we show that the diffusion-scaled queueing processes are uniformly stabilizable, that is, there exists a scheduling policy under which the stationary distributions of the controlled processes are tight over the size of the network. In addition, there exists a control under which the limiting controlled diffusion is exponentially ergodic. Thus we have identified a necessary and sufficient condition for the uniform stabilizability of such networks in the Halfin-Whitt regime. We use a constant control resulting from the leaf elimination algorithm to stabilize the limiting controlled diffusion, while a family of easily computable Markov scheduling policies is used to stabilize the diffusion-scaled processes. Finally, we show that under these controls the processes are exponentially ergodic and the stationary distributions have exponential tails.
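
For readers unfamiliar with the square-root staffing rule mentioned above, the following is a minimal sketch of its classical single-pool form, in which the number of servers is the offered load plus a safety term proportional to its square root. It is not the SWSS formula of the paper, which aggregates such terms across a tree network. The function name `square_root_staffing` and the parameter names `arrival_rate`, `service_rate`, and `beta` are hypothetical, chosen only for illustration.

```python
import math


def square_root_staffing(arrival_rate: float, service_rate: float, beta: float) -> int:
    """Classical single-pool square-root staffing rule (illustrative only).

    The offered load is R = arrival_rate / service_rate, and the pool is
    staffed with R + beta * sqrt(R) servers, rounded up. A positive beta adds
    safety capacity of order sqrt(R); a negative beta leaves a deficit of the
    same order.
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("arrival_rate and service_rate must be positive")
    offered_load = arrival_rate / service_rate
    return math.ceil(offered_load + beta * math.sqrt(offered_load))


if __name__ == "__main__":
    # Offered load of 100 with one standard "unit" of safety staffing.
    print(square_root_staffing(100.0, 1.0, 1.0))
    # The same pool with a capacity deficit of order sqrt(R).
    print(square_root_staffing(100.0, 1.0, -0.5))
```

In a multi-pool network each pool may carry its own, possibly negative, safety term of this kind; the abstract's claim is that stabilizability depends on a single system-wide combination of them rather than on the sign of each term individually.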
