No Arabic abstract
We study a many-server queueing model with server vacations, where the population size dynamics of servers and customers are coupled: a server may leave for vacation only when no customers await, and the capacity available to customers is directly affected by the number of servers on vacation. We focus on scaling regimes in which server dynamics and queue dynamics fluctuate at matching time scales, so that their limiting dynamics are coupled. Specifically, we argue that interesting coupled dynamics occur in (a) the Halfin-Whitt regime, (b) the nondegenerate slowdown regime, and (c) the intermediate, near Halfin-Whitt regime; whereas the dynamics asymptotically decouple in the other heavy traffic regimes. We characterize the limiting dynamics, which are different for each scaling regime. We consider relevant respective performance measures for regimes (a) and (b) --- namely, the probability of wait and the slowdown. While closed form formulas for these performance measures have been derived for models that do not accommodate server vacations, it is difficult to obtain closed form formulas for these performance measures in the setting with server vacations. Instead, we propose formulas that approximate these performance measures, and depend on the steady-state mean number of available servers and previously derived formulas for models without server vacations. We test the accuracy of these formulas numerically.
In this note, we apply Steins method to analyze the performance of general load balancing schemes in the many-server heavy-traffic regime. In particular, consider a load balancing system of $N$ servers and the distance of arrival rate to the capacity region is given by $N^{1-alpha}$ with $alpha > 1$. We are interested in the performance as $N$ goes to infinity under a large class of policies. We establish different asymptotics under different scalings and conditions. Specifically, (i) If the second moments linearly increase with $N$ with coefficients $sigma_a^2$ and $ u_s^2$, then for any $alpha > 4$, the distribution of the sum queue length scaled by $N^{-alpha}$ converges to an exponential random variable with mean $frac{sigma_a^2 + u_s^2}{2}$. (3) If the second moments quadratically increase with $N$ with coefficients $tilde{sigma}_a^2$ and $tilde{ u}_s^2$, then for any $alpha > 3$, the distribution of the sum queue length scaled by $N^{-alpha-1}$ converges to an exponential random variable with mean $frac{tilde{sigma}_a^2 + tilde{ u}_s^2}{2}$. Both results are simple applications of our previously developed framework of Steins method for heavy-traffic analysis in cite{zhou2020note}.
A scheduled arrival process is one in which the n th arrival is scheduled for time n, but instead occurs at a different time. The difference between the scheduled time and the arrival time is called the perturbation. The sequence of perturbations is assumed to be iid. We describe here the behavior of a single server queue fed by such traffic in which the processing times are deterministic. A particular focus is on perturbation with Pareto-like tails but with finite mean. We obtain tail approximations for the steady-state workload in both cases where the queue is critically loaded and under a heavy-traffic regime. A key to our approach is our analysis of the tail behavior of a sum of independent Bernoulli random variables with success probability following a power law with parameter strictly larger than 1.
This paper presents a heavy traffic analysis of the behavior of multi-class acyclic queueing networks in which the customers have deadlines. We assume the queueing system consists of J stations, and there are K different customer classes. Customers from each class arrive to the network according to independent renewal processes. The customers from each class are assigned a random deadline drawn from a deadline distribution associated with that class and they move from station to station according to a fixed acyclic route. The customers at a given node are processed according to the earliest-deadline-first (EDF) queue discipline. At any time, the customers of each type at each node have a lead time, the time until their deadline lapses. We model these lead times as a random counting measure on the real line. Under heavy traffic conditions and suitable scaling, it is proved that the measure-valued lead-time process converges to a deterministic function of the workload process.
The focus of this paper is on the asymptotics of large-time numbers of customers in time-periodic Markovian many-server queues with customer abandonment in heavy traffic. Limit theorems are obtained for the periodic number-of-customers processes under the fluid and diffusion scalings. Other results concern limits for general time-dependent queues and for time-homogeneous queues in steady state.
This paper presents a heavy-traffic analysis of the behavior of a single-server queue under an Earliest-Deadline-First (EDF) scheduling policy in which customers have deadlines and are served only until their deadlines elapse. The performance of the system is measured by the fraction of reneged work (the residual work lost due to elapsed deadlines) which is shown to be minimized by the EDF policy. The evolution of the lead time distribution of customers in queue is described by a measure-valued process. The heavy traffic limit of this (properly scaled) process is shown to be a deterministic function of the limit of the scaled workload process which, in turn, is identified to be a doubly reflected Brownian motion. This paper complements previous work by Doytchinov, Lehoczky and Shreve on the EDF discipline in which customers are served to completion even after their deadlines elapse. The fraction of reneged work in a heavily loaded system and the fraction of late work in the corresponding system without reneging are compared using explicit formulas based on the heavy traffic approximations. The formulas are validated by simulation results.