ترغب بنشر مسار تعليمي؟ اضغط هنا

Beyond Scaling: Calculable Error Bounds of the Power-of-Two-Choices Mean-Field Model in Heavy-Traffic

110   0   0.0 ( 0 )
 نشر من قبل Fnu Hairi
 تاريخ النشر 2020
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

This paper provides a recipe for deriving calculable approximation errors of mean-field models in heavy-traffic with the focus on the well-known load balancing algorithm -- power-of-two-choices (Po2). The recipe combines Steins method for linearized mean-field models and State Space Concentration (SSC) based on geometric tail bounds. In particular, we divide the state space into two regions, a neighborhood near the mean-field equilibrium and the complement of that. We first use a tail bound to show that the steady-state probability being outside the neighborhood is small. Then, we use a linearized mean-field model and Steins method to characterize the generator difference, which provides the dominant term of the approximation error. From the dominant term, we are able to obtain an asymptotically-tight bound and a nonasymptotic upper bound, both are calculable bounds, not order-wise scaling results like most results in the literature. Finally, we compared the theoretical bounds with numerical evaluations to show the effectiveness of our results. We note that the simulation results show that both bounds are valid even for small size systems such as a system with only ten servers.



قيم البحث

اقرأ أيضاً

We apply the power-of-two-choices paradigm to a random walk on a graph: rather than moving to a uniform random neighbour at each step, a controller is allowed to choose from two independent uniform random neighbours. We prove that this allows the con troller to significantly accelerate the hitting and cover times in several natural graph classes. In particular, we show that the cover time becomes linear in the number $n$ of vertices on discrete tori and bounded degree trees, of order $mathcal{O}(n log log n)$ on bounded degree expanders, and of order $mathcal{O}(n (log log n)^2)$ on the ErdH{o}s-R{e}nyi random graph in a certain sparsely connected regime. We also consider the algorithmic question of computing an optimal strategy, and prove a dichotomy in efficiency between computing strategies for hitting and cover times.
91 - Yige Hong , Weina Wang 2021
Multiserver jobs, which are jobs that occupy multiple servers simultaneously during service, are prevalent in todays computing clusters. But little is known about the delay performance of systems with multiserver jobs. We consider queueing models for multiserver jobs in a scaling regime where the total number of servers in the system becomes large and meanwhile both the system load and the number of servers that a job needs scale with the total number of servers. Prior work has derived upper bounds on the queueing probability in this scaling regime. However, without proper lower bounds, the existing results cannot be used to differentiate between policies. In this paper, we study the delay performance by establishing sharp bounds on the mean waiting time of multiserver jobs, where the waiting time of a job is the time spent in queueing rather than in service. We first consider the commonly used First-Come-First-Serve (FCFS) policy and characterize the exact order of its mean waiting time. We then prove a lower bound on the mean waiting time of all policies, and demonstrate that there is an order gap between this lower bound and the mean waiting time under FCFS. We finally complement the lower bound with an achievability result: we show that under a priority policy that we call P-Priority, the mean waiting time achieves the order of the lower bound. This achievability result implies the tightness of the lower bound, the asymptotic optimality of P-Priority, and the strict suboptimality of FCFS.
105 - Nicolas Gast INRIA 2020
Gossip protocols form the basis of many smart collective adaptive systems. They are a class of fully decentralised, simple but robust protocols for the distribution of information throughout large scale networks with hundreds or thousands of nodes. M ean field analysis methods have made it possible to approximate and analyse performance aspects of such large scale protocols in an efficient way. Taking the gossip shuffle protocol as a benchmark, we evaluate a recently developed refined mean field approach. We illustrate the gain in accuracy this can provide for the analysis of medium size models analysing two key performance measures. We also show that refined mean field analysis requires special attention to correctly capture the coordination aspects of the gossip shuffle protocol.
Dynamic affinity scheduling has been an open problem for nearly three decades. The problem is to dynamically schedule multi-type tasks to multi-skilled servers such that the resulting queueing system is both stable in the capacity region (throughput optimality) and the mean delay of tasks is minimized at high loads near the boundary of the capacity region (heavy-traffic optimality). As for applications, data-intensive analytics like MapReduce, Hadoop, and Dryad fit into this setting, where the set of servers is heterogeneous for different task types, so the pair of task type and server determines the processing rate of the task. The load balancing algorithm used in such frameworks is an example of affinity scheduling which is desired to be both robust and delay optimal at high loads when hot-spots occur. Fluid model planning, the MaxWeight algorithm, and the generalized $cmu$-rule are among the first algorithms proposed for affinity scheduling that have theoretical guarantees on being optimal in different senses, which will be discussed in the related work section. All these algorithms are not practical for use in data center applications because of their non-realistic assumptions. The join-the-shortest-queue-MaxWeight (JSQ-MaxWeight), JSQ-Priority, and weighted-workload algorithms are examples of load balancing policies for systems with two and three levels of data locality with a rack structure. In this work, we propose the Generalized-Balanced-Pandas algorithm (GB-PANDAS) for a system with multiple levels of data locality and prove its throughput optimality. We prove this result under an arbitrary distribution for service times, whereas most previous theoretical work assumes geometric distribution for service times. The extensive simulation results show that the GB-PANDAS algorithm alleviates the mean delay and has a better performance than the JSQ-MaxWeight algorithm by twofold
Clustering is the propensity of nodes that share a common neighbour to be connected. It is ubiquitous in many networks but poses many modelling challenges. Clustering typically manifests itself by a higher than expected frequency of triangles, and th is has led to the principle of constructing networks from such building blocks. This approach has been generalised to networks being constructed from a set of more exotic subgraphs. As long as these are fully connected, it is then possible to derive mean-field models that approximate epidemic dynamics well. However, there are virtually no results for non-fully connected subgraphs. In this paper, we provide a general and automated approach to deriving a set of ordinary differential equations, or mean-field model, that describes, to a high degree of accuracy, the expected values of system-level quantities, such as the prevalence of infection. Our approach offers a previously unattainable degree of control over the arrangement of subgraphs and network characteristics such as classical node degree, variance and clustering. The combination of these features makes it possible to generate families of networks with different subgraph compositions while keeping classical network metrics constant. Using our approach, we show that higher-order structure realised either through the introduction of loops of different sizes or by generating clustered networks based on different subgraphs, leads to significant differences in epidemic dynamics despite controlling for basic network metrics.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا