Local Deal-Agreement Based Monotonic Distributed Algorithms for Load Balancing in General Graphs

112 0 0.0 ( 0 )

Download Cite

Added by Manish Kumar

Publication date 2020

fields Informatics Engineering

and research's language is English

Authors Yefim Dinitz - Shlomi Dolev - Manish Kumar

Distributed Parallel and Cluster Computing

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In computer networks, participants may cooperate in processing tasks, so that loads are balanced among them. We present local distributed algorithms that (repeatedly) use local imbalance criteria to transfer loads concurrently across the participants of the system, iterating until all loads are balanced. Our algorithms are based on a short local deal-agreement communication of proposal/deal, based on the neighborhood loads. They converge monotonically, always providing a better state as the execution progresses. Besides, our algorithms avoid making loads temporarily negative. Thus, they may be considered anytime ones, in the sense that they can be stopped at any time during the execution. We show that our synchronous load balancing algorithms achieve $epsilon$-Balanced state for the continuous setting and 1-Balanced state for the discrete setting in all graphs, within $O(n D log(n K/epsilon))$ and $O(n D log(n K/D) + n D^2)$ time, respectively, where $n$ is the number of nodes, $K$ is the initial discrepancy, $D$ is the graph diameter, and $epsilon$ is the final discrepancy. Our other monotonic synchronous and asynchronous algorithms for the discrete setting are generalizations of the first presented algorithms, where load balancing is performed concurrently with more than one neighbor. These algorithms arrive at a 1-Balanced state in time $O(n K^2)$ in general graphs, but have a potential to be faster as the loads are balanced among all neighbors, rather than with only one; we describe a scenario that demonstrates the potential for a fast ($O(1)$) convergence. Our asynchronous algorithm avoids the need to wait for the slowest participants activity prior to making the next load balancing steps as synchronous settings restrict. We also introduce a self-stabilizing version of our asynchronous algorithm.

rate research

Efficient Load-Balancing through Distributed Token Dropping

203 - Sebastian Brandt , Barbara Keller , Joel Rybicki 2020

We introduce a new graph problem, the token dropping game, and we show how to solve it efficiently in a distributed setting. We use the token dropping game as a tool to design an efficient distributed algorithm for stable orientations and more generally for locally optimal semi-matchings. The prior work by Czygrinow et al. (DISC 2012) finds a stable orientation in $O(Delta^5)$ rounds in graphs of maximum degree $Delta$, while we improve it to $O(Delta^4)$ and also prove a lower bound of $Omega(Delta)$.

Distributed Parallel and Cluster Computing

Distributed Selfish Load Balancing with Weights and Speeds

546 - C. P. J. Adolphs , P. Berenbrink 2011

In this paper we consider neighborhood load balancing in the context of selfish clients. We assume that a network of n processors and m tasks is given. The processors may have different speeds and the tasks may have different weights. Every task is controlled by a selfish user. The objective of the user is to allocate his/her task to a processor with minimum load. We revisit the concurrent probabilistic protocol introduced in [6], which works in sequential rounds. In each round every task is allowed to query the load of one randomly chosen neighboring processor. If that load is smaller the task will migrate to that processor with a suitably chosen probability. Using techniques from spectral graph theory we obtain upper bounds on the expected convergence time towards approximate and exact Nash equilibria that are significantly better than the previous results in [6]. We show results for uniform tasks on non-uniform processors and the general case where the tasks have different weights and the machines have speeds. To the best of our knowledge, these are the first results for this general setting.

Distributed Parallel and Cluster Computing Data Structures and Algorithms

A Survey on Load Balancing Algorithms for VM Placement in Cloud Computing

145 - Minxian Xu , Wenhong Tian , Rajkumar Buyya 2016

The emergence of cloud computing based on virtualization technologies brings huge opportunities to host virtual resource at low cost without the need of owning any infrastructure. Virtualization technologies enable users to acquire, configure and be charged on pay-per-use basis. However, Cloud data centers mostly comprise heterogeneous commodity servers hosting multiple virtual machines (VMs) with potential various specifications and fluctuating resource usages, which may cause imbalanced resource utilization within servers that may lead to performance degradation and service level agreements (SLAs) violations. To achieve efficient scheduling, these challenges should be addressed and solved by using load balancing strategies, which have been proved to be NP-hard problem. From multiple perspectives, this work identifies the challenges and analyzes existing algorithms for allocating VMs to PMs in infrastructure Clouds, especially focuses on load balancing. A detailed classification targeting load balancing algorithms for VM placement in cloud data centers is investigated and the surveyed algorithms are classified according to the classification. The goal of this paper is to provide a comprehensive and comparative understanding of existing literature and aid researchers by providing an insight for potential future enhancements.

Distributed Parallel and Cluster Computing

Distributed-Memory Load Balancing with Cyclic Token-based Work-Stealing Applied to Reverse Time Migration

97 - Italo A. S. Assis , Ant^onio D. S. Oliveira , Tiago Barros 2019

Reverse time migration (RTM) is a prominent technique in seismic imaging. Its resulting subsurface images are used in the industry to investigate with higher confidence the existence and the conditions of oil and gas reservoirs. Because of its high computational cost, RTM must make use of parallel computers. Balancing the workload distribution of an RTM is a growing challenge in distributed computing systems. The competition for shared resources and the differently-sized tasks of the RTM are some of the possible sources of load imbalance. Although many load balancing techniques exist, scaling up for large problems and large systems remains a challenge because synchronization overhead also scales. This paper proposes a cyclic token-based work-stealing (CTWS) algorithm for distributed memory systems applied to RTM. The novel cyclic token approach reduces the number of failed steals, avoids communication overhead, and simplifies the victim selection and the termination strategy. The proposed method is implemented as a C library using the one-sided communication feature of the message passing interface (MPI) standard. Results obtained by applying the proposed technique to balance the workload of a 3D RTM system present a factor of 14.1% speedup and reductions of the load imbalance of 78.4% when compared to the conventional static distribution.

Distributed Parallel and Cluster Computing

The Power of Both Choices: Practical Load Balancing for Distributed Stream Processing Engines

450 - Muhammad Anis Uddin Nasir , Gianmarco De Francisci Morales , Davidn Garcia-Soriano 2015

We study the problem of load balancing in distributed stream processing engines, which is exacerbated in the presence of skew. We introduce Partial Key Grouping (PKG), a new stream partitioning scheme that adapts the classical power of two choices to a distributed streaming setting by leveraging two novel techniques: key splitting and local load estimation. In so doing, it achieves better load balancing than key grouping while being more scalable than shuffle grouping. We test PKG on several large datasets, both real-world and synthetic. Compared to standard hashing, PKG reduces the load imbalance by up to several orders of magnitude, and often achieves nearly-perfect load balance. This result translates into an improvement of up to 60% in throughput and up to 45% in latency when deployed on a real Storm cluster.

Distributed Parallel and Cluster Computing