No Arabic abstract
Dynamic affinity load balancing of multi-type tasks on multi-skilled servers, when the service rate of each task type on each of the servers is known and can possibly be different from each other, is an open problem for over three decades. The goal is to do task assignment on servers in a real time manner so that the system becomes stable, which means that the queue lengths do not diverge to infinity in steady state (throughput optimality), and the mean task completion time is minimized (delay optimality). The fluid model planning, Max-Weight, and c-$mu$-rule algorithms have theoretical guarantees on optimality in some aspects for the affinity problem, but they consider a complicated queueing structure and either require the task arrival rates, the service rates of tasks on servers, or both. In many cases that are discussed in the introduction section, both task arrival rates and service rates of different task types on different servers are unknown. In this work, the Blind GB-PANDAS algorithm is proposed which is completely blind to task arrival rates and service rates. Blind GB-PANDAS uses an exploration-exploitation approach for load balancing. We prove that Blind GB-PANDAS is throughput optimal under arbitrary and unknown distributions for service times of different task types on different servers and unknown task arrival rates. Blind GB-PANDAS desires to route an incoming task to the server with the minimum weighted-workload, but since the service rates are unknown, such routing of incoming tasks is not guaranteed which makes the throughput optimality analysis more complicated than the case where service rates are known. Our extensive experimental results reveal that Blind GB-PANDAS significantly outperforms existing methods in terms of mean task completion time at high loads.
Dynamic affinity scheduling has been an open problem for nearly three decades. The problem is to dynamically schedule multi-type tasks to multi-skilled servers such that the resulting queueing system is both stable in the capacity region (throughput optimality) and the mean delay of tasks is minimized at high loads near the boundary of the capacity region (heavy-traffic optimality). As for applications, data-intensive analytics like MapReduce, Hadoop, and Dryad fit into this setting, where the set of servers is heterogeneous for different task types, so the pair of task type and server determines the processing rate of the task. The load balancing algorithm used in such frameworks is an example of affinity scheduling which is desired to be both robust and delay optimal at high loads when hot-spots occur. Fluid model planning, the MaxWeight algorithm, and the generalized $cmu$-rule are among the first algorithms proposed for affinity scheduling that have theoretical guarantees on being optimal in different senses, which will be discussed in the related work section. All these algorithms are not practical for use in data center applications because of their non-realistic assumptions. The join-the-shortest-queue-MaxWeight (JSQ-MaxWeight), JSQ-Priority, and weighted-workload algorithms are examples of load balancing policies for systems with two and three levels of data locality with a rack structure. In this work, we propose the Generalized-Balanced-Pandas algorithm (GB-PANDAS) for a system with multiple levels of data locality and prove its throughput optimality. We prove this result under an arbitrary distribution for service times, whereas most previous theoretical work assumes geometric distribution for service times. The extensive simulation results show that the GB-PANDAS algorithm alleviates the mean delay and has a better performance than the JSQ-MaxWeight algorithm by twofold
We consider the load balancing problem in large-scale heterogeneous systems with multiple dispatchers. We introduce a general framework called Local-Estimation-Driven (LED). Under this framework, each dispatcher keeps local (possibly outdated) estimates of queue lengths for all the servers, and the dispatching decision is made purely based on these local estimates. The local estimates are updated via infrequent communications between dispatchers and servers. We derive sufficient conditions for LED policies to achieve throughput optimality and delay optimality in heavy-traffic, respectively. These conditions directly imply delay optimality for many previous local-memory based policies in heavy traffic. Moreover, the results enable us to design new delay optimal policies for heterogeneous systems with multiple dispatchers. Finally, the heavy-traffic delay optimality of the LED framework directly resolves a recent open problem on how to design optimal load balancing schemes using delayed information.
The recently created IETF 6TiSCH working group combines the high reliability and low-energy consumption of IEEE 802.15.4e Time Slotted Channel Hopping with IPv6 for industrial Internet of Things. We propose a distributed link scheduling algorithm, called Local Voting, for 6TiSCH networks that adapts the schedule to the network conditions. The algorithm tries to equalize the link load (defined as the ratio of the queue length over the number of allocated cells) through cell reallocation. Local Voting calculates the number of cells to be added or released by the 6TiSCH Operation Sublayer (6top). Compared to a representative algorithm from the literature, Local Voting provides simultaneously high reliability and low end-to-end latency while consuming significantly less energy. Its performance has been examined and compared to On-the-fly algorithm in 6TiSCH simulator by modeling an industrial environment with 50 sensors.
We present the Signal Detection using Random-Forest Algorithm (SIDRA). SIDRA is a detection and classification algorithm based on the Machine Learning technique (Random Forest). The goal of this paper is to show the power of SIDRA for quick and accurate signal detection and classification. We first diagnose the power of the method with simulated light curves and try it on a subset of the Kepler space mission catalogue. We use five classes of simulated light curves (CONSTANT, TRANSIT, VARIABLE, MLENS and EB for constant light curves, transiting exoplanet, variable, microlensing events and eclipsing binaries, respectively) to analyse the power of the method. The algorithm uses four features in order to classify the light curves. The training sample contains 5000 light curves (1000 from each class) and 50000 random light curves for testing. The total SIDRA success ratio is $geq 90%$. Furthermore, the success ratio reaches 95 - 100$%$ for the CONSTANT, VARIABLE, EB, and MLENS classes and 92$%$ for the TRANSIT class with a decision probability of 60$%$. Because the TRANSIT class is the one which fails the most, we run a simultaneous fit using SIDRA and a Box Least Square (BLS) based algorithm for searching for transiting exoplanets. As a result, our algorithm detects 7.5$%$ more planets than a classic BLS algorithm, with better results for lower signal-to-noise light curves. SIDRA succeeds to catch 98$%$ of the planet candidates in the Kepler sample and fails for 7$%$ of the false alarms subset. SIDRA promises to be useful for developing a detection algorithm and/or classifier for large photometric surveys such as TESS and PLATO exoplanet future space missions.
This paper first presents a parallel solution for the Flowshop Scheduling Problem in parallel environment, and then proposes a novel load balancing strategy. The proposed Proportional Fairness Strategy (PFS) takes computational performance of computing process sets into account, and assigns additional load to computing nodes proportionally to their evaluated performance. In order to efficiently utilize the power of parallel resource, we also discuss the data structure used in communications among computational nodes and design an optimized data transfer strategy. This data transfer strategy combined with the proposed load balancing strategy have been implemented and tested on a super computer consisted of 86 CPUs using MPI as the middleware. The results show that the proposed PFS achieves better performance in terms of computing time than the existing Adaptive Contracting Within Neighborhood Strategy. We also show that the combination of both the Proportional Fairness Strategy and the proposed data transferring strategy achieves additional 13~15% improvement in efficiency of parallelism.