AutoFlow: Hotspot-Aware, Dynamic Load Balancing for Distributed Stream Processing

Published by Liang Yuan

Publication date: 2021

Research field: Electronic Engineering, Informatics Engineering

Paper language: English

Authors: Pengqi Lu, Liang Yuan, Yunquan Zhang

Systems and Control

Stream applications are widely deployed on the cloud. While modern distributed streaming systems like Flink and Spark Streaming can schedule and execute them efficiently, streaming dataflows often change dynamically, which may cause computation imbalance and backpressure. We introduce AutoFlow, an automatic, hotspot-aware dynamic load-balancing system for streaming dataflows. It incorporates a centralized scheduler that dynamically monitors the load balance across the entire dataflow and performs state migrations accordingly. The scheduler achieves these two tasks using a simple asynchronous distributed control-message mechanism and a hotspot-diminishing algorithm. The timing mechanism supports implicit barriers and highly efficient state migration without global barriers or pauses to operators. It also supports time-window-based load-balance measurements and feeds them to the hotspot-diminishing algorithm without user intervention. We implemented AutoFlow on top of Ray, an actor-based distributed execution framework. Our evaluation on various streaming benchmark datasets shows that AutoFlow achieves good load balance and incurs low latency overhead on highly data-skewed workloads.
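To make the idea of a time-window-based load measurement concrete, here is a minimal sketch in Python. It is not AutoFlow's implementation: the class name `WindowedLoadMonitor`, the max-over-mean imbalance ratio, and the hotspot threshold are all assumptions chosen for illustration, and the sketch assumes timestamps arrive in non-decreasing order.

```python
from collections import deque


class WindowedLoadMonitor:
    """Hypothetical sliding-window load tracker (illustrative only)."""

    def __init__(self, window_seconds: float) -> None:
        self.window = window_seconds
        self.events = deque()  # (timestamp, instance_id), oldest first
        self.counts = {}       # instance_id -> records seen inside the window

    def record(self, timestamp: float, instance_id: str) -> None:
        """Register one processed record and drop records that left the window."""
        self.events.append((timestamp, instance_id))
        self.counts[instance_id] = self.counts.get(instance_id, 0) + 1
        cutoff = timestamp - self.window
        while self.events and self.events[0][0] <= cutoff:
            _, old_instance = self.events.popleft()
            self.counts[old_instance] -= 1

    def imbalance(self, instances: list) -> float:
        """Ratio of the busiest instance's load to the mean load (1.0 = balanced)."""
        loads = [self.counts.get(i, 0) for i in instances]
        mean = sum(loads) / len(loads)
        return max(loads) / mean if mean else 1.0

    def hotspots(self, instances: list, threshold: float = 1.5) -> list:
        """Instances whose windowed load exceeds threshold times the mean load."""
        loads = {i: self.counts.get(i, 0) for i in instances}
        mean = sum(loads.values()) / len(loads)
        return [i for i, load in loads.items() if mean and load > threshold * mean]


if __name__ == "__main__":
    monitor = WindowedLoadMonitor(window_seconds=10)
    for ts, inst in [(0, "a"), (1, "a"), (2, "b"), (3, "a")]:
        monitor.record(ts, inst)
    print(monitor.imbalance(["a", "b"]), monitor.hotspots(["a", "b"], threshold=1.2))
```

In a design like this, a scheduler could poll `hotspots` periodically and use the result to decide which operator instances are candidates for state migration; how AutoFlow actually makes that decision is described in the paper itself, not here.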

Muhammad Anis Uddin Nasir, Gianmarco De Francisci Morales, David Garcia-Soriano (2015)

We study the problem of load balancing in distributed stream processing engines, which is exacerbated in the presence of skew. We introduce Partial Key Grouping (PKG), a new stream partitioning scheme that adapts the classical power of two choices to a distributed streaming setting by leveraging two novel techniques: key splitting and local load estimation. In so doing, it achieves better load balancing than key grouping while being more scalable than shuffle grouping. We test PKG on several large datasets, both real-world and synthetic. Compared to standard hashing, PKG reduces the load imbalance by up to several orders of magnitude, and often achieves nearly-perfect load balance. This result translates into an improvement of up to 60% in throughput and up to 45% in latency when deployed on a real Storm cluster.
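The two techniques named in this abstract, key splitting and local load estimation, can be sketched briefly in Python. This is a simplified illustration under stated assumptions, not the authors' code: the names `bucket` and `PartialKeyGrouper` are hypothetical, salted MD5 stands in for two independent hash functions, and a single sender's own message counts serve as its local load estimate.

```python
import hashlib
from collections import Counter


def bucket(key: str, seed: int, n_workers: int) -> int:
    """Deterministically map a key to a worker index, salted by seed."""
    digest = hashlib.md5(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_workers


class PartialKeyGrouper:
    """One sender's view: each key may go to either of two candidate workers."""

    def __init__(self, n_workers: int) -> None:
        self.n_workers = n_workers
        # Local load estimate: messages this sender has routed to each worker.
        self.local_load = [0] * n_workers

    def candidates(self, key: str) -> tuple[int, int]:
        """The two workers a key is allowed to use (key splitting)."""
        return bucket(key, 1, self.n_workers), bucket(key, 2, self.n_workers)

    def route(self, key: str) -> int:
        """Send the message to whichever candidate looks less loaded locally."""
        first, second = self.candidates(key)
        if self.local_load[first] <= self.local_load[second]:
            target = first
        else:
            target = second
        self.local_load[target] += 1
        return target


if __name__ == "__main__":
    grouper = PartialKeyGrouper(n_workers=8)
    stream = ["hot"] * 6 + ["a", "b", "c"]
    print(Counter(grouper.route(k) for k in stream))
```

Because a key's messages may land on two workers, any downstream per-key aggregate has to merge two partial results; that is the price paid for letting a hot key spread its load instead of pinning it to a single worker as plain key grouping does.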

Distributed, Parallel, and Cluster Computing

Evaluation of Load Prediction Techniques for Distributed Stream Processing

Kordian Gontarska, Morgan Geldenhuys, Dominik Scheinert (2021)

Distributed Stream Processing (DSP) systems enable processing large streams of continuous data to produce results in near real time. They are an essential part of many data-intensive applications and analytics platforms. The rate at which events arrive at DSP systems can vary considerably over time, which may be due to trends and to cyclic and seasonal patterns within the data streams. A priori knowledge of incoming workloads enables proactive approaches to resource management and optimization tasks such as dynamic scaling, live migration of resources, and the tuning of configuration parameters at runtime, thus leading to a potentially better Quality of Service. In this paper we conduct a comprehensive evaluation of different load prediction techniques for DSP jobs. We identify three use cases and formulate requirements for making load predictions specific to DSP jobs. Automatically optimized classical and Deep Learning methods are evaluated on nine different datasets from typical DSP domains, i.e., the IoT, Web 2.0, and cluster monitoring. We compare model performance with respect to overall accuracy and training duration. Our results show that the Deep Learning methods provide the most accurate load predictions for the majority of the evaluated datasets.

Distributed, Parallel, and Cluster Computing; Artificial Intelligence

Distributed Optimal Generation and Load-Side Control for Frequency Regulation in Power Systems

Luwei Yang, Tao Liu, Zhiyuan Tang (2020)

In order to deal with issues caused by the increasing penetration of renewable resources in power systems, this paper proposes a novel distributed frequency control algorithm for each generating unit and controllable load in a transmission network to replace the conventional automatic generation control (AGC). The targets of the proposed control algorithm are twofold. First, it is to restore the nominal frequency and scheduled net inter-area power exchanges after an active power mismatch between generation and demand. Second, it is to optimally coordinate the active powers of all controllable units in a distributed manner. The designed controller only relies on local information, computation, and peer-to-peer communication between cyber-connected buses, and it is also robust against uncertain system parameters. Asymptotic stability of the closed-loop system under the designed algorithm is analysed by using a nonlinear structure-preserving model including the first-order turbine-governor dynamics. Finally, case studies validate the effectiveness of the proposed method.

Systems and Control

Privacy-Aware Load Ensemble Control: A Linearly-Solvable MDP Approach

Ali Hassan, Deepjyoti Deka, Yury Dvorkin (2021)

Demand response (DR) programs engage distributed demand-side resources, e.g., controllable residential and commercial loads, in providing ancillary services for electric power systems. Ensembles of these resources can help reduce system load peaks and meet operational limits by adjusting their electric power consumption. To equip utilities or load aggregators with adequate decision-support tools for ensemble dispatch, we develop a Markov Decision Process (MDP) approach to optimally control load ensembles in a privacy-preserving manner. To this end, the concept of differential privacy is internalized into the MDP routine to protect transition probabilities and, thus, the privacy of DR participants. The proposed approach also provides a trade-off between solution optimality and privacy guarantees, and is analyzed using real-world data from DR events in the New York University microgrid in New York, NY.

Systems and Control

Efficient Load-Balancing through Distributed Token Dropping

Sebastian Brandt, Barbara Keller, Joel Rybicki (2020)

We introduce a new graph problem, the token dropping game, and we show how to solve it efficiently in a distributed setting. We use the token dropping game as a tool to design an efficient distributed algorithm for stable orientations and more generally for locally optimal semi-matchings. The prior work by Czygrinow et al. (DISC 2012) finds a stable orientation in $O(\Delta^5)$ rounds in graphs of maximum degree $\Delta$, while we improve it to $O(\Delta^4)$ and also prove a lower bound of $\Omega(\Delta)$.

Distributed, Parallel, and Cluster Computing
