Making Asynchronous Distributed Computations Robust to Noise

69 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ran Gelles

تاريخ النشر 2017

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Keren Censor-Hillel - Ran Gelles - Bernhard Haeupler

بنى وهياكل البيانات والخوارزميات النظم الموزعة والتوازية والحوسبة العنقودية

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We consider the problem of making distributed computations robust to noise, in particular to worst-case (adversarial) corruptions of messages. We give a general distributed interactive coding scheme which simulates any asynchronous distributed protocol while tolerating an optimal corruption of a $Theta(1/n)$ fraction of all messages while incurring a moderate blowup of $O(nlog^2 n)$ in the communication complexity. Our result is the first fully distributed interactive coding scheme in which the topology of the communication network is not known in advance. Prior work required either a coordinating node to be connected to all other nodes in the network or assumed a synchronous network in which all nodes already know the complete topology of the network.

قيم البحث

162 - Margalit Glasgow , Mary Wootters 2020

We study asynchronous finite sum minimization in a distributed-data setting with a central parameter server. While asynchrony is well understood in parallel settings where the data is accessible by all machines -- e.g., modifications of variance-redu ced gradient algorithms like SAGA work well -- little is known for the distributed-data setting. We develop an algorithm ADSAGA based on SAGA for the distributed-data setting, in which the data is partitioned between many machines. We show that with $m$ machines, under a natural stochastic delay model with an mean delay of $m$, ADSAGA converges in $tilde{O}left(left(n + sqrt{m}kapparight)log(1/epsilon)right)$ iterations, where $n$ is the number of component functions, and $kappa$ is a condition number. This complexity sits squarely between the complexity $tilde{O}left(left(n + kapparight)log(1/epsilon)right)$ of SAGA textit{without delays} and the complexity $tilde{O}left(left(n + mkapparight)log(1/epsilon)right)$ of parallel asynchronous algorithms where the delays are textit{arbitrary} (but bounded by $O(m)$), and the data is accessible by all. Existing asynchronous algorithms with distributed-data setting and arbitrary delays have only been shown to converge in $tilde{O}(n^2kappalog(1/epsilon))$ iterations. We empirically compare on least-squares problems the iteration complexity and wallclock performance of ADSAGA to existing parallel and distributed algorithms, including synchronous minibatch algorithms. Our results demonstrate the wallclock advantage of variance-reduced asynchronous approaches over SGD or synchronous approaches.

التعلم الآلي النظم الموزعة والتوازية والحوسبة العنقودية التعلم الالي

On Synchronous and Asynchronous Interaction in Distributed Systems

81 - Rob van Glabbeek , Ursula Goltz , Jens-Wolfhard Schicke 2008

When considering distributed systems, it is a central issue how to deal with interactions between components. In this paper, we investigate the paradigms of synchronous and asynchronous interaction in the context of distributed systems. We investigat e to what extent or under which conditions synchronous interaction is a valid concept for specification and implementation of such systems. We choose Petri nets as our system model and consider different notions of distribution by associating locations to elements of nets. First, we investigate the concept of simultaneity which is inherent in the semantics of Petri nets when transitions have multiple input places. We assume that tokens may only be taken instantaneously by transitions on the same location. We exhibit a hierarchy of `asynchronous Petri net classes by different assumptions on possible distributions. Alternatively, we assume that the synchronisations specified in a Petri net are crucial system properties. Hence transitions and their preplaces may no longer placed on separate locations. We then answer the question which systems may be implemented in a distributed way without restricting concurrency, assuming that locations are inherently sequential. It turns out that in both settings we find semi-structural properties of Petri nets describing exactly the problematic situations for interactions in distributed systems.

المنطق في علوم الحاسوب النظم الموزعة والتوازية والحوسبة العنقودية

Parallel and distributed asynchronous adaptive stochastic gradient methods

118 - Yangyang Xu , Yibo Xu , Yonggui Yan 2020

Stochastic gradient methods (SGMs) are the predominant approaches to train deep learning models. The adapti

التحسين والتحكم النظم الموزعة والتوازية والحوسبة العنقودية التحليل العددي

Distributed Distance-Bounded Network Design Through Distributed Convex Programming

100 - Michael Dinitz , Yasamin Nazari 2017

Solving linear programs is often a challenging task in distributed settings. While there are good algorithms for solving packing and covering linear programs in a distributed manner (Kuhn et al.~2006), this is essentially the only class of linear pro grams for which such an algorithm is known. In this work we provide a distributed algorithm for solving a different class of convex programs which we call distance-bounded network design convex programs. These can be thought of as relaxations of network design problems in which the connectivity requirement includes a distance constraint (most notably, graph spanners). Our algorithm runs in $O( (D/epsilon) log n)$ rounds in the $mathcal{LOCAL}$ model and finds a $(1+epsilon)$-approximation to the optimal LP solution for any $0 < epsilon leq 1$, where $D$ is the largest distance constraint. While solving linear programs in a distributed setting is interesting in its own right, this class of convex programs is particularly important because solving them is often a crucial step when designing approximation algorithms. Hence we almost immediately obtain new and improved distributed approximation algorithms for a variety of network design problems, including Basic $3$- and $4$-Spanner, Directed $k$-Spanner, Lowest Degree $k$-Spanner, and Shallow-Light Steiner Network Design with a spanning demand graph. Our algorithms do not require any heavy computation and essentially match the best-known centralized approximation algorithms, while previous approaches which do not use heavy computation give approximations which are worse than the best-known centralized bounds.

بنى وهياكل البيانات والخوارزميات النظم الموزعة والتوازية والحوسبة العنقودية التوافقية

G-Tran: Making Distributed Graph Transactions Fast

98 - Hongzhi Chen , Changji Li , Chenguang Zheng 2021

Graph transaction processing raises many unique challenges such as random data access due to the irregularity of graph structures, low throughput and high abort rate due to the relatively large read/write sets in graph transactions. To address these challenges, we present G-Tran -- an RDMA-enabled distributed in-memory graph database with serializable and snapshot isolation support. First, we propose a graph-native data store to achieve good data locality and fast data access for transactional updates and queries. Second, G-Tran adopts a fully decentralized architecture that leverages RDMA to process distributed transactions with the MPP model, which can achieve high performance by utilizing all computing resources. In addition, we propose a new MV-OCC implementation with two optimizations to address the issue of large read/write sets in graph transactions. Extensive experiments show that G-Tran achieves competitive performance compared with other popular graph databases on benchmark workloads.

قواعد البيانات النظم الموزعة والتوازية والحوسبة العنقودية

سجل دخول لتتمكن من نشر تعليقات