
Disaggregating and Consolidating Network Functionalities with SuperNIC

Published by Yizhou Shan
Publication date: 2021
Research field: Informatics Engineering
Research language: English





Resource disaggregation has gained huge popularity in recent years. Existing works demonstrate how to disaggregate compute, memory, and storage resources. We, for the first time, demonstrate how to disaggregate network resources by proposing a new distributed hardware framework called SuperNIC. Each SuperNIC connects a small set of endpoints and consolidates network functionalities for these endpoints. We prototyped SuperNIC on an FPGA and demonstrate its performance and cost benefits with real network functions and customized disaggregated applications.


Read also

The Kepler mission results indicate that systems of tightly-packed inner planets (STIPs) are present around roughly 5% of FGK field stars (whose median age is ~5 Gyr). We propose that STIPs initially surrounded nearly all such stars and those observed are the final survivors of a process in which long-term metastability eventually ceases and the systems proceed to collisional consolidation or destruction, losing roughly equal fractions of systems every decade in time. In this context, we also propose that our Solar System initially contained additional large planets interior to the current orbit of Venus, which survived in a metastable dynamical configuration for 1-10% of the Solar System's age. Long-term gravitational perturbations caused the system's orbits to cross, leading to a cataclysmic event which left Mercury as the sole surviving relic.
Network-distributed optimization has attracted significant attention in recent years due to its ever-increasing applications. However, the classic decentralized gradient descent (DGD) algorithm is communication-inefficient for large-scale and high-dimensional network-distributed optimization problems. To address this challenge, many compressed DGD-based algorithms have been proposed. However, most of the existing works have high complexity and assume compressors with bounded noise power. To overcome these limitations, in this paper, we propose a new differential-coded compressed DGD (DC-DGD) algorithm. The key features of DC-DGD include: i) DC-DGD works with general SNR-constrained compressors, relaxing the bounded noise power assumption; ii) the differential-coded design entails the same convergence rate as the original DGD algorithm; and iii) DC-DGD has the same low-complexity structure as the original DGD due to a "self-noise-reduction effect". Moreover, the above features inspire us to develop a hybrid compression scheme that offers a systematic mechanism to minimize the communication cost. Finally, we conduct extensive experiments to verify the efficacy of the proposed DC-DGD and hybrid compressor.
Internet supercomputing is an approach to solving partitionable, computation-intensive problems by harnessing the power of a vast number of interconnected computers. This paper presents a new algorithm for the problem of using network supercomputing to perform a large collection of independent tasks, while dealing with undependable processors. The adversary may cause the processors to return bogus results for tasks with certain probabilities, and may cause a subset $F$ of the initial set of processors $P$ to crash. The adversary is constrained in two ways. First, for the set of non-crashed processors $P-F$, the average probability of a processor returning a bogus result is less than $\frac{1}{2}$. Second, the adversary may crash a subset of processors $F$, provided the size of $P-F$ is bounded from below. We consider two models: the first bounds the size of $P-F$ by a fractional polynomial, the second bounds this size by a poly-logarithm. Both models yield adversaries that are much stronger than previously studied. Our randomized synchronous algorithm is formulated for $n$ processors and $t$ tasks, with $n \le t$, where depending on the number of crashes each live processor is able to terminate dynamically with the knowledge that the problem is solved with high probability. For the adversary constrained by a fractional polynomial, the round complexity of the algorithm is $O(\frac{t}{n^{\varepsilon}} \log n \log\log n)$, its work is $O(t \log n \log\log n)$ and message complexity is $O(n \log n \log\log n)$. For the poly-log constrained adversary, the round complexity is $O(t)$, work is $O(t n^{\varepsilon})$, and message complexity is $O(n^{1+\varepsilon})$. All bounds are shown to hold with high probability.
Feedback amplification is a key technique for synthesizing various important functionalities, especially in electronic circuits involving op-amps. This paper presents a quantum version of this methodology, where the general phase-preserving quantum amplifier and coherent (i.e., measurement-free) feedback are employed to construct various types of systems having useful functionalities: quant
Jing Pan, Wendao Liu, Jing Zhou (2020)
The freedom of fast iterations of distributed deep learning tasks is crucial for smaller companies to gain competitive advantages and market shares from big tech giants. HorovodRunner brings this process to relatively accessible Spark clusters. There have been, however, no benchmark tests on HorovodRunner per se, nor specifically on graph convolutional networks (GCN, hereafter), and very limited scalability benchmark tests on Horovod, the predecessor requiring custom-built GPU clusters. For the first time, we show that Databricks HorovodRunner achieves a significant lift in scaling efficiency for convolutional neural network (CNN, hereafter) based tasks on both GPU and CPU clusters, but not for the original GCN task. We also implemented the Rectified Adam optimizer for the first time in HorovodRunner.