Principal component analysis (PCA) is not only a fundamental dimension reduction method but also a widely used network anomaly detection technique. Traditionally, PCA is performed in a centralized manner, which scales poorly for large distributed systems because of the network bandwidth required to gather the distributed state at a fusion center. Consequently, several recent works have proposed distributed PCA algorithms that aim to reduce the communication overhead incurred by PCA without losing its inferential power. This paper evaluates the tradeoff between communication cost and solution quality of two distributed PCA algorithms on a real domain name system (DNS) query dataset from a large network. We also apply both distributed PCA algorithms to network anomaly detection and demonstrate that the two methods suffer little degradation in detection accuracy while achieving significant savings in communication bandwidth.
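To make the PCA-based detection step concrete, the following minimal sketch shows the standard centralized subspace method that the distributed variants approximate: feature vectors are projected onto the top-k principal subspace and samples with a large residual (squared prediction error) are flagged. The feature matrix, the choice of k, and the threshold rule are illustrative assumptions, not the paper's exact procedure.

# Minimal sketch of centralized subspace-based PCA anomaly detection.
import numpy as np

def pca_residual_scores(X, k):
    """X: (n_samples, n_features) traffic/query feature matrix.
    Returns the squared residual norm of each sample after projecting
    onto the top-k principal subspace (the SPE/Q-statistic)."""
    Xc = X - X.mean(axis=0)                      # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:k].T                                 # top-k principal directions
    residual = Xc - Xc @ P @ P.T                 # part not captured by the subspace
    return np.sum(residual**2, axis=1)

# Flag samples whose residual energy exceeds an empirical threshold.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                  # placeholder data
scores = pca_residual_scores(X, k=5)
anomalies = np.where(scores > np.quantile(scores, 0.99))[0]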
We present the design and implementation of a low-latency, low-overhead, and highly available resilient disaggregated cluster memory. Our proposed framework can access erasure-coded remote memory within single-digit μs read/write latency, significantly improving the performance-efficiency tradeoff over the state of the art: it performs similarly to in-memory replication with 1.6x lower memory overhead. We also propose a novel coding group placement algorithm for erasure-coded data that provides load balancing while reducing the probability of data loss under correlated failures by an order of magnitude.
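The memory-overhead comparison can be made explicit with a small back-of-the-envelope sketch; the (k, r) split counts and the single-parity XOR example below are illustrative assumptions rather than the system's actual coding parameters.

# Why erasure coding can match replication's resilience goals at lower memory cost.
k, r = 8, 2                         # each page split into k data + r parity pieces (assumed)
ec_overhead = (k + r) / k           # 1.25x raw memory per byte stored
replication_overhead = 2.0          # one full in-memory copy besides the primary
print(ec_overhead, replication_overhead / ec_overhead)   # 1.25, i.e. 1.6x lower overhead

# Tiny XOR-parity example (r = 1): any single lost piece is recoverable.
from functools import reduce
xor = lambda a, b: bytes(x ^ y for x, y in zip(a, b))
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = reduce(xor, data)
lost = data[1]
survivors = [data[0], data[2], parity]
assert reduce(xor, survivors) == lost            # XOR of survivors rebuilds the lost piece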
A well-known inner bound of the stability region of the slotted Aloha protocol on the collision channel with n users assumes worst-case service rates (all user queues non-empty). Using this inner bound as a feasible set of achievable rates, a characterization of the throughput-fairness tradeoff over this set is obtained, where throughput is defined as the sum of the individual user rates, and two definitions of fairness are considered: the Jain-Chiu-Hawe function and the sum-user alpha-fair (isoelastic) utility function. This characterization is obtained using both an equality constraint and an inequality constraint on the throughput, and properties of the optimal controls, the optimal rates, and the fairness as a function of the target throughput are established. A key fact used in all theorems is the observation that all contention probability vectors that extremize the fairness functions take at most two non-zero values.
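For reference, the quantities involved can be written out directly; the sketch below uses the standard definitions of the worst-case slotted-Aloha service rates and of the two fairness functions, with an illustrative contention probability vector rather than one taken from the paper.

import math

def service_rates(p):
    """Worst-case (all queues non-empty) rate of user i on the collision
    channel: r_i = p_i * prod_{j != i} (1 - p_j)."""
    return [pi * math.prod(1 - pj for j, pj in enumerate(p) if j != i)
            for i, pi in enumerate(p)]

def jain_chiu_hawe(r):
    """Jain-Chiu-Hawe fairness index: (sum r_i)^2 / (n * sum r_i^2), in (0, 1]."""
    return sum(r) ** 2 / (len(r) * sum(x * x for x in r))

def alpha_fair_utility(r, alpha):
    """Sum-user alpha-fair (isoelastic) utility; alpha = 1 gives the sum of logs."""
    if alpha == 1:
        return sum(math.log(x) for x in r)
    return sum(x ** (1 - alpha) / (1 - alpha) for x in r)

p = [0.5, 0.25, 0.25]      # contention probabilities with two distinct non-zero values
r = service_rates(p)
print(sum(r), jain_chiu_hawe(r), alpha_fair_utility(r, alpha=2))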
Distributed digital infrastructures for computation and analytics are now evolving towards an interconnected ecosystem allowing complex applications to be executed from IoT Edge devices to the HPC Cloud (aka the Computing Continuum, the Digital Continuum, or the Transcontinuum). Understanding end-to-end performance in such a complex continuum is challenging. It boils down to reconciling many, typically conflicting, application requirements and constraints with low-level infrastructure design choices. One important challenge is to accurately reproduce relevant behaviors of a given application workflow and representative settings of the physical infrastructure underlying this complex continuum. We introduce a rigorous methodology for such a process and validate it through E2Clab. It is the first platform to support the complete experimental cycle across the Computing Continuum: deployment, analysis, optimization. Preliminary results with real-life use cases show that E2Clab allows one to understand and improve performance by correlating it with the parameter settings, the resource usage, and the specifics of the underlying infrastructure.
Randomized algorithms provide solutions to two ubiquitous problems: (1) the distributed calculation of a principal component analysis or singular value decomposition of a highly rectangular matrix, and (2) the distributed calculation of a low-rank approximation (in the form of a singular value decomposition) to an arbitrary matrix. Carefully honed algorithms yield results that are uniformly superior to those of the stock, deterministic implementations in Spark (the popular platform for distributed computation); in particular, whereas the stock software can, without warning, return left singular vectors that are far from numerically orthonormal, a carefully tuned randomized implementation generates left singular vectors that are numerically orthonormal to nearly machine precision.
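The randomized approach can be summarized by the standard range-finder recipe; the single-machine numpy sketch below illustrates that recipe only and is not the distributed Spark implementation the abstract refers to. The rank, oversampling, and power-iteration counts are illustrative choices.

import numpy as np

def randomized_svd(A, k, p=10, n_iter=2):
    """Low-rank SVD via a randomized range finder plus projection."""
    m, n = A.shape
    Omega = np.random.default_rng(0).normal(size=(n, k + p))
    Y = A @ Omega                          # sample the range of A
    for _ in range(n_iter):                # power iterations sharpen the basis
        Y = A @ (A.T @ Y)
    Q, _ = np.linalg.qr(Y)                 # orthonormal basis for the sampled range
    B = Q.T @ A                            # small (k+p) x n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub                             # lift back to the original row space
    return U[:, :k], s[:k], Vt[:k]

A = np.random.default_rng(1).normal(size=(100000, 50))    # highly rectangular matrix
U, s, Vt = randomized_svd(A, k=10)
print(np.linalg.norm(U.T @ U - np.eye(10)))                # check numerical orthonormality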
The interconnect is one of the most critical components in large scale computing systems, and its impact on the performance of applications will only increase with system size. In this paper, we describe Slingshot, an interconnection network for large scale computing systems. Slingshot is based on high-radix switches, which allow building exascale and hyperscale datacenter networks with at most three switch-to-switch hops. Moreover, Slingshot provides efficient adaptive routing and congestion control algorithms, and highly tunable traffic classes. Slingshot uses an optimized Ethernet protocol, which allows it to be interoperable with standard Ethernet devices while providing high performance to HPC applications. We analyze the extent to which Slingshot provides these features, evaluating it on microbenchmarks and on several applications from the datacenter and AI worlds, as well as on HPC applications. We find that applications running on Slingshot are less affected by congestion compared to previous generation networks.
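The "at most three switch-to-switch hops" property follows from the topology's sizing rules; the sketch below works through the standard balanced-dragonfly arithmetic for an assumed 64-port switch radix as an illustration, not as Slingshot's exact configuration.

# Rough sizing arithmetic for a balanced dragonfly built from high-radix switches.
def dragonfly_size(radix):
    p = radix // 4               # endpoints per switch
    h = radix // 4               # global (inter-group) links per switch
    a = radix // 2               # switches per group (all-to-all inside a group)
    groups = a * h + 1           # each group has a direct link to every other group
    endpoints = p * a * groups
    return a, groups, endpoints

a, groups, endpoints = dragonfly_size(64)
print(a, groups, endpoints)      # 32 switches/group, 513 groups, ~262k endpoints
# Worst-case path: local hop (source group) + global hop + local hop (destination
# group) = 3 switch-to-switch hops, independent of system size.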