Performance Modeling and Analysis of a Hyperledger-based System Using GSPN

Added by Pu Yuan

Publication date: 2020

Field: Informatics Engineering

Language: English

Authors: Pu Yuan, Kan Zheng, Xiong Xiong

Distributed Parallel and Cluster Computing Performance

As a highly scalable permissioned blockchain platform, Hyperledger Fabric supports a wide range of industry use cases, from governance to finance. In this paper, we propose a model to analyze the performance of a Hyperledger-based system using Generalised Stochastic Petri Nets (GSPN). The model decomposes a transaction flow into multiple phases and provides a simulation-based approach to obtain the system latency and throughput for a specific arrival rate. Based on this model, we analyze the impact of different ordering service configurations on system performance in order to identify the bottleneck. Moreover, a mathematical configuration selection approach is proposed to determine the configuration that maximizes the system throughput. Finally, extensive experiments are performed on a running system to validate the proposed model and approaches.
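The idea of splitting a transaction flow into phases and simulating latency and throughput at a given arrival rate can be illustrated with a much simpler stand-in than a GSPN. The sketch below is not the paper's model: it assumes Poisson arrivals and a chain of single-server FIFO stages with exponential service times, and the function name `simulate_pipeline`, the three stages, and all rates are hypothetical values chosen for illustration.

```python
import random


def simulate_pipeline(arrival_rate, stage_rates, n_tx=20000, seed=0):
    """Simulate transactions passing through FIFO stages in series.

    Assumptions (not taken from the paper): Poisson arrivals, one server
    per stage, exponential service times, unbounded queues.
    Returns (mean_latency, throughput).
    """
    rng = random.Random(seed)
    arrival = 0.0
    # Time at which each stage finishes the previous transaction.
    stage_free_at = [0.0] * len(stage_rates)
    total_latency = 0.0
    last_departure = 0.0

    for _ in range(n_tx):
        arrival += rng.expovariate(arrival_rate)
        ready = arrival
        for i, rate in enumerate(stage_rates):
            # A transaction starts a stage once it has arrived there
            # and the stage has finished the previous transaction.
            start = max(ready, stage_free_at[i])
            ready = start + rng.expovariate(rate)
            stage_free_at[i] = ready
        total_latency += ready - arrival
        last_departure = ready

    return total_latency / n_tx, n_tx / last_departure


if __name__ == "__main__":
    # Hypothetical service rates (transactions per second) for three phases.
    rates = [200.0, 100.0, 150.0]
    for lam in (50.0, 90.0, 500.0):
        latency, throughput = simulate_pipeline(lam, rates)
        print(f"arrival={lam:6.1f}  latency={latency:.4f}s  throughput={throughput:.1f}/s")
```

Under these assumptions, throughput tracks the arrival rate while the system is stable and saturates near the rate of the slowest stage once arrivals exceed it, which is the kind of bottleneck behaviour the abstract refers to.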

Preparing for Performance Analysis at Exascale

Jonathon Anderson, Yumeng Liu, John Mellor-Crummey (2021)

Performance tools for forthcoming heterogeneous exascale platforms must address two principal challenges when analyzing execution measurements. First, measurement of extreme-scale executions generates large volumes of performance data. Second, performance metrics for heterogeneous applications are significantly sparse across code regions. To address these challenges, we developed a novel streaming aggregation approach to post-mortem analysis that employs both shared and distributed memory parallelism to aggregate sparse performance measurements from every rank, thread and GPU stream of a large-scale application execution. Analysis results are stored in a pair of sparse formats designed for efficient access to related data elements, supporting responsive interactive presentation and scalable data analytics. Empirical analysis shows that our implementation of this approach in HPCToolkit effectively processes measurement data from thousands of threads using a fraction of the compute resources employed by the application itself. Our approach is able to perform analysis up to 9.4 times faster and store analysis results 23 times smaller than HPCToolkit, providing a key building block for scalable exascale performance tools.

Distributed Parallel and Cluster Computing Performance

Multi-GPU Performance Optimization of a CFD Code using OpenACC on Different Platforms

Weicheng Xue, Christopher J. Roy (2020)

This paper investigates the multi-GPU performance of a 3D buoyancy-driven cavity solver using MPI and OpenACC directives on different platforms. The paper shows that decomposing the total problem in different dimensions significantly affects the strong scaling performance on the GPU. Without proper performance optimizations, 1D domain decomposition is shown to scale poorly on multiple GPUs due to noncontiguous memory access. Whichever decomposition is used, performance benefits from the series of optimizations presented in the paper. Since the buoyancy-driven cavity code is latency-bound on the clusters examined, a series of optimizations, both platform-agnostic and platform-specific, are designed to reduce the latency cost and improve memory throughput between hosts and devices. First, the parallel message packing/unpacking strategy developed for noncontiguous data movement between hosts and devices improves the overall performance by about a factor of 2. Second, transferring different data based on the stencil sizes for different variables further reduces the communication overhead. These two optimizations are general enough to benefit stencil computations having ghost changes on all of the clusters tested. Third, GPUDirect is used to improve communication on clusters that have the hardware and software support for direct communication between GPUs without staging through CPU memory. Finally, overlapping communication and computation is shown not to be efficient on multiple GPUs when using only MPI or MPI+OpenACC. Although we believe our implementation has revealed enough overlap, the actual runs do not utilize the overlap well due to a lack of asynchronous progression.

Distributed Parallel and Cluster Computing Performance

Performance Analysis of SPAD-based OFDM

Yichen Li, Majid Safari, Robert Henderson (2019)

In this paper, an analytical approach for the nonlinear distorted bit error rate performance of optical orthogonal frequency division multiplexing (O-OFDM) with single photon avalanche diode (SPAD) receivers is presented. Major distortion effects of passive quenching (PQ) and active quenching (AQ) SPAD receivers are analysed in this study. The performance analysis of DC-biased O-OFDM and asymmetrically clipped O-OFDM with PQ and AQ SPAD is derived. The comparison results show the maximum optical irradiance caused by the nonlinear distortion, which limits the transmission power and bit rate. The theoretical maximum bit rate of SPAD-based OFDM is found to be up to 1 Gbit/s. This approach supplies a closed-form analytical solution for designing an optimal SPAD-based system.

Information Theory Performance Signal Processing

A Static Analysis-based Cross-Architecture Performance Prediction Using Machine Learning

Newsha Ardalani, Urmish Thakker, Aws Albarghouthi (2019)

Porting code from CPU to GPU is costly and time-consuming; unless much time is invested in development and optimization, it is not obvious, a priori, how much speed-up is achievable or how much room is left for improvement. Knowing the potential speed-up a priori can be very useful: it can save hundreds of engineering hours and help programmers with prioritization and algorithm selection. We aim to address this problem using machine learning in a supervised setting, using solely the single-threaded source code of the program, without having to run or profile the code. We propose a static analysis-based cross-architecture performance prediction framework (Static XAPP) which relies solely on program properties collected using static analysis of the CPU source code and predicts whether the potential speed-up is above or below a given threshold. We offer preliminary results that show we can achieve 94% accuracy in binary classification, on average, across different thresholds.

Distributed Parallel and Cluster Computing Machine Learning

Sage: Using Unsupervised Learning for Scalable Performance Debugging in Microservices

Yu Gan, Mingyu Liang, Sundar Dev (2021)

Cloud applications are increasingly shifting from large monolithic services to complex graphs of loosely-coupled microservices. Despite the advantages of modularity and elasticity that microservices offer, they also complicate cluster management and performance debugging, as dependencies between tiers introduce backpressure and cascading QoS violations. We present Sage, a machine learning-driven root cause analysis system for interactive cloud microservices. Sage leverages unsupervised ML models to circumvent the overhead of trace labeling, captures the impact of dependencies between microservices to determine the root cause of unpredictable performance online, and applies corrective actions to recover a cloud service's QoS. In experiments on both dedicated local clusters and large clusters on Google Compute Engine, we show that Sage consistently achieves over 93% accuracy in correctly identifying the root cause of QoS violations, and improves performance predictability.

Distributed Parallel and Cluster Computing Performance
