Performance tools for forthcoming heterogeneous exascale platforms must address two principal challenges when analyzing execution measurements. First, measurement of extreme-scale executions generates large volumes of performance data. Second, performance metrics for heterogeneous applications are highly sparse across code regions. To address these challenges, we developed a novel streaming aggregation approach to post-mortem analysis that employs both shared and distributed memory parallelism to aggregate sparse performance measurements from every rank, thread, and GPU stream of a large-scale application execution. Analysis results are stored in a pair of sparse formats designed for efficient access to related data elements, supporting responsive interactive presentation and scalable data analytics. Empirical analysis shows that our implementation of this approach in HPCToolkit effectively processes measurement data from thousands of threads using a fraction of the compute resources employed by the application itself. Compared with HPCToolkit, our approach performs analysis up to 9.4 times faster and stores analysis results that are 23 times smaller, providing a key building block for scalable exascale performance tools.
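The idea of streaming aggregation over sparse measurements can be illustrated with a minimal Python sketch. It is not HPCToolkit's implementation: the dictionary-based profile representation, the function names `aggregate_stream` and `by_context`, the context-major layout, and the sample values are all assumptions made for illustration, and the sketch is serial rather than parallel.

```python
from collections import defaultdict


def aggregate_stream(profiles):
    """Fold sparse per-thread profiles into summary statistics, one at a time.

    Each profile is a dict {(context_id, metric_id): value} that holds only
    nonzero values, so memory use tracks the nonzeros seen rather than the
    dense context x metric space. (Hypothetical representation.)
    """
    stats = {}  # (context_id, metric_id) -> [sum, count, min, max]
    for profile in profiles:
        for key, value in profile.items():
            entry = stats.get(key)
            if entry is None:
                stats[key] = [value, 1, value, value]
            else:
                entry[0] += value
                entry[1] += 1
                entry[2] = min(entry[2], value)
                entry[3] = max(entry[3], value)
    return stats


def by_context(profiles):
    """Build a context-major sparse layout: context -> metric -> {profile: value}.

    This is one assumed way to keep related values adjacent for lookups by
    code region; the source does not specify its formats.
    """
    layout = defaultdict(lambda: defaultdict(dict))
    for profile_id, profile in enumerate(profiles):
        for (context_id, metric_id), value in profile.items():
            layout[context_id][metric_id][profile_id] = value
    return layout


# Hypothetical measurements: two CPU threads and one GPU stream.
profiles = [
    {(0, "cpu_time"): 2.0, (1, "cpu_time"): 1.0},
    {(0, "cpu_time"): 4.0},
    {(2, "gpu_time"): 3.0},
]
stats = aggregate_stream(iter(profiles))
layout = by_context(profiles)
```

Because `aggregate_stream` consumes an iterator, each profile can be discarded once it has been folded in, and metrics that were never measured in a context take up no space in either result.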
Astrophysical explosions such as supernovae are fascinating events that require sophisticated algorithms and substantial computational power to model. Castro and MAESTROeX are nuclear astrophysics codes that simulate thermonuclear fusion in the conte
As a highly scalable permissioned blockchain platform, Hyperledger Fabric supports a wide range of industry use cases ranging from governance to finance. In this paper, we propose a model to analyze the performance of a Hyperledger-based system by usi
Big data applications and analytics are employed in many sectors for a variety of goals: improving customer satisfaction, predicting market behavior or improving processes in public health. These applications consist of complex software stacks that
The applications being developed within the U.S. Exascale Computing Project (ECP) to run on imminent Exascale computers will generate scientific results with unprecedented fidelity and record turn-around time. Many of these codes are based on particl
Much of the current focus in high-performance computing is on multi-threading, multi-computing, and graphics processing unit (GPU) computing. However, vectorization and non-parallel optimization techniques, which can often be employed additionally, a