The development of the A64FX processor by Fujitsu marked a major innovation in High-Performance Computing and gave rise to Fugaku, currently the world's fastest supercomputer. We use a variety of tools to analyze the run times and performance of several applications and, in particular, how these applications scale on the A64FX processor. We examine application performance and behavior under OpenMP scaling, and how that performance differs across compilers, on the new Ookami cluster at Stony Brook University as well as on the Fugaku supercomputer at RIKEN in Japan.
The development of the A64FX processor by Fujitsu has been a major innovation in vectorized processors and led to Fugaku, currently the world's fastest supercomputer. We use a variety of tools to analyze the behavior and performance of several OpenMP
The research community has proposed copious modifications to the Transformer architecture since it was introduced over three years ago, relatively few of which have seen widespread adoption. In this paper, we comprehensively evaluate many of these mo
Hydra is a header-only, templated and C++11-compliant framework designed to perform the typical bottleneck calculations found in common HEP data analyses on massively parallel platforms. The framework is implemented on top of the C++11 Standard Libra
In this work we demonstrate the usage of the VegasFlow library on multidevice situations: multi-GPU in one single node and multi-node in a cluster. VegasFlow is a new software for fast evaluation of highly parallelizable integrals based on Monte Carl
ZKCM is a C++ library developed for the purpose of multiprecision matrix computation, on the basis of the GNU MP and MPFR libraries. It provides an easy-to-use syntax and convenient functions for matrix manipulations including those often used in num