ﻻ يوجد ملخص باللغة العربية
We present Task Bench, a parameterized benchmark designed to explore the performance of parallel and distributed programming systems under a variety of application scenarios. Task Bench lowers the barrier to benchmarking multiple programming systems by making the implementation for a given system orthogonal to the benchmarks themselves: every benchmark constructed with Task Bench runs on every Task Bench implementation. Furthermore, Task Benchs parameterization enables a wide variety of benchmark scenarios that distill the key characteristics of larger applications. We conduct a comprehensive study with implementations of Task Bench in 15 programming systems on up to 256 Haswell nodes of the Cori supercomputer. We introduce a novel metric, minimum effective task granularity to study the baseline runtime overhead of each system. We show that when running at scale, 100 {mu}s is the smallest granularity that even the most efficient systems can reliably support with current technologies. We also study each systems scalability, ability to hide communication and mitigate load imbalance.
Much recent progress in applications of machine learning models to NLP has been driven by benchmarks that evaluate models across a wide variety of tasks. However, these broad-coverage benchmarks have been mostly limited to English, and despite an inc
High-performance computing (HPC) is a major driver accelerating scientific research and discovery, from quantum simulations to medical therapeutics. The growing number of new HPC systems coming online are being furnished with various hardware compone
The performance of collective operations has been a critical issue since the advent of MPI. Many algorithms have been proposed for each MPI collective operation but none of them proved optimal in all situations. Different algorithms demonstrate super
Data-intensive applications are becoming commonplace in all science disciplines. They are comprised of a rich set of sub-domains such as data engineering, deep learning, and machine learning. These applications are built around efficient data abstrac
The performance of biomolecular molecular dynamics simulations has steadily increased on modern high performance computing resources but acceleration of the analysis of the output trajectories has lagged behind so that analyzing simulations is becomi