The movement of large-scale data sets (tens of terabytes and larger) between high performance computing (HPC) facilities is an increasingly critical capability. A growing number of scientific collaborations rely on HPC facilities for tasks that either require large-scale data sets as input or produce them as output. To enable the transfer of these data sets as the scientific community needs them, HPC facilities must design and deploy data transfer capabilities that allow users to perform data placement at scale. This paper describes the Petascale DTN Project, an effort undertaken by four HPC facilities that achieved routine data transfer rates of over 1 PB/week between the facilities. We describe the design and configuration of the Data Transfer Node (DTN) clusters used for large-scale data transfers at these facilities, the software tools used, and the performance tuning that enabled this capability.
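To put the 1 PB/week figure in network terms, the following minimal sketch converts a weekly data volume into the average sustained rate it implies. It assumes decimal units (1 PB = 10^15 bytes) and perfectly uniform transfer over the week; the function name and parameters are illustrative and do not come from the paper.

```python
# Sketch: average sustained network rate implied by a weekly transfer volume.
# Assumption: 1 PB = 1e15 bytes (decimal); pass bytes_per_pb=2**50 for PiB.

SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def sustained_rate_gbps(petabytes_per_week: float, bytes_per_pb: float = 1e15) -> float:
    """Return the average rate in gigabits per second for a weekly volume."""
    bytes_per_second = petabytes_per_week * bytes_per_pb / SECONDS_PER_WEEK
    return bytes_per_second * 8 / 1e9  # bytes -> bits, then bits -> gigabits


if __name__ == "__main__":
    print(f"1 PB/week is roughly {sustained_rate_gbps(1.0):.1f} Gb/s sustained")
```

Under these assumptions the average works out to a little over 13 Gb/s. Real transfers are bursty, so peak rates would need to be higher than this average to reach the same weekly total.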
Data engineering is becoming an increasingly important part of scientific discoveries with the adoption of deep learning and machine learning. Data engineering deals with a variety of data formats, storage, data extraction, transformation, and data m
The FFT of three-dimensional (3D) input data is an important computational kernel of numerical simulations and is widely used in High Performance Computing (HPC) codes running on a large number of processors. Performance of many scientific applicatio
Performance and energy are the two most important objectives for optimisation on modern parallel platforms. Latest research demonstrated the importance of workload distribution as a decision variable in the bi-objective optimisation for performance a
Performance tools for forthcoming heterogeneous exascale platforms must address two principal challenges when analyzing execution measurements. First, measurement of extreme-scale executions generates large volumes of performance data. Second, perfor
Data-intensive applications are becoming commonplace in all science disciplines. They comprise a rich set of sub-domains such as data engineering, deep learning, and machine learning. These applications are built around efficient data abstrac