ﻻ يوجد ملخص باللغة العربية
The life-cycle of a partial differential equation (PDE) solver is often characterized by three development phases: the development of a stable numerical discretization, development of a correct (verified) implementation, and the optimization of the implementation for different computer architectures. Often it is only after significant time and effort has been invested that the performance bottlenecks of a PDE solver are fully understood, and the precise details varies between different computer architectures. One way to mitigate this issue is to establish a reliable performance model that allows a numerical analyst to make reliable predictions of how well a numerical method would perform on a given computer architecture, before embarking upon potentially long and expensive implementation and optimization phases. The availability of a reliable performance model also saves developer effort as it both informs the developer on what kind of optimisations are beneficial, and when the maximum expected performance has been reached and optimisation work should stop. We show how discretization of a wave equation can be theoretically studied to understand the performance limitations of the method on modern computer architectures. We focus on the roofline model, now broadly used in the high-performance computing community, which considers the achievable performance in terms of the peak memory bandwidth and peak floating point performance of a computer with respect to algorithmic choices. A first principles analysis of operational intensity for key time-stepping finite-difference algorithms is presented. With this information available at the time of algorithm design, the expected performance on target computer systems can be used as a driver for algorithm design.
As we rapidly approach the frontiers of ultra large computing resources, software optimization is becoming of paramount interest to scientific application developers interested in efficiently leveraging all available on-Node computing capabilities an
Exascale computing will feature novel and potentially disruptive hardware architectures. Exploiting these to their full potential is non-trivial. Numerical modelling frameworks involving finite difference methods are currently limited by the static n
Performance tests and analyses are critical to effective HPC software development and are central components in the design and implementation of computational algorithms for achieving faster simulations on existing and future computing architectures
This dissertation introduces measurement-based performance modeling and prediction techniques for dense linear algebra algorithms. As a core principle, these techniques avoid executions of such algorithms entirely, and instead predict their performan
Many cloud service providers (CSPs) provide on-demand service at a price with a small delay. We propose a QoS-differentiated model where multiple SLAs deliver both on-demand service for latency-critical users and delayed services for delay-tolerant u