No abstract is available in Arabic.
Complex applications running on multicore processors show a rich performance phenomenology. The growing number of cores per ccNUMA domain complicates performance analysis of memory-bound code since system noise, load imbalance, or task-based programming models can lead to thread desynchronization. Hence, the simplifying assumption that all cores execute the same loop cannot be upheld. Motivated by observations on plain and modifi
Energy is now a first-class design constraint along with performance in all computing settings. Energy predictive modelling based on performance monitoring counters (PMCs) is the leading method used for prediction of energy consumption during an applic
Energy proportionality is the key design goal followed by architects of modern multicore CPUs. One of its implications is that optimization of an application for performance will also optimize it for energy. In this work, we show that energy proporti
In addition to hardware wall-time restrictions commonly seen in high-performance computing systems, it is likely that future systems will also be constrained by energy budgets. In the present work, finite difference algorithms of varying computationa
In the present paper we consider numerical methods to solve the discrete Schrödinger equation with a time-dependent Hamiltonian (motivated by problems encountered in the study of spin systems). We will consider both short-range interactions, which le
In this paper, we use multithreaded fast Fourier transforms provided in three highly optimized packages, FFTW-2.1.5, FFTW-3.3.7, and Intel MKL FFT, to present a novel model-based parallel computing technique as a very effective and portable method fo