
An efficient GPU algorithm for tetrahedron-based Brillouin-zone integration

Published by Daniel Guterding
Publication date: 2017
Research field: Physics
Paper language: English





We report an efficient algorithm for calculating momentum-space integrals in solid-state systems on modern graphics processing units (GPUs). Our algorithm is based on the tetrahedron method, which we demonstrate to be ideally suited for execution in a GPU framework. In order to achieve maximum performance, all floating point operations are executed in single precision. For benchmarking our implementation within the CUDA programming framework we calculate the orbital-resolved density of states in an iron-based superconductor. However, our algorithm is general enough for the achieved improvements to carry over to the calculation of other momentum integrals, such as susceptibilities. If our program code is integrated into an existing program for the central processing unit (CPU), i.e. when data transfer overheads exist, speedups of up to a factor $\sim 130$ compared to a pure CPU implementation can be achieved, largely depending on the problem size. In case our program code is integrated into an existing GPU program, speedups over a CPU implementation of up to a factor $\sim 165$ are possible, even for moderately sized workloads.
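To make the tetrahedron method mentioned in the abstract concrete, here is a minimal sketch of the density-of-states contribution of a single tetrahedron. It uses the standard linear tetrahedron expressions, which are not quoted in the abstract and are assumed here. The function name `tetrahedron_dos` and the `weight` parameter (the tetrahedron's share of the Brillouin-zone volume) are hypothetical. This is plain double-precision Python for illustration only, not the single-precision CUDA kernel the paper describes.

```python
def tetrahedron_dos(energy, corner_energies, weight=1.0):
    """Density-of-states contribution of one tetrahedron at a given energy.

    Assumes the band energy is interpolated linearly between the four
    corner energies (standard linear tetrahedron method). `weight` is the
    fraction of the Brillouin-zone volume occupied by this tetrahedron.
    """
    # Sort the corner energies so that e1 <= e2 <= e3 <= e4.
    e1, e2, e3, e4 = sorted(corner_energies)

    # Outside the energy range of the tetrahedron there is no contribution.
    if energy < e1 or energy >= e4:
        return 0.0

    # The half-open intervals below guarantee nonzero denominators,
    # even when some of the corner energies are degenerate.
    if energy < e2:
        return weight * 3.0 * (energy - e1) ** 2 / (
            (e2 - e1) * (e3 - e1) * (e4 - e1)
        )
    if energy < e3:
        x = energy - e2
        bracket = (
            3.0 * (e2 - e1)
            + 6.0 * x
            - 3.0 * ((e3 - e1) + (e4 - e2)) * x * x / ((e3 - e2) * (e4 - e2))
        )
        return weight * bracket / ((e3 - e1) * (e4 - e1))
    return weight * 3.0 * (e4 - energy) ** 2 / (
        (e4 - e1) * (e4 - e2) * (e4 - e3)
    )
```

Each tetrahedron is evaluated independently of all others, and the total density of states is the sum over tetrahedra and bands. This independence is what makes the method a natural fit for one-thread-per-tetrahedron parallelism on a GPU. Integrating the function over all energies returns `weight`, which is a convenient sanity check.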




Read also

We develop a resource efficient step-merged quantum imaginary time evolution approach (smQITE) to solve for the ground state of a Hamiltonian on quantum computers. This heuristic method features a fixed shallow quantum circuit depth along the state evolution path. We use this algorithm to determine binding energy curves of a set of molecules, including H$_2$, H$_4$, H$_6$, LiH, HF, H$_2$O and BeH$_2$, and find highly accurate results. The required quantum resources of smQITE calculations can be further reduced by adopting the circuit form of the variational quantum eigensolver (VQE) technique, such as the unitary coupled cluster ansatz. We demonstrate that smQITE achieves a computational accuracy similar to that of VQE at the same fixed-circuit ansatz, without requiring a generally complicated high-dimensional non-convex optimization. Finally, smQITE calculations are carried out on Rigetti quantum processing units (QPUs), demonstrating that the approach is readily applicable on current noisy intermediate-scale quantum (NISQ) devices.
We demonstrate the first implementation of recently-developed fast explicit kinetic integration algorithms on modern graphics processing unit (GPU) accelerators. Taking as a generic test case a Type Ia supernova explosion with an extremely stiff thermonuclear network having 150 isotopic species and 1604 reactions coupled to hydrodynamics using operator splitting, we demonstrate the capability to solve of order 100 realistic kinetic networks in parallel in the same time that standard implicit methods can solve a single such network on a CPU. This orders-of-magnitude decrease in compute time for solving systems of realistic kinetic networks implies that important coupled, multiphysics problems in various scientific and technical fields that were intractable, or could be simulated only with highly schematic kinetic networks, are now computationally feasible.
We describe how to efficiently construct the quantum chemical Hamiltonian operator in matrix product form. We present its implementation as a density matrix renormalization group (DMRG) algorithm for quantum chemical applications in a purely matrix product based framework. Existing implementations of DMRG for quantum chemistry are based on the traditional formulation of the method, which was developed from a viewpoint of Hilbert space decimation and attained a higher performance compared to straightforward implementations of matrix product based DMRG. The latter variationally optimizes a class of ansatz states known as matrix product states (MPS), where operators are correspondingly represented as matrix product operators (MPO). The MPO construction scheme presented here eliminates the previous performance disadvantages while retaining the additional flexibility provided by a matrix product approach; for example, the specification of expectation values becomes an input parameter. In this way, MPOs for different symmetries - abelian and non-abelian - and different relativistic and non-relativistic models may be solved by an otherwise unmodified program.
The 3D quasi-static particle-in-cell (PIC) algorithm is a very efficient method for modeling short-pulse laser or relativistic charged particle beam-plasma interactions. In this algorithm, the plasma response to a non-evolving laser or particle beam is calculated using Maxwell's equations based on the quasi-static approximate equations that exclude radiation. The plasma fields are then used to advance the laser or beam forward using a large time step. The algorithm is many orders of magnitude faster than a 3D fully explicit relativistic electromagnetic PIC algorithm. It has been shown to be capable of accurately modeling the evolution of lasers and particle beams in a variety of scenarios. At the same time, an algorithm in which the fields, currents and Maxwell equations are decomposed into azimuthal harmonics has been shown to reduce the complexity of a 3D explicit PIC algorithm to that of a 2D algorithm when the expansion is truncated while maintaining accuracy for problems with near azimuthal symmetry. This hybrid algorithm uses a PIC description in r-z and a gridless description in $\phi$. We describe a novel method that combines the quasi-static and hybrid PIC methods. This algorithm expands the fields, charge and current density into azimuthal harmonics. A set of quasi-static field equations is derived for each harmonic. The complex amplitudes of the fields are then solved using the finite difference method. The beam and plasma particles are advanced in Cartesian coordinates using the total fields. Details on how this algorithm was implemented using a similar workflow to an existing quasi-static code, QuickPIC, are presented. The new code is called QPAD for QuickPIC with Azimuthal Decomposition. Benchmarks and comparisons between a fully 3D explicit PIC code, a full 3D quasi-static code, and the new quasi-static PIC code with azimuthal decomposition are also presented.
We propose a scheme to determine the energy-band dispersion of quasicrystals which does not require any periodic approximation and which directly provides the correct structure of the extended Brillouin zones. From the gap labelling viewpoint, this allows one to transpose the measure of the integrated density of states to the measure of the effective Brillouin-zone areas that are uniquely determined by the position of the Bragg peaks. Moreover, we show that the Bragg vectors can be determined by the stability analysis of the law of recurrence used to generate the quasicrystal. Our analysis of the gap labelling in the quasi-momentum space opens the way to an experimental proof of the gap labelling itself within the framework of an optics experiment, polaritons, or with ultracold atoms.