
Fast GPU-based calculations in few-body quantum scattering

Published by: Vladimir Pomerantsev
Publication date: 2015
Research field: Physics
Language: English





A fundamentally new approach to solving few-particle (many-dimensional) quantum scattering problems is described. The approach is based on a complete discretization of the few-particle continuum and on massively parallel GPU computation of the integral kernels of the scattering equations. The continuous spectrum of a few-particle Hamiltonian is discretized by projecting all scattering operators and wave functions onto a stationary wave-packet basis. This projection replaces singular multidimensional integral equations with linear matrix equations that have finite matrix elements. Different aspects of using multithreaded GPU computing for fast calculation of the matrix kernel of the equation are studied in detail. As a result, a fully realistic three-body scattering problem above the breakup threshold is solved on an ordinary desktop PC with a GPU in a rather short computational time.
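
To illustrate what "linear matrix equations with finite matrix elements" can look like, here is a minimal sketch, not the authors' implementation. It assumes a toy Lippmann-Schwinger-type equation T = V + V G T in a small discrete basis where the resolvent G is diagonal; the 3x3 interaction matrix `V`, the diagonal values `G_DIAG`, and the helper names `solve_linear` and `t_matrix` are all hypothetical and chosen only for illustration.

```python
def solve_linear(a, b):
    """Solve a @ x = b for a square (complex) matrix a by Gauss-Jordan
    elimination with partial pivoting. a and b are lists of rows."""
    n = len(a)
    # Work on an augmented copy [a | b] so the inputs stay untouched.
    m = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot magnitude for stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]


def t_matrix(v, g_diag):
    """Solve (I - V G) T = V, where G is diagonal with entries g_diag."""
    n = len(v)
    # (V G)[i][j] = v[i][j] * g_diag[j] because G is diagonal.
    kernel = [
        [(1.0 if i == j else 0.0) - v[i][j] * g_diag[j] for j in range(n)]
        for i in range(n)
    ]
    return solve_linear(kernel, v)


# Toy values (assumptions, not data from the paper): a symmetric
# interaction matrix and a diagonal resolvent with one complex entry.
V = [
    [-1.0, -0.5, -0.2],
    [-0.5, -0.8, -0.3],
    [-0.2, -0.3, -0.4],
]
G_DIAG = [complex(-0.7, 0.0), complex(0.1, -0.9), complex(0.4, 0.0)]

T = t_matrix(V, G_DIAG)
```

In a realistic calculation the matrix is far larger, and filling the kernel is the expensive step that the abstract says is handed to the GPU. The dense solve above stands in only for the final linear-algebra stage.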




Read also

We present GPU accelerated simulations to calculate the annihilation energy of magnetic skyrmions in an atomistic spin model considering dipole-dipole, exchange, uniaxial-anisotropy and Dzyaloshinskii-Moriya interactions using the simplified string method. The skyrmion annihilation energy is directly related to its thermal stability and is a key measure for the applicability of magnetic skyrmions to storage and logic devices. We investigate annihilations mediated by Bloch points as well as annihilations via boundaries for various interaction energies. Both processes show similar behaviour, with boundary annihilations resulting in slightly smaller energy barriers than Bloch point annihilations.
Drip-line nuclei have very different properties from those of the valley of stability, as they are weakly bound and resonant. Therefore, the models devised for stable nuclei can no longer be applied therein. Hence, a new theoretical tool, the Gamow Shell Model (GSM), has been developed to study the many-body states occurring at the limits of the nuclear chart. GSM is a configuration interaction model based on the use of the so-called Berggren basis, which contains bound, resonant and scattering states, so that inter-nucleon correlations are fully taken into account and the asymptotes of extended many-body wave functions are precisely handled. However, large complex symmetric matrices must be diagonalized in this framework, therefore the use of very powerful parallel machines is needed therein. In order to fully take advantage of their power, a 2D partitioning scheme using hybrid MPI/OpenMP parallelization has been developed in our GSM code. The specificities of the 2D partitioning scheme in the GSM framework will be described and illustrated with numerical examples. It will then be shown that the introduction of this scheme in the GSM code greatly enhances its capabilities.
We present concise, computationally efficient formulas for several quantities of interest -- including absorbed and scattered power, optical force (radiation pressure), and torque -- in scattering calculations performed using the boundary-element method (BEM) [also known as the method of moments (MOM)]. Our formulas compute the quantities of interest directly from the BEM surface currents with no need ever to compute the scattered electromagnetic fields. We derive our new formulas and demonstrate their effectiveness by computing power, force, and torque in a number of example geometries. Free, open-source software implementations of our formulas are available for download online.
The computational cost of quantum Monte Carlo (QMC) calculations of realistic periodic systems depends strongly on the method of storing and evaluating the many-particle wave function. Previous work [A. J. Williamson et al., Phys. Rev. Lett. 87, 246406 (2001); D. Alfè and M. J. Gillan, Phys. Rev. B 70, 161101 (2004)] has demonstrated the reduction of the O(N^3) cost of evaluating the Slater determinant with planewaves to O(N^2) using localized basis functions. We compare four polynomial approximations as basis functions -- interpolating Lagrange polynomials, interpolating piecewise-polynomial-form (pp-) splines, and basis-form (B-) splines (interpolating and smoothing). All these basis functions provide a similar speedup relative to the planewave basis. The pp-splines have eight times the memory requirement of the other methods. To test the accuracy of the basis functions, we apply them to the ground state structures of Si, Al, and MgO. The polynomial approximations differ in accuracy most strongly for MgO and smoothing B-splines most closely reproduce the planewave value of the variational Monte Carlo energy. Using separate approximations for the Laplacian of the orbitals increases the accuracy sufficiently to justify the increased memory requirement, making smoothing B-splines, with separate approximation for the Laplacian, the preferred choice for approximating planewave-represented orbitals in QMC calculations.
We report an efficient algorithm for calculating momentum-space integrals in solid state systems on modern graphics processing units (GPUs). Our algorithm is based on the tetrahedron method, which we demonstrate to be ideally suited for execution in a GPU framework. In order to achieve maximum performance, all floating point operations are executed in single precision. For benchmarking our implementation within the CUDA programming framework we calculate the orbital-resolved density of states in an iron-based superconductor. However, our algorithm is general enough for the achieved improvements to carry over to the calculation of other momentum integrals such as, e.g. susceptibilities. If our program code is integrated into an existing program for the central processing unit (CPU), i.e. when data transfer overheads exist, speedups of up to a factor $\sim 130$ compared to a pure CPU implementation can be achieved, largely depending on the problem size. In case our program code is integrated into an existing GPU program, speedups over a CPU implementation of up to a factor $\sim 165$ are possible, even for moderately sized workloads.