
Pushing Back the Limit of Ab-initio Quantum Transport Simulations on Hybrid Supercomputers

Posted by Mathieu Luisier
Publication date: 2018
Research field: Physics
Language: English





The capabilities of CP2K, a density-functional theory package, and OMEN, a nano-device simulator, are combined to study transport phenomena from first principles in unprecedentedly large nanostructures. Based on the Hamiltonian and overlap matrices generated by CP2K for a given system, OMEN solves the Schroedinger equation with open boundary conditions (OBCs) for all possible electron momenta and energies. To accelerate this core operation, a robust algorithm called SplitSolve has been developed. It allows the OBCs to be treated on CPUs and the Schroedinger equation on GPUs simultaneously, taking advantage of hybrid nodes. Our key achievements on the Cray XK7 Titan are (i) a reduction in time-to-solution by more than one order of magnitude compared to standard methods, enabling the simulation of structures with more than 50000 atoms, (ii) a parallel efficiency of 97% when scaling from 756 up to 18564 nodes, and (iii) a sustained performance of 15 DP-PFlop/s.
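The CPU/GPU split behind SplitSolve can be illustrated with a toy sketch (not the actual OMEN implementation): boundary self-energies are prepared concurrently in a CPU thread pool, while a dense linear solve, standing in for the GPU kernel, consumes them. The `boundary_self_energy` stand-in, the broadening parameter `eta`, and the toy matrices are all illustrative assumptions.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def boundary_self_energy(H, S, energy, eta=1e-6):
    # Stand-in for the OBC step: a simple imaginary broadening term
    # instead of a true contact self-energy (illustrative only).
    n = H.shape[0]
    return -1j * eta * np.eye(n)

def solve_device(H, S, sigma, energy):
    # Stand-in for the GPU step: dense solve of (E*S - H - Sigma) G = I.
    n = H.shape[0]
    return np.linalg.solve(energy * S - H - sigma, np.eye(n))

def split_solve(H, S, energies):
    # Pipeline: self-energies are produced by a thread pool (the "CPU"
    # side) while completed ones feed the dense solver (the "GPU" side
    # in the real code), one solve per energy point.
    with ThreadPoolExecutor() as pool:
        sigmas = pool.map(lambda E: boundary_self_energy(H, S, E), energies)
        return [solve_device(H, S, sig, E) for sig, E in zip(sigmas, energies)]

rng = np.random.default_rng(0)
n = 8
A = rng.standard_normal((n, n))
H = (A + A.T) / 2           # Hermitian toy Hamiltonian
S = np.eye(n)               # orthogonal basis for simplicity
G = split_solve(H, S, [0.1, 0.2, 0.3])
print(len(G), G[0].shape)   # 3 (8, 8)
```

In the real code each (momentum, energy) point is independent, which is what makes the two stages easy to overlap across hybrid nodes.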


Read also

168 - Weile Jia, Han Wang, Mohan Chen (2020)
For 35 years, ab initio molecular dynamics (AIMD) has been the method of choice for modeling complex atomistic phenomena from first principles. However, most AIMD applications are limited by computational cost to systems with thousands of atoms at most. We report that a machine learning-based simulation protocol (Deep Potential Molecular Dynamics), while retaining ab initio accuracy, can simulate more than 1 nanosecond-long trajectory of over 100 million atoms per day, using a highly optimized code (GPU DeePMD-kit) on the Summit supercomputer. Our code can efficiently scale up to the entire Summit supercomputer, attaining 91 PFLOPS in double precision (45.5% of the peak) and 162/275 PFLOPS in mixed single/half precision. The great accomplishment of this work is that it opens the door to simulating unprecedented size and time scales with ab initio accuracy. It also poses new challenges to the next-generation supercomputer for a better integration of machine learning and physical modeling.
Recent developments in path integral methodology have significantly reduced the computational expense of including quantum mechanical effects in the nuclear motion in ab initio molecular dynamics simulations. However, the implementation of these developments requires a considerable programming effort, which has hindered their adoption. Here we describe i-PI, an interface written in Python that has been designed to minimise the effort required to bring state-of-the-art path integral techniques to an electronic structure program. While it is best suited to first principles calculations and path integral molecular dynamics, i-PI can also be used to perform classical molecular dynamics simulations, and can just as easily be interfaced with an empirical forcefield code. To give just one example of the many potential applications of the interface, we use it in conjunction with the CP2K electronic structure package to showcase the importance of nuclear quantum effects in high pressure water.
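The separation i-PI provides can be sketched in miniature: the integrator lives in one place and treats anything that returns energies and forces as an interchangeable client. (i-PI's real interface communicates with client codes over sockets; this toy version uses a plain Python callback, and `harmonic_forcefield` is a hypothetical stand-in for a DFT or forcefield client.)

```python
import numpy as np

def harmonic_forcefield(positions, k=1.0):
    # Hypothetical stand-in for an electronic structure or forcefield
    # client: returns the potential energy and forces at `positions`.
    energy = 0.5 * k * np.sum(positions ** 2)
    forces = -k * positions
    return energy, forces

def velocity_verlet(positions, velocities, force_provider,
                    dt=0.1, steps=100, mass=1.0):
    # The integrator never needs to know how forces are produced;
    # swapping the client changes the physics, not the driver.
    _, forces = force_provider(positions)
    for _ in range(steps):
        velocities += 0.5 * dt * forces / mass
        positions += dt * velocities
        _, forces = force_provider(positions)
        velocities += 0.5 * dt * forces / mass
    return positions, velocities

x = np.array([1.0])
v = np.array([0.0])
x, v = velocity_verlet(x, v, harmonic_forcefield)
```

For the harmonic oscillator this driver conserves the total energy to O(dt^2), so the split costs nothing in accuracy while keeping the two codes independent.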
By including a fraction of exact exchange (EXX), hybrid functionals reduce the self-interaction error in semi-local density functional theory (DFT), and thereby furnish a more accurate and reliable description of the electronic structure in systems throughout biology, chemistry, physics, and materials science. However, the high computational cost associated with the evaluation of all required EXX quantities has limited the applicability of hybrid DFT in the treatment of large molecules and complex condensed-phase materials. To overcome this limitation, we have devised a linear-scaling yet formally exact approach that utilizes a local representation of the occupied orbitals (e.g., maximally localized Wannier functions, MLWFs) to exploit the sparsity in the real-space evaluation of the quantum mechanical exchange interaction in finite-gap systems. In this work, we present a detailed description of the theoretical and algorithmic advances required to perform MLWF-based ab initio molecular dynamics (AIMD) simulations of large-scale condensed-phase systems at the hybrid DFT level. We provide a comprehensive description of the exx algorithm, which is currently implemented in the Quantum ESPRESSO program and employs a hybrid MPI/OpenMP parallelization scheme to efficiently utilize high-performance computing (HPC) resources. This is followed by a critical assessment of the accuracy and parallel performance of this approach when performing AIMD simulations of liquid water in the canonical ensemble. With access to HPC resources, we demonstrate that exx enables hybrid DFT based AIMD simulations of condensed-phase systems containing 500-1000 atoms with a walltime cost that is comparable to semi-local DFT. In doing so, exx takes us closer to routinely performing AIMD simulations of large-scale condensed-phase systems for sufficiently long timescales at the hybrid DFT level of theory.
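The locality argument behind this linear scaling can be illustrated with a toy model (not the actual exx algorithm): localized orbitals are represented as 1D Gaussians, and pair quantities are evaluated only when two centers fall within a cutoff, so the number of retained pairs grows linearly with system size. The grid, centers, and cutoff are all illustrative assumptions.

```python
import numpy as np

# Toy localized orbitals: 1D Gaussians on a grid, one per center,
# mimicking the compact support of maximally localized Wannier functions.
grid = np.linspace(0.0, 50.0, 2001)
dx = grid[1] - grid[0]
centers = np.arange(2.0, 49.0, 3.0)          # hypothetical orbital centers
orbitals = [np.exp(-(grid - c) ** 2) for c in centers]
cutoff = 6.0

# Keep only pairs whose centers overlap appreciably; distant pairs
# contribute negligibly to the exchange sum and are skipped outright.
pairs = [(i, j) for i in range(len(centers))
         for j in range(i, len(centers))
         if abs(centers[i] - centers[j]) < cutoff]

# Overlap-like pair integrals, evaluated only for the retained pairs.
K = {(i, j): float(np.dot(orbitals[i], orbitals[j])) * dx
     for (i, j) in pairs}

total = len(centers) * (len(centers) + 1) // 2
print(f"evaluated {len(K)} of {total} pairs")  # evaluated 31 of 136 pairs
```

Because each orbital overlaps only a bounded number of neighbors, the retained-pair count scales as O(N) rather than O(N^2), which is the essence of the MLWF-based approach described above.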
We present SPARC: Simulation Package for Ab-initio Real-space Calculations. SPARC can perform Kohn-Sham density functional theory calculations for isolated systems such as molecules as well as extended systems such as crystals and surfaces, in both static and dynamic settings. It is straightforward to install/use and highly competitive with state-of-the-art planewave codes, demonstrating comparable performance on a small number of processors and increasing advantages as the number of processors grows. Notably, SPARC brings solution times down to a few seconds for systems with O(100-500) atoms on large-scale parallel computers, outperforming planewave counterparts by an order of magnitude and more.
Real-time time-dependent density functional theory (rt-TDDFT) with hybrid exchange-correlation functional has wide-ranging applications in chemistry and material science simulations. However, it can be thousands of times more expensive than a conventional ground state DFT simulation, hence is limited to small systems. In this paper, we accelerate hybrid functional rt-TDDFT calculations using the parallel transport gauge formalism, and the GPU implementation on Summit. Our implementation can efficiently scale to 786 GPUs for a large system with 1536 silicon atoms, and the wall clock time is only 1.5 hours per femtosecond. This unprecedented speed enables the simulation of large systems with more than 1000 atoms using rt-TDDFT and hybrid functional.