
GAMER-2: a GPU-accelerated adaptive mesh refinement code -- accuracy, performance, and scalability

Published by Hsi-Yu Schive
Publication date: 2017
Research field: Physics
Language: English





We present GAMER-2, a GPU-accelerated adaptive mesh refinement (AMR) code for astrophysics. It provides a rich set of features, including adaptive time-stepping, several hydrodynamic schemes, magnetohydrodynamics, self-gravity, particles, star formation, chemistry and radiative processes with GRACKLE, data analysis with yt, and a memory pool for efficient object allocation. GAMER-2 is fully bitwise reproducible. For performance optimization, it adopts hybrid OpenMP/MPI/GPU parallelization and overlaps CPU computation, GPU computation, and CPU-GPU communication. Load balancing is achieved using a Hilbert space-filling curve on a level-by-level basis without the need to duplicate the entire AMR hierarchy on each MPI process. To provide convincing demonstrations of the accuracy and performance of GAMER-2, we directly compare with Enzo on isolated disk galaxy simulations and with FLASH on galaxy cluster merger simulations. We show that the physical results obtained by the different codes are in very good agreement, and that GAMER-2 outperforms Enzo and FLASH by nearly one and two orders of magnitude, respectively, on the Blue Waters supercomputer using 1 to 256 nodes. More importantly, GAMER-2 exhibits similar or even better parallel scalability compared to the other two codes. We also demonstrate good weak and strong scaling using up to 4096 GPUs and 65,536 CPU cores, and achieve a uniform resolution as high as $10{,}240^3$ cells. Furthermore, GAMER-2 can be adopted as an AMR+GPUs framework and has been extensively used for wave dark matter ($\psi$DM) simulations. GAMER-2 is open source (available at https://github.com/gamer-project/gamer) and new contributions are welcome.
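To make the load-balancing idea in the abstract concrete, the following is a minimal sketch of partitioning along a Hilbert space-filling curve. It is not GAMER-2's implementation: it works in 2D on a single refinement level, and the names `hilbert_index` and `assign_ranks`, the per-cell weights, and the midpoint rule for cutting the curve are all assumptions made for illustration. Cells are sorted by their position along the curve, and the curve is then cut into contiguous segments of roughly equal total weight, one per rank.

```python
def hilbert_index(order, x, y):
    """Map cell (x, y) on a 2**order x 2**order grid to its 1D Hilbert index.

    Standard iterative bit-manipulation form of the 2D Hilbert mapping.
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def assign_ranks(cells, weights, order, n_ranks):
    """Hypothetical helper: give each cell an owner rank.

    Cells are ordered along the Hilbert curve and the curve is cut into
    n_ranks contiguous pieces of roughly equal total weight.
    """
    ranked = sorted(
        zip(cells, weights),
        key=lambda cw: hilbert_index(order, cw[0][0], cw[0][1]),
    )
    total = float(sum(weights))
    owner, running = {}, 0.0
    for cell, w in ranked:
        # Use the midpoint of the cell's weight interval to pick its rank.
        owner[cell] = min(n_ranks - 1, int((running + 0.5 * w) * n_ranks / total))
        running += w
    return owner


if __name__ == "__main__":
    cells = [(x, y) for x in range(4) for y in range(4)]
    owner = assign_ranks(cells, [1.0] * len(cells), order=2, n_ranks=4)
    for rank in range(4):
        print(rank, sorted(c for c, r in owner.items() if r == rank))
```

Because consecutive Hilbert indices stay spatially close, each rank tends to receive a compact region, which keeps the boundary shared with other ranks small. Applying such a partition separately on each refinement level would correspond to the "level-by-level" balancing the abstract mentions.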




Read also

We present the newly developed code, GAMER (GPU-accelerated Adaptive MEsh Refinement code), which has adopted a novel approach to improve the performance of adaptive mesh refinement (AMR) astrophysical simulations by a large factor with the use of the graphic processing unit (GPU). The AMR implementation is based on a hierarchy of grid patches with an oct-tree data structure. We adopt a three-dimensional relaxing TVD scheme for the hydrodynamic solver, and a multi-level relaxation scheme for the Poisson solver. Both solvers have been implemented on the GPU, by which hundreds of patches can be advanced in parallel. The computational overhead associated with the data transfer between CPU and GPU is carefully reduced by utilizing the capability of asynchronous memory copies in GPU, and the computing time of the ghost-zone values for each patch is hidden by overlapping it with the GPU computations. We demonstrate the accuracy of the code by performing several standard test problems in astrophysics. GAMER is a parallel code that can be run in a multi-GPU cluster system. We measure the performance of the code by performing purely-baryonic cosmological simulations in different hardware implementations, in which detailed timing analyses provide comparison between the computations with and without GPU(s) acceleration. Maximum speed-up factors of 12.19 and 10.47 are demonstrated using 1 GPU with $4096^3$ effective resolution and 16 GPUs with $8192^3$ effective resolution, respectively.
General-relativistic magnetohydrodynamic (GRMHD) simulations have revolutionized our understanding of black-hole accretion. Here, we present a GPU-accelerated GRMHD code H-AMR with multi-faceted optimizations that, collectively, accelerate computation by 2-5 orders of magnitude for a wide range of applications. Firstly, it involves a novel implementation of a spherical-polar grid with 3D adaptive mesh refinement that operates in each of the 3 dimensions independently. This allows us to circumvent the Courant condition near the polar singularity, which otherwise cripples high-res computational performance. Secondly, we demonstrate that local adaptive time-stepping (LAT) on a logarithmic spherical-polar grid accelerates computation by a factor of $\lesssim 10$ compared to traditional hierarchical time-stepping approaches. Jointly, these unique features lead to an effective speed of $\sim 10^9$ zone-cycles-per-second-per-node on 5,400 NVIDIA V100 GPUs (i.e., 900 nodes of the OLCF Summit supercomputer). We demonstrate its computational performance by presenting the first GRMHD simulation of a tilted thin accretion disk threaded by a toroidal magnetic field around a rapidly spinning black hole. With an effective resolution of $13{,}440 \times 4{,}608 \times 8{,}092$ cells, and a total of $\lesssim 22$ billion cells and $\sim 0.65 \times 10^8$ timesteps, it is among the largest astrophysical simulations ever performed. We find that frame-dragging by the black hole tears up the disk into two independently precessing sub-disks. The innermost sub-disk rotation axis intermittently aligns with the black hole spin, demonstrating for the first time that such long-sought alignment is possible in the absence of large-scale poloidal magnetic fields.
We present a new special relativistic hydrodynamics (SRHD) code capable of handling coexisting ultra-relativistically hot and non-relativistically cold gases. We achieve this by designing a new algorithm for conversion between primitive and conserved variables in the SRHD solver, which incorporates a realistic ideal-gas equation of state covering both the relativistic and non-relativistic regimes. The code can handle problems involving a Lorentz factor as high as $10^6$ and optimally avoid catastrophic cancellation. In addition, we have integrated this new SRHD solver into the code GAMER (https://github.com/gamer-project/gamer) to support adaptive mesh refinement and hybrid OpenMP/MPI/GPU parallelization. It achieves a peak performance of $7 \times 10^{7}$ cell updates per second on a single Tesla P100 GPU and scales well to 2048 GPUs. We apply this code to two interesting astrophysical applications: (a) an asymmetric explosion source on the relativistic blast wave and (b) the flow acceleration and limb-brightening of relativistic jets.
This paper describes the open-source code Enzo, which uses block-structured adaptive mesh refinement to provide high spatial and temporal resolution for modeling astrophysical fluid flows. The code is Cartesian, can be run in 1, 2, and 3 dimensions, and supports a wide variety of physics including hydrodynamics, ideal and non-ideal magnetohydrodynamics, N-body dynamics (and, more broadly, self-gravity of fluids and particles), primordial gas chemistry, optically-thin radiative cooling of primordial and metal-enriched plasmas (as well as some optically-thick cooling models), radiation transport, cosmological expansion, and models for star formation and feedback in a cosmological context. In addition to explaining the algorithms implemented, we present solutions for a wide range of test problems, demonstrate the code's parallel performance, and discuss the Enzo collaboration's code development methodology.
We have developed a simulation code with techniques that enhance both the spatial and time resolution of the PM method, whose spatial resolution is otherwise restricted by the spacing of the structured mesh. The adaptive mesh refinement (AMR) technique recursively subdivides the cells that satisfy the refinement criterion. The hierarchical meshes are maintained by a special data structure and are modified in accordance with changes in the particle distribution. In general, as the resolution of the simulation increases, its time step must be shortened and more computational time is required to complete the simulation. Since AMR enhances the spatial resolution locally, we also reduce the time step locally, instead of shortening it globally. For this purpose we used a technique of hierarchical time steps (HTS), which changes the time step from particle to particle depending on the size of the cell in which each particle resides. Some test calculations show that our implementation of AMR and HTS is successful. We have performed cosmological simulation runs based on our code and found that many halo objects have density profiles that are well fitted by the universal profile proposed by Navarro, Frenk, & White (1996) over the entire range of their radius.
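The hierarchical time-step idea described above can be illustrated with a short sketch. The abstract does not state the ratio between time steps on adjacent levels; the code below assumes a factor of 2 per refinement level (a common choice when cell size halves with each level), and the function name `advance` and the `log` list are hypothetical. Each level takes one step, then the next finer level takes two half-size steps so that it catches up to the same physical time.

```python
def advance(level, dt, max_level, log):
    """Advance `level` by one step of size dt, then subcycle finer levels.

    Assumption: each finer level uses half the time step of its parent,
    so it needs two substeps to stay synchronized with the parent.
    Only the order of updates is recorded; no physics is solved here.
    """
    log.append((level, dt))
    if level < max_level:
        for _ in range(2):
            advance(level + 1, dt / 2.0, max_level, log)
    return log


if __name__ == "__main__":
    for level, dt in advance(level=0, dt=1.0, max_level=2, log=[]):
        print("  " * level + f"level {level}: dt = {dt}")
```

Under this assumption, the coarse level is updated once while level 2 is updated four times, so the extra cost of small time steps is paid only where the mesh is refined.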