No Arabic abstract
We have developed a simulation code with the techniques which enhance both spatial and time resolution of the PM method for which the spatial resolution is restricted by the spacing of structured mesh. The adaptive mesh refinement (AMR) technique subdivides the cells which satisfy the refinement criterion recursively. The hierarchical meshes are maintained by the special data structure and are modified in accordance with the change of particle distribution. In general, as the resolution of the simulation increases, its time step must be shortened and more computational time is required to complete the simulation. Since the AMR enhances the spatial resolution locally, we reduce the time step locally also, instead of shortening it globally. For this purpose we used a technique of hierarchical time steps (HTS) which changes the time step, from particle to particle, depending on the size of the cell in which particles reside. Some test calculations show that our implementation of AMR and HTS is successful. We have performed cosmological simulation runs based on our code and found that many of halo objects have density profiles which are well fitted to the universal profile proposed by Navarro, Frenk, & White (1996) over the entire range of their radius.
In this paper, we describe our vectorized and parallelized adaptive mesh refinement (AMR) N-body code with shared time steps, and report its performance on a Fujitsu VPP5000 vector-parallel supercomputer. Our AMR N-body code puts hierarchical meshes recursively where higher resolution is required and the time step of all particles are the same. The parts which are the most difficult to vectorize are loops that access the mesh data and particle data. We vectorized such parts by changing the loop structure, so that the innermost loop steps through the cells instead of the particles in each cell, in other words, by changing the loop order from the depth-first order to the breadth-first order. Mass assignment is also vectorizable using this loop order exchange and splitting the loop into $2^{N_{dim}}$ loops, if the cloud-in-cell scheme is adopted. Here, $N_{dim}$ is the number of dimension. These vectorization schemes which eliminate the unvectorized loops are applicable to parallelization of loops for shared-memory multiprocessors. We also parallelized our code for distributed memory machines. The important part of parallelization is data decomposition. We sorted the hierarchical mesh data by the Morton order, or the recursive N-shaped order, level by level and split and allocated the mesh data to the processors. Particles are allocated to the processor to which the finest refined cells including the particles are also assigned. Our timing analysis using the $Lambda$-dominated cold dark matter simulations shows that our parallel code speeds up almost ideally up to 32 processors, the largest number of processors in our test.
This paper describes the open-source code Enzo, which uses block-structured adaptive mesh refinement to provide high spatial and temporal resolution for modeling astrophysical fluid flows. The code is Cartesian, can be run in 1, 2, and 3 dimensions, and supports a wide variety of physics including hydrodynamics, ideal and non-ideal magnetohydrodynamics, N-body dynamics (and, more broadly, self-gravity of fluids and particles), primordial gas chemistry, optically-thin radiative cooling of primordial and metal-enriched plasmas (as well as some optically-thick cooling models), radiation transport, cosmological expansion, and models for star formation and feedback in a cosmological context. In addition to explaining the algorithms implemented, we present solutions for a wide range of test problems, demonstrate the codes parallel performance, and discuss the Enzo collaborations code development methodology.
We present the newly developed code, GAMER (GPU-accelerated Adaptive MEsh Refinement code), which has adopted a novel approach to improve the performance of adaptive mesh refinement (AMR) astrophysical simulations by a large factor with the use of the graphic processing unit (GPU). The AMR implementation is based on a hierarchy of grid patches with an oct-tree data structure. We adopt a three-dimensional relaxing TVD scheme for the hydrodynamic solver, and a multi-level relaxation scheme for the Poisson solver. Both solvers have been implemented in GPU, by which hundreds of patches can be advanced in parallel. The computational overhead associated with the data transfer between CPU and GPU is carefully reduced by utilizing the capability of asynchronous memory copies in GPU, and the computing time of the ghost-zone values for each patch is made to diminish by overlapping it with the GPU computations. We demonstrate the accuracy of the code by performing several standard test problems in astrophysics. GAMER is a parallel code that can be run in a multi-GPU cluster system. We measure the performance of the code by performing purely-baryonic cosmological simulations in different hardware implementations, in which detailed timing analyses provide comparison between the computations with and without GPU(s) acceleration. Maximum speed-up factors of 12.19 and 10.47 are demonstrated using 1 GPU with 4096^3 effective resolution and 16 GPUs with 8192^3 effective resolution, respectively.
A computer code is described for the simulation of gravitational lensing data. The code incorporates adaptive mesh refinement in choosing which rays to shoot based on the requirements of the source size, location and surface brightness distribution or to find critical curves/caustics. A variety of source surface brightness models are implemented to represent galaxies and quasar emission regions. The lensing mass can be represented by point masses (stars), smoothed simulation particles, analytic halo models, pixelized mass maps or any combination of these. The deflection and beam distortions (convergence and shear) are calculated by modified tree algorithm when halos, point masses or particles are used and by FFT when mass maps are used. The combination of these methods allow for a very large dynamical range to be represented in a single simulation. Individual images of galaxies can be represented in a simulation that covers many square degrees. For an individual strongly lensed quasar, source sizes from the size of the quasars host galaxy (~ 100 kpc) down to microlensing scales (~ 10^-4 pc) can be probed in a self consistent simulation. Descriptions of various tests of the codes accuracy are given.
A new N-body and hydrodynamical code, called RAMSES, is presented. It has been designed to study structure formation in the universe with high spatial resolution. The code is based on Adaptive Mesh Refinement (AMR) technique, with a tree based data structure allowing recursive grid refinements on a cell-by-cell basis. The N-body solver is very similar to the one developed for the ART code (Kravtsov et al. 97), with minor differences in the exact implementation. The hydrodynamical solver is based on a second-order Godunov method, a modern shock-capturing scheme known to compute accurately the thermal history of the fluid component. The accuracy of the code is carefully estimated using various test cases, from pure gas dynamical tests to cosmological ones. The specific refinement strategy used in cosmological simulations is described, and potential spurious effects associated to shock waves propagation in the resulting AMR grid are discussed and found to be negligible. Results obtained in a large N-body and hydrodynamical simulation of structure formation in a low density LCDM universe are finally reported, with 256^3 particles and 4.1 10^7 cells in the AMR grid, reaching a formal resolution of 8192^3. A convergence analysis of different quantities, such as dark matter density power spectrum, gas pressure power spectrum and individual haloes temperature profiles, shows that numerical results are converging down to the actual resolution limit of the code, and are well reproduced by recent analytical predictions in the framework of the halo model.