The JOREK extended magneto-hydrodynamic (MHD) code is a widely used simulation code for studying the non-linear dynamics of large-scale instabilities in divertor tokamak plasmas. Due to the large scale separation intrinsic to these phenomena in both space and time, the computational costs for simulations in realistic geometry and with realistic parameters can be very high, motivating considerable optimization effort. In this article, a set of developments regarding the JOREK solver and preconditioner is described, which together lead to significant overall benefits for large production simulations, in particular enhanced convergence in highly non-linear scenarios and a general reduction of memory consumption and computational costs. The developments include faster construction of preconditioner matrices, a domain decomposition of preconditioning matrices for solver libraries that can handle distributed matrices, interfaces for additional solver libraries, an option to use matrix compression methods, and the implementation of a complex solver interface for the preconditioner. The most significant development presented is a generalization of the physics-based preconditioner to mode groups, which accounts for the dominant interactions between toroidal Fourier modes in highly non-linear simulations. At the cost of a moderate increase in memory consumption, the technique can strongly enhance convergence in suitable cases, allowing significantly larger time steps to be used. For all developments, benchmarks based on typical simulation cases demonstrate the resulting improvements.
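The mode-group preconditioner can be pictured as a block-Jacobi solve in which each block gathers a group of strongly coupled toroidal harmonics rather than a single harmonic. The following minimal sketch (not JOREK code; the matrix, block sizes, and groupings are invented for illustration) applies such a group-wise block solve as a preconditioner for GMRES on a harmonic-structured system:

```python
# Minimal sketch (not JOREK): block-Jacobi preconditioning over groups of
# Fourier harmonics, assuming a system assembled as harmonic-by-harmonic blocks.
import numpy as np
from scipy.sparse import bmat, csr_matrix
from scipy.sparse.linalg import LinearOperator, splu, gmres

rng = np.random.default_rng(0)
n_harm, blk = 6, 40                      # hypothetical: 6 harmonics, 40 unknowns each
blocks = [[csr_matrix(rng.standard_normal((blk, blk)) / (2 + 4 * abs(i - j))
                      + (20 * np.eye(blk) if i == j else 0))
           for j in range(n_harm)] for i in range(n_harm)]
A = bmat(blocks, format="csr")           # full coupled system

groups = [[0], [1, 2], [3, 4, 5]]        # hypothetical mode groups (dominant couplings)
factors = []
for g in groups:
    idx = np.concatenate([np.arange(h * blk, (h + 1) * blk) for h in g])
    factors.append((idx, splu(A[idx][:, idx].tocsc())))  # LU factorization of each group block

def apply_prec(r):
    """Solve the group-diagonal part of A; inter-group coupling is ignored."""
    z = np.zeros_like(r)
    for idx, lu in factors:
        z[idx] = lu.solve(r[idx])
    return z

M = LinearOperator(A.shape, matvec=apply_prec)
b = rng.standard_normal(A.shape[0])
x, info = gmres(A, b, M=M)
print("GMRES converged:", info == 0)
```

Choosing larger groups captures more of the non-linear mode coupling at the price of larger blocks to factorize, which mirrors the memory/convergence trade-off described in the abstract.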
A high-order method to evolve in time electromagnetic and velocity fields in conducting fluids with non-periodic boundaries is presented. The method has a small overhead compared with fast FFT-based pseudospectral methods in periodic domains. It uses the magnetic vector potential formulation to accurately enforce the null divergence of the magnetic field and to allow for different boundary conditions, including perfectly conducting walls or vacuum surroundings, two cases relevant for many astrophysical, geophysical, and industrial flows. A spectral Fourier continuation method is used to accurately represent all fields and their spatial derivatives, which also allows for efficient solution of Poisson equations with different boundaries. A study of conducting flows at different Reynolds and Hartmann numbers, and with different boundary conditions, is presented to assess the convergence of the method and the accuracy with which the solenoidal and boundary conditions are satisfied.
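The reason the vector potential formulation enforces the solenoidal condition so accurately is that the divergence of a curl vanishes identically when derivatives are taken spectrally. The short sketch below (a periodic box with plain FFTs is used for brevity; the paper handles non-periodic domains via Fourier continuation) demonstrates that $\nabla\cdot\mathbf{B}$ with $\mathbf{B}=\nabla\times\mathbf{A}$ is zero to machine precision:

```python
# Minimal sketch (not the paper's solver): in a Fourier representation, computing
# B = curl(A) from a vector potential makes div(B) vanish to machine precision.
import numpy as np

n = 32
k = np.fft.fftfreq(n, d=1.0 / n) * 2.0 * np.pi          # angular wavenumbers
kx, ky, kz = np.meshgrid(k, k, k, indexing="ij")

rng = np.random.default_rng(1)
A_hat = rng.standard_normal((3, n, n, n)) + 1j * rng.standard_normal((3, n, n, n))

# B_hat = i k x A_hat  (curl in spectral space)
B_hat = np.empty_like(A_hat)
B_hat[0] = 1j * (ky * A_hat[2] - kz * A_hat[1])
B_hat[1] = 1j * (kz * A_hat[0] - kx * A_hat[2])
B_hat[2] = 1j * (kx * A_hat[1] - ky * A_hat[0])

# div(B) in spectral space: i k . B_hat = -k . (k x A_hat) = 0 exactly
div_B_hat = 1j * (kx * B_hat[0] + ky * B_hat[1] + kz * B_hat[2])
print("max |div B| in spectral space:", np.max(np.abs(div_B_hat)))   # ~1e-13
```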
A customized finite-difference field solver for the particle-in-cell (PIC) algorithm that provides higher fidelity for wave-particle interactions in intense electromagnetic waves is presented. In many problems of interest, particles with relativistic energies interact with intense electromagnetic fields that have phase velocities near the speed of light. Numerical errors can arise due to (1) dispersion errors in the phase velocity of the wave, (2) the staggering in time between the electric and magnetic fields and between particle velocity and position, and (3) errors in the time derivative in the momentum advance. Errors of the first two kinds are analyzed in detail. It is shown that by using field solvers with different $\mathbf{k}$-space operators in Faraday's and Ampère's laws, the dispersion errors and magnetic field time-staggering errors in the particle pusher can be simultaneously removed for electromagnetic waves moving primarily in a specific direction. The new algorithm was implemented in OSIRIS using customized higher-order finite-difference operators. Schemes using the proposed solver in combination with different particle pushers are compared through PIC simulations. It is shown that the use of the new algorithm, together with an analytic particle pusher (assuming constant fields over a time step), can lead to accurate modeling of the motion of a single electron in an intense laser field with normalized vector potentials, $eA/mc^2$, exceeding $10^4$ for typical cell sizes and time steps.
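To make the first error source concrete, the sketch below evaluates the numerical dispersion relation of the standard Yee/FDTD solver for a wave propagating along one axis (the Courant factor and grid spacing are arbitrary choices for illustration, not values from the paper); the phase velocity falls below $c$ at short wavelengths, which is the kind of error the customized $\mathbf{k}$-space operators are designed to cancel:

```python
# Minimal sketch (standard Yee/FDTD dispersion, not the OSIRIS custom solver):
# numerical phase velocity of a light wave propagating along x on a 1D grid.
import numpy as np

c, dx = 1.0, 1.0
dt = 0.5 * dx / c                                  # assumed Courant factor of 0.5
k = np.linspace(0.01, np.pi / dx, 200)             # resolvable wavenumbers

# Yee dispersion relation in 1D: sin(w*dt/2)/(c*dt) = sin(k*dx/2)/dx
w = (2.0 / dt) * np.arcsin(c * dt / dx * np.sin(k * dx / 2.0))
v_phase = w / k

for kk, vp in zip(k[::50], v_phase[::50]):
    print(f"k*dx = {kk*dx:5.2f}   v_phase/c = {vp/c:.5f}")   # < 1: waves lag behind c
```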
Furthering our understanding of many of today's interesting problems in plasma physics---including plasma-based acceleration and magnetic reconnection with pair production due to quantum electrodynamic effects---requires large-scale kinetic simulations using particle-in-cell (PIC) codes. However, these simulations are extremely demanding, requiring that contemporary PIC codes be designed to efficiently use a new fleet of exascale computing architectures. To this end, the key issue of parallel load balance across computational nodes must be addressed. We discuss the implementation of dynamic load balancing by dividing the simulation space into many small, self-contained regions or tiles, along with shared-memory (e.g., OpenMP) parallelism both over many tiles and within single tiles. The load-balancing algorithm can be used with three different topologies, including two space-filling curves. We tested this implementation in the code OSIRIS and show low overhead and improved scalability with OpenMP thread number in simulations with both uniform load and severe load imbalance. Compared to other load-balancing techniques, our algorithm gives an order-of-magnitude improvement in parallel scalability for simulations with severe load imbalance.
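A common way to realize tile-based dynamic load balancing with a space-filling curve is to order the tiles along the curve and cut the ordering into contiguous chunks of roughly equal accumulated load. The sketch below (a generic illustration, not the OSIRIS implementation; the Morton curve, tile grid, and load values are assumptions) shows this greedy partitioning for a 2D tile grid with a localized hot spot:

```python
# Minimal sketch (not OSIRIS): assign per-tile computational load to nodes by
# ordering tiles along a Morton (Z-order) space-filling curve and cutting the
# curve into contiguous chunks of roughly equal load.
import numpy as np

def morton_index(ix, iy, bits=16):
    """Interleave the bits of (ix, iy) to get the Z-order curve index."""
    z = 0
    for b in range(bits):
        z |= ((ix >> b) & 1) << (2 * b) | ((iy >> b) & 1) << (2 * b + 1)
    return z

def partition_tiles(loads_2d, n_nodes):
    ny, nx = loads_2d.shape
    tiles = [(morton_index(ix, iy), (iy, ix)) for iy in range(ny) for ix in range(nx)]
    tiles.sort()                                      # order tiles along the curve
    loads = np.array([loads_2d[t] for _, t in tiles])
    cum = np.cumsum(loads)
    targets = cum[-1] * (np.arange(1, n_nodes) / n_nodes)
    cuts = np.searchsorted(cum, targets)              # greedy, near-equal-load cuts
    return np.split([t for _, t in tiles], cuts)      # list of tile chunks per node

# Example: an 8x8 tile grid with a localized hot spot (severe load imbalance).
loads = np.ones((8, 8)); loads[2:4, 2:4] = 50.0
for node, chunk in enumerate(partition_tiles(loads, 4)):
    print(f"node {node}: {len(chunk)} tiles, load = {sum(loads[tuple(t)] for t in chunk):.0f}")
```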
We study the algorithmic optimization and performance tuning of the Lattice QCD clover-fermion solver for the K computer. We implement Lüscher's SAP (Schwarz alternating procedure) preconditioner with sub-blocking, in which the lattice block in a node is further divided into several sub-blocks to extract enough parallelism for the 8-core SPARC64$^{\mathrm{TM}}$ VIIIfx CPU of the K computer. To achieve a better convergence property, we use the symmetric successive over-relaxation (SSOR) iteration with {\it locally-lexicographical} ordering for the sub-blocks in obtaining the block inverse. The SAP preconditioner is included in the single-precision BiCGStab solver of the nested BiCGStab solver. The single-precision part of the computational kernel is written solely with SIMD-oriented intrinsics to achieve the best performance of the SPARC processor on the K computer. We benchmark the single-precision BiCGStab solver on three lattice sizes, $12^3\times 24$, $24^3\times 48$, and $48^3\times 96$, fixing the local lattice size per node at $6^3\times 12$. We observe ideal weak-scaling performance from 16 nodes to 4096 nodes. The performance of the computational kernel exceeds 50% efficiency, and the single-precision BiCGStab attains $\sim$26% sustained efficiency.
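For illustration, the sketch below shows a plain SSOR iteration, i.e. a forward Gauss-Seidel sweep followed by a backward sweep, used as an approximate solve; in the solver described above this kind of iteration supplies the approximate sub-block inverse inside the SAP preconditioner, with the sweep ordering chosen locally-lexicographically (the dense test matrix and relaxation parameter here are invented, not a lattice Dirac operator):

```python
# Minimal sketch (not the K-computer kernel): one call performs several SSOR
# iterations, each a forward Gauss-Seidel sweep followed by a backward sweep.
import numpy as np

def ssor_solve(A, b, omega=1.0, n_iter=20):
    """Approximately solve A x = b with SSOR sweeps (A assumed diagonally dominant)."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(n_iter):
        for i in range(n):                 # forward sweep (lexicographical order)
            r = b[i] - A[i, :] @ x + A[i, i] * x[i]
            x[i] = (1 - omega) * x[i] + omega * r / A[i, i]
        for i in reversed(range(n)):       # backward sweep
            r = b[i] - A[i, :] @ x + A[i, i] * x[i]
            x[i] = (1 - omega) * x[i] + omega * r / A[i, i]
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 50)) + 50 * np.eye(50)    # diagonally dominant test matrix
b = rng.standard_normal(50)
x = ssor_solve(A, b)
print("residual norm:", np.linalg.norm(A @ x - b))
```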
We introduce an open-source package called QTraj that solves the Lindblad equation for heavy-quarkonium dynamics using the quantum trajectories algorithm. The package allows users to simulate the suppression of heavy-quarkonium states using externally supplied input from 3+1D hydrodynamics simulations. The code uses a split-step pseudo-spectral method for updating the wave function between jumps, which is implemented using the open-source multi-threaded FFTW3 package. This allows one to have manifestly unitary evolution when using real-valued potentials. In this paper, we provide detailed documentation of QTraj 1.0 and installation instructions, and we present various tests and benchmarks of the code.
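The link between the split-step pseudo-spectral update and manifest unitarity is that, for a real potential, both the kinetic factor (applied in $k$-space) and the potential factor (applied in real space) are pure phases. The following minimal sketch (a 1D numpy analogue with an assumed harmonic potential, not QTraj itself, which uses FFTW3) evolves a wave packet over many steps and shows the norm being preserved to machine precision:

```python
# Minimal sketch (not QTraj): split-step pseudo-spectral updates of a 1D wave
# function with a real potential; kinetic half-steps are applied in k-space via
# the FFT. With a real V every factor is a pure phase, so the norm is preserved.
import numpy as np

n, L, dt, m, hbar = 256, 20.0, 1e-3, 1.0, 1.0
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)

V = 0.5 * x**2                                        # assumed real-valued potential
psi = np.exp(-(x - 1.0) ** 2)                         # initial wave packet
psi = psi / np.sqrt(np.sum(np.abs(psi) ** 2) * (L / n))   # normalize

half_kin = np.exp(-1j * hbar * k**2 * dt / (4 * m))   # kinetic half-step phase
pot = np.exp(-1j * V * dt / hbar)                     # potential full-step phase

for _ in range(1000):                                 # evolve between quantum jumps
    psi = np.fft.ifft(half_kin * np.fft.fft(psi))
    psi = pot * psi
    psi = np.fft.ifft(half_kin * np.fft.fft(psi))

norm = np.sum(np.abs(psi) ** 2) * (L / n)
print("norm after evolution:", norm)                  # stays 1 to machine precision
```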