Recently, a 4th-order asymptotic preserving multiderivative implicit-explicit (IMEX) scheme was developed (Schutz and Seal 2020, arXiv:2001.08268). This scheme is based on a 4th-order Hermite interpolation in time, and uses an approach based on operator splitting that converges to the underlying quadrature if iterated sufficiently. Hermite schemes have been used in astrophysics for decades, particularly for N-body calculations, but not in a form suitable for solving stiff equations. In this work, we extend the scheme presented in Schutz and Seal 2020 to higher orders. Such high-order schemes offer advantages when one aims to find high-precision solutions to systems of differential equations containing stiff terms, which occur throughout the physical sciences. We begin by deriving Hermite schemes of arbitrary order and discussing the stability of these formulas. Afterwards, we demonstrate how the method of Schutz and Seal 2020 generalises in a straightforward manner to any of these schemes, and prove convergence properties of the resulting IMEX schemes. We then present results for methods ranging from 6th to 12th order and explore a selection of test problems, including both linear and nonlinear ordinary differential equations and Burgers' equation. To our knowledge, this is also the first time that Hermite time-stepping methods have been applied to partial differential equations. We then discuss some benefits of these schemes, such as their potential for parallelism and low memory usage, as well as limitations and potential drawbacks.
We develop new numerical schemes for Vlasov--Poisson equations with high-order accuracy. Our methods are based on a spatially monotonicity-preserving (MP) scheme and are modified suitably so that positivity of the distribution function is also preserved. We adopt an efficient semi-Lagrangian time integration scheme that is more accurate and computationally less expensive than the three-stage TVD Runge-Kutta integration. We apply our spatially fifth- and seventh-order schemes to a suite of simulations of collisionless self-gravitating systems and electrostatic plasma simulations, including linear and nonlinear Landau damping in one dimension and Vlasov--Poisson simulations in a six-dimensional phase space. The high-order schemes achieve a significantly improved accuracy in comparison with the third-order positive-flux-conserved scheme adopted in our previous study. With the semi-Lagrangian time integration, the computational cost of our high-order schemes does not significantly increase, but remains roughly the same as that of the third-order scheme. Vlasov--Poisson simulations on $128^3 \times 128^3$ mesh grids have been successfully performed on a massively parallel computer.
In most mesh-free methods, the calculation of interactions between sample points or particles is the most time-consuming part. When we use mesh-free methods with high spatial orders, the order of the time integration should also be high. If we use the usual Runge-Kutta schemes, we need to perform the interaction calculation multiple times per time step. One way to reduce the number of interaction calculations is to use Hermite schemes, which use the time derivatives of the right-hand side of the differential equations, since Hermite schemes require fewer interaction calculations than Runge-Kutta schemes to achieve the same order. In this paper, we construct a Hermite scheme for a mesh-free method with high spatial orders. We performed several numerical tests with fourth-order Hermite schemes and Runge-Kutta schemes. We found that, for both Hermite and Runge-Kutta schemes, the overall error is determined by the error of the spatial derivatives for timesteps smaller than the stability limit. The calculation cost at the timestep size of the stability limit is smaller for Hermite schemes. Therefore, we conclude that Hermite schemes are more efficient than Runge-Kutta schemes and thus useful for high-order mesh-free methods for Lagrangian hydrodynamics.
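As a rough illustration of what a fourth-order Hermite time step looks like, the following sketch applies the standard two-point Hermite rule, $y_{n+1} = y_n + \frac{h}{2}(f_n + f_{n+1}) + \frac{h^2}{12}(\dot f_n - \dot f_{n+1})$, to a scalar test ODE. It is not the mesh-free scheme of the abstract above: the test problem $y' = -y^2$, the function names, and the choice to iterate the corrector to convergence are all assumptions made for clarity. A cost-conscious implementation would typically use only one or a few corrector passes.

```python
import math

def hermite4_step(y, h, f, fdot, tol=1e-14, max_iter=50):
    """One step of the two-point fourth-order Hermite rule.

    f(y)    : right-hand side of y' = f(y)
    fdot(y) : time derivative of f along the solution, d/dt f(y(t))
    The implicit rule is solved by fixed-point iteration (illustrative choice).
    """
    f0, g0 = f(y), fdot(y)
    # Taylor predictor using the value and first time derivative of f.
    y_new = y + h * f0 + 0.5 * h * h * g0
    for _ in range(max_iter):
        # Hermite corrector built from f and fdot at both ends of the step.
        y_next = (y + 0.5 * h * (f0 + f(y_new))
                  + h * h / 12.0 * (g0 - fdot(y_new)))
        converged = abs(y_next - y_new) < tol
        y_new = y_next
        if converged:
            break
    return y_new

def integrate(y0, t_end, n_steps, f, fdot):
    """Advance y' = f(y) from t = 0 to t_end with n_steps equal Hermite steps."""
    h = t_end / n_steps
    y = y0
    for _ in range(n_steps):
        y = hermite4_step(y, h, f, fdot)
    return y

# Hypothetical test problem: y' = -y^2, y(0) = 1, exact solution 1 / (1 + t).
f = lambda y: -y * y
fdot = lambda y: 2.0 * y ** 3  # chain rule: (-2y) * (-y^2)

exact = 0.5  # y(1)
err_coarse = abs(integrate(1.0, 1.0, 20, f, fdot) - exact)
err_fine = abs(integrate(1.0, 1.0, 40, f, fdot) - exact)
# Halving the step should reduce the error by roughly 2^4.
observed_order = math.log2(err_coarse / err_fine)
print(f"errors: {err_coarse:.3e}, {err_fine:.3e}, order ~ {observed_order:.2f}")
```

The point of the sketch is that each step needs `f` and its time derivative only at the two end points of the step, rather than at several intermediate stages as in a Runge-Kutta scheme.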
Building on the framework of Zhang & Shu \cite{zhangShu_2010a,zhangShu_2010b}, we develop a realizability-preserving method to simulate the transport of particles (fermions) through a background material using a two-moment model that evolves the angular moments of a phase space distribution function $f$. The two-moment model is closed using algebraic moment closures; e.g., as proposed by Cernohorsky & Bludman \cite{cernohorskyBludman_1994} and Banach & Larecki \cite{banachLarecki_2017a}. Variations of this model have recently been used to simulate neutrino transport in nuclear astrophysics applications, including core-collapse supernovae and compact binary mergers. We employ the discontinuous Galerkin (DG) method for spatial discretization (in part to capture the asymptotic diffusion limit of the model) combined with implicit-explicit (IMEX) time integration to stably bypass short timescales induced by frequent interactions between particles and the background. Appropriate care is taken to ensure the method preserves strict algebraic bounds on the evolved moments (particle density and flux) as dictated by Pauli's exclusion principle, which demands a bounded distribution function (i.e., $f\in[0,1]$). This realizability-preserving scheme combines a suitable CFL condition, a realizability-enforcing limiter, a closure procedure based on Fermi-Dirac statistics, and an IMEX scheme whose stages can be written as a convex combination of forward Euler steps combined with a backward Euler step. Numerical results demonstrate the realizability-preserving properties of the scheme. We also demonstrate that the use of algebraic moment closures not based on Fermi-Dirac statistics can lead to unphysical moments in the context of fermion transport.
Mesh-free methods have significant potential for simulations in complex geometries, as the time-consuming process of mesh generation is avoided. Smoothed Particle Hydrodynamics (SPH) is the most widely used mesh-free method, but suffers from a lack of consistency. High order, consistent, and local (using compact computational stencils) mesh-free methods are particularly desirable. Here we present a novel framework for generating local high order difference operators for arbitrary node distributions, referred to as the Local Anisotropic Basis Function Method (LABFM). Weights are constructed from linear sums of anisotropic basis functions (ABFs), chosen to ensure exact reproduction of polynomial fields up to a given order. The ABFs are based on a fundamental Radial Basis Function (RBF), and the choice of fundamental RBF has little effect on accuracy, but influences stability. LABFM is able to generate high order difference operators with compact computational stencils (4th order with 25 nodes, 8th order with 60 nodes in two dimensions). At domain boundaries (with incomplete support) LABFM automatically provides one-sided differences of the same order as the internal scheme, up to 4th order. We use the method to solve elliptic, parabolic and mixed hyperbolic-parabolic PDEs, showing up to 8th order convergence. The inclusion of hyperviscosity is straightforward, and can effectively provide stability when solving hyperbolic problems. LABFM is a promising new mesh-free method for the numerical solution of PDEs in complex geometries. The method is highly scalable, and for Eulerian schemes, the computational efficiency is competitive with RBF-FD for a given accuracy. A particularly attractive feature is that in the low order limit, LABFM collapses to SPH, and there is potential for Arbitrary Lagrangian-Eulerian schemes with natural adaptivity of resolution and accuracy.
We focus on implementing and optimizing a sixth-order finite-difference solver for simulating compressible fluids on a GPU using third-order Runge-Kutta integration. Graphics processing units perform well in data-parallel tasks, which makes them an attractive platform for fluid simulation. However, high-order stencil computation is memory-intensive with respect to both main memory and the caches of the GPU. We present two approaches for simulating compressible fluids using 55-point and 19-point stencils. We seek to reduce the requirements for memory bandwidth and cache size in our methods by using cache blocking and decomposing a latency-bound kernel into several bandwidth-bound kernels. Our fastest implementation is bandwidth-bound and integrates $343$ million grid points per second on a Tesla K40t GPU, achieving a $3.6\times$ speedup over a comparable hydrodynamics solver benchmarked on two Intel Xeon E5-2690v3 processors. Our alternative GPU implementation is latency-bound and achieves a rate of $168$ million updates per second.
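To make the phrase "sixth-order finite difference" concrete, the sketch below applies the standard sixth-order central difference for a first derivative, $u'_i \approx \left[45(u_{i+1}-u_{i-1}) - 9(u_{i+2}-u_{i-2}) + (u_{i+3}-u_{i-3})\right]/(60h)$, on a one-dimensional periodic grid. It is only a plain-Python illustration: the periodic boundary, the $\sin(x)$ test field, and the function names are assumptions, and it does not reproduce the three-dimensional stencils or the GPU kernels described in the abstract above.

```python
import math

def ddx_central6(u, h):
    """Sixth-order central first derivative on a periodic 1D grid.

    Uses the 7-point stencil with weights (-1, 9, -45, 0, 45, -9, 1) / (60 h).
    Periodicity is handled by wrapping indices (an illustrative assumption).
    """
    n = len(u)
    du = [0.0] * n
    for i in range(n):
        du[i] = (45.0 * (u[(i + 1) % n] - u[(i - 1) % n])
                 - 9.0 * (u[(i + 2) % n] - u[(i - 2) % n])
                 + (u[(i + 3) % n] - u[(i - 3) % n])) / (60.0 * h)
    return du

def max_error(n):
    """Maximum error of the stencil for u = sin(x) on [0, 2*pi) with n points."""
    h = 2.0 * math.pi / n
    u = [math.sin(i * h) for i in range(n)]
    du = ddx_central6(u, h)
    return max(abs(du[i] - math.cos(i * h)) for i in range(n))

err_32 = max_error(32)
err_64 = max_error(64)
# Halving the grid spacing should reduce the error by roughly 2^6.
observed_order = math.log2(err_32 / err_64)
print(f"errors: {err_32:.3e}, {err_64:.3e}, order ~ {observed_order:.2f}")
```

Each output point reads six neighbours per direction, which gives a sense of why such stencils put pressure on memory bandwidth and cache capacity.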