Manycore parallel computing for a hybridizable discontinuous Galerkin nested multigrid method


Abstract in English

We present a parallel computing strategy for a hybridizable discontinuous Galerkin (HDG) nested geometric multigrid (GMG) solver. Parallel GMG solvers require a combination of coarse-grain and fine-grain parallelism to improve time-to-solution performance. In this work we focus on fine-grain parallelism, using Intel's second-generation Xeon Phi (Knights Landing) many-core processor. The GMG method achieves ideal convergence rates of $0.2$ or less at high polynomial orders. A matrix-free (assembly-free) technique is exploited to reduce memory usage considerably and to increase arithmetic intensity. HDG enables static condensation, and because the discretization is discontinuous, we were able to develop a matrix-vector multiply routine that requires no costly synchronizations or barriers. Our algorithm attains 80% of peak bandwidth for higher-order polynomials, which is possible because of the data locality inherent in the HDG method. High-order schemes realize very high performance thanks to their good arithmetic intensity, which declines as the order is reduced.
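To illustrate how a matrix-vector product can avoid synchronization, the following is a minimal sketch and not the routine developed in this work. It assumes a 1D chain of elements in which element e touches faces e and e+1, a single scalar unknown per face (in HDG each face would carry a block of polynomial coefficients), and hypothetical 2x2 condensed local matrices. The names `face_gather_matvec` and `assembled_matvec` are invented for the example. Looping over faces and gathering from the neighbouring elements means each iteration writes only its own output entry, so parallel iterations would need no atomics or barriers.

```python
def face_gather_matvec(local_mats, x):
    """Compute y = A x without assembling A, one output face at a time.

    local_mats[e] is a hypothetical 2x2 local matrix for element e,
    coupling its left face (local index 0) and right face (local index 1).
    """
    n_elem = len(local_mats)
    y = [0.0] * (n_elem + 1)
    for f in range(n_elem + 1):      # iterations are independent: only y[f] is written
        acc = 0.0
        if f > 0:                    # left neighbour: f is its local face 1
            k = local_mats[f - 1]
            acc += k[1][0] * x[f - 1] + k[1][1] * x[f]
        if f < n_elem:               # right neighbour: f is its local face 0
            k = local_mats[f]
            acc += k[0][0] * x[f] + k[0][1] * x[f + 1]
        y[f] = acc
    return y


def assembled_matvec(local_mats, x):
    """Reference: assemble the global matrix explicitly, then multiply."""
    n = len(local_mats) + 1
    a = [[0.0] * n for _ in range(n)]
    for e, k in enumerate(local_mats):
        for i in range(2):
            for j in range(2):
                a[e + i][e + j] += k[i][j]
    return [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]


# Illustrative data only: three identical elements, four faces.
local_mats = [[[2.0, -1.0], [-1.0, 2.0]] for _ in range(3)]
x = [1.0, 2.0, 3.0, 4.0]
y = face_gather_matvec(local_mats, x)
```

The gather form applies each local matrix row once per adjacent face rather than scattering element results into shared entries. The design choice is that output writes never conflict, which is what makes a barrier-free loop possible in this simplified setting.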
