Massive parallelisation has led to a dramatic increase in available computational power. However, data transfer speeds have failed to keep pace and are the major limiting factor in the development of exascale computing. New algorithms must be developed which minimise the transfer of data. Patch dynamics is a computational macroscale modelling scheme which provides a coarse macroscale solution of a problem defined on a fine microscale by dividing the domain into many nonoverlapping, coupled patches. Patch dynamics is readily adaptable to massive parallelisation as each processor can evaluate the dynamics on one, or a few, patches. However, patch coupling conditions interpolate across the unevaluated parts of the domain between patches, and are typically reevaluated at every microscale time step, thus requiring almost continuous data transfer. We propose a modified patch dynamics scheme which minimises data transfer by only reevaluating the patch coupling conditions at 'mesoscale' time scales which are significantly larger than the microscale time scale of the problem. We analyse the error arising from patch dynamics with mesoscale temporal coupling as a function of the mesoscale time interval, patch size, and ratio between the microscale and macroscale.
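As a rough illustration of holding the patch coupling fixed over a mesoscale interval, the following minimal sketch may help. It is not the scheme analysed in the paper: it assumes a 1D diffusion problem u_t = u_xx on a periodic domain, the smallest possible patches (one centre point and two edge points), quadratic interpolation of neighbouring centre values for the edges, and explicit Euler time stepping. The function name `patch_step` and all of its parameters are hypothetical.

```python
def patch_step(U, r, h, dt, n_micro, recouple_every):
    """Advance periodic patch-centre values U by n_micro microscale steps.

    Toy assumptions (not taken from the paper): each patch is a three-point
    microgrid (left edge, centre, right edge) with spacing h, and r = h / H
    is the ratio of that spacing to the patch spacing H. Edge values are
    refreshed only every `recouple_every` microscale steps.
    """
    n = len(U)
    U = list(U)
    left = [0.0] * n
    right = [0.0] * n
    for step in range(n_micro):
        if step % recouple_every == 0:
            # Coupling: the only stage that needs data from other patches.
            for j in range(n):
                up, um = U[(j + 1) % n], U[(j - 1) % n]
                mean_diff = 0.5 * (up - um)
                second_diff = up - 2.0 * U[j] + um
                left[j] = U[j] - r * mean_diff + 0.5 * r * r * second_diff
                right[j] = U[j] + r * mean_diff + 0.5 * r * r * second_diff
        # Microscale update for u_t = u_xx, entirely local to each patch.
        U = [U[j] + dt / (h * h) * (left[j] - 2.0 * U[j] + right[j])
             for j in range(n)]
    return U
```

With `recouple_every=1` this toy reduces to the standard second-order finite-difference scheme on the macroscale grid of spacing H. Larger values exchange data less often, at the cost of an error that grows with the interval over which the edge values are frozen.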
Equation-free macroscale modelling is a systematic and rigorous computational methodology for efficiently predicting the dynamics of a microscale system at a desired macroscale system level. In this scheme, the given microscale model is computed in small patches spread across the space-time domain, with patch coupling conditions bridging the unsimulated space. For accurate simulations, care must be taken in designing the patch coupling conditions. Here we construct novel coupling conditions which preserve translational invariance, rotational invariance, and self-adjoint symmetry, thus guaranteeing that conservation laws associated with these symmetries are preserved in the macroscale simulation. Spectral and algebraic analyses of the proposed scheme in both one and two dimensions reveal mechanisms for further improving the accuracy of the simulations. Consistency of the patch scheme's macroscale dynamics with the original microscale model is proved. This new self-adjoint patch scheme provides an efficient, flexible, and accurate computational homogenisation in a wide range of multiscale scenarios of interest to scientists and engineers.
We present an efficient open-source implementation of the multiparticle collision dynamics (MPCD) algorithm that scales to run on hundreds of graphics processing units (GPUs). We especially focus on optimizations for modern GPU architectures and communication patterns between multiple GPUs. We show that a mixed-precision computing model can improve performance compared to a fully double-precision model while still providing good numerical accuracy. We report weak and strong scaling benchmarks of a reference MPCD solvent and a benchmark of a polymer solution with research-relevant interactions and system size. Our MPCD software enables simulations of mesoscale hydrodynamics at length and time scales that would be otherwise challenging or impossible to access.
In this paper we describe the research and development activities in the Center for Efficient Exascale Discretization within the US Exascale Computing Project, targeting state-of-the-art high-order finite-element algorithms for high-order applications on GPU-accelerated platforms. We discuss the GPU developments in several components of the CEED software stack, including the libCEED, MAGMA, MFEM, libParanumal, and Nek projects. We report performance and capability improvements in several CEED-enabled applications on both NVIDIA and AMD GPU systems.
Intermittent maps of the interval are simple and widely studied models for chaos with slow mixing rates, but have been notoriously resistant to numerical study. In this paper we present an effective framework to compute many ergodic properties of these systems, in particular invariant measures and mean return times. The framework combines three ingredients that each harness the smooth structure of these systems' induced maps: Abel functions to compute the action of the induced maps, Euler-Maclaurin summation to compute the pointwise action of their transfer operators, and Chebyshev Galerkin discretisations to compute the spectral data of the transfer operators. The combination of these techniques allows one to obtain exponential convergence of estimates for polynomially growing computational outlay, independent of the order of the map's neutral fixed point. This enables numerical exploration of intermittent dynamics in all parameter regimes, including in the infinite ergodic regime.
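The abstract does not spell out how Euler-Maclaurin summation is applied. As a generic illustration of the idea, the sketch below sums a slowly converging series with algebraically decaying terms, here the assumed toy series of k^(-s) over k >= 1. It adds a few terms directly and replaces the remaining tail by an integral plus the first Euler-Maclaurin correction terms. The function name `slow_series_sum` and the choice of series are hypothetical and are not taken from the paper.

```python
import math


def slow_series_sum(s, n_direct):
    """Approximate sum_{k>=1} k**(-s) for s > 1 (toy example).

    Terms k < n_direct are added directly. The tail from k = n_direct is
    replaced by the Euler-Maclaurin expansion
        integral + f(a)/2 - f'(a)/12 + f'''(a)/720,   a = n_direct,
    with f(x) = x**(-s).
    """
    head = sum(k ** (-s) for k in range(1, n_direct))
    a = float(n_direct)
    integral = a ** (1.0 - s) / (s - 1.0)          # integral of f from a to infinity
    half_term = 0.5 * a ** (-s)                    # f(a) / 2
    d1 = -s * a ** (-s - 1.0)                      # f'(a)
    d3 = -s * (s + 1.0) * (s + 2.0) * a ** (-s - 3.0)  # f'''(a)
    tail = integral + half_term - d1 / 12.0 + d3 / 720.0
    return head + tail


if __name__ == "__main__":
    approx = slow_series_sum(2.0, 10)
    print(abs(approx - math.pi ** 2 / 6))
```

For s = 2 the exact value is pi^2/6, so the printed number is the error obtained with only nine directly summed terms. Plain truncation after the same nine terms would leave an error of about 0.1.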
This paper studies the efficiency problem for vision transformers by identifying redundant computation in given networks. The recent transformer architecture has demonstrated its effectiveness for achieving excellent performance on a series of computer vision tasks. However, as with convolutional neural networks, the huge computational cost of vision transformers is still a severe issue. Considering that the attention mechanism aggregates different patches layer by layer, we present a novel patch slimming approach that discards useless patches in a top-down paradigm. We first identify the effective patches in the last layer and then use them to guide the patch selection process of previous layers. For each layer, the impact of a patch on the final output feature is approximated, and patches with less impact are removed. Experimental results on benchmark datasets demonstrate that the proposed method can significantly reduce the computational cost of vision transformers without affecting their performance. For example, the FLOPs of the ViT-Ti model can be reduced by over 45% with only a 0.2% drop in top-1 accuracy on the ImageNet dataset.
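The top-down selection described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the paper's method: the importance of a patch is taken to be the attention mass it receives from the patches retained in the following layer, which stands in for the paper's approximation of a patch's impact on the final output feature. The function name `top_down_patch_selection`, its arguments, and the per-layer `budget` are hypothetical.

```python
def top_down_patch_selection(attn, last_keep, budget):
    """Choose the patches to keep at each layer, working from the last layer back.

    attn      : list of L attention matrices (lists of rows); attn[l][i][j] is
                the weight that output patch i of layer l places on input patch j.
    last_keep : indices of the patches needed at the output of the last layer.
    budget    : list of L ints, the target number of input patches per layer.

    Assumed scoring rule (not from the paper): a patch's score is the total
    attention it receives from patches that are still needed downstream.
    Patches needed downstream are always kept, so the kept sets are nested
    and a layer may exceed its budget.
    """
    num_layers = len(attn)
    keep = [None] * num_layers
    needed = set(last_keep)
    for l in range(num_layers - 1, -1, -1):
        n = len(attn[l])
        scores = [sum(attn[l][i][j] for i in needed) for j in range(n)]
        chosen = set(needed)
        # Fill the remaining budget with the highest-scoring patches.
        for j in sorted(range(n), key=lambda j: scores[j], reverse=True):
            if len(chosen) >= budget[l]:
                break
            chosen.add(j)
        keep[l] = chosen
        needed = chosen  # these inputs are the outputs required of layer l - 1
    return keep
```

Because each layer's kept set contains the set kept by the layer after it, a patch discarded early is never needed again later, which matches the top-down guidance the abstract describes.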