Graphics Processing Unit (GPU) computing is becoming an alternative computing platform for numerical simulations. However, it is not clear which numerical scheme provides the highest computational efficiency for different types of problems. To this end, the numerical accuracy and computational work of several numerical methods are compared using a GPU computing implementation. The Correction Procedure via Reconstruction (CPR), Discontinuous Galerkin (DG), Nodal Discontinuous Galerkin (NDG), Spectral Difference (SD), and Finite Volume (FV) methods are investigated using various reconstruction orders. Both smooth and discontinuous cases are considered for two-dimensional simulations. For discontinuous problems, MUSCL schemes are employed with FV, while CPR, DG, NDG, and SD use slope limiting. The computation time to reach a set error criterion and the total time to complete a solution are compared across the methods. It is shown that while FV methods can produce solutions with low computational times, they produce larger errors than high-order methods for smooth problems at the same order of accuracy. For discontinuous problems, the methods show good agreement with one another in terms of solution profiles, and the total computational times of FV, CPR, and SD are comparable.
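To make the MUSCL step mentioned in the abstract concrete, the following is a minimal one-dimensional sketch of a limited linear reconstruction. It is an illustration under stated assumptions, not the implementation used in the work: the abstract does not name a limiter, so the minmod limiter is assumed here, the boundary is assumed periodic, and the function names `minmod` and `muscl_interface_states` are hypothetical. The study itself is two-dimensional and runs on a GPU.

```python
def minmod(a, b):
    """Return the smaller-magnitude argument if the signs agree, else 0.

    Assumption: minmod is chosen for illustration; the abstract does not
    say which limiter is used.
    """
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b


def muscl_interface_states(u):
    """Limited linear reconstruction of 1D cell averages u (periodic).

    Returns (left, right), where left[i] and right[i] are the reconstructed
    states on either side of the interface between cell i and cell i + 1.
    """
    n = len(u)
    # Limited slope per cell from the backward and forward differences.
    slopes = [minmod(u[i] - u[i - 1], u[(i + 1) % n] - u[i]) for i in range(n)]
    # Extrapolate each cell average half a cell width to the interface.
    left = [u[i] + 0.5 * slopes[i] for i in range(n)]
    right = [u[(i + 1) % n] - 0.5 * slopes[(i + 1) % n] for i in range(n)]
    return left, right
```

Near a jump the limiter sets the slope to zero, so the reconstruction falls back to first order and introduces no new extrema. In smooth monotone regions it keeps the linear slope. The two interface states would then be passed to a numerical flux function, which is not shown here.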
We implement exact triangle counting in graphs on the GPU using three different methodologies: subgraph matching to a triangle pattern; programmable graph analytics, with a set-intersection approach; and a matrix formulation based on sparse matrix-ma
Multisplit is a broadly useful parallel primitive that permutes its input data into contiguous buckets or bins, where the function that categorizes an element into a bucket is provided by the programmer. Due to the lack of an efficient multisplit on
We propose a Hermite spectral method for the spatially inhomogeneous Boltzmann equation. For the inverse-power-law model, we generalize an approximate quadratic collision operator defined in the normalized and dimensionless setting to an operator for
We review several parallel tempering schemes and examine their main ingredients for accuracy and efficiency. The present study covers two selection methods of temperatures and several choices for the exchange of replicas, including a recent novel all
Due to the surge in the volume of data generated and rapid advancement in Artificial Intelligence (AI) techniques like machine learning and deep learning, the existing traditional computing models have become inadequate to process an enormous volume