
Adaptive algebraic multigrid on SIMD architectures

Published by Tilo Wettig
Publication date 2015
Research field Physics
Paper language English





We present details of our implementation of the Wuppertal adaptive algebraic multigrid code DD-$\alpha$AMG on SIMD architectures, with particular emphasis on the Intel Xeon Phi processor (KNC) used in QPACE 2. As a smoother, the algorithm uses a domain-decomposition-based solver code previously developed for the KNC in Regensburg. We optimized the remaining parts of the multigrid code and conclude that it is a very good target for SIMD architectures. Some of the remaining bottlenecks can be eliminated by vectorizing over multiple test vectors in the setup, which is discussed in the contribution of Daniel Richtmann.




Read also

In this paper, we present an efficient adaptive multigrid strategy for large-scale molecular mechanics optimization. The one-way multigrid method is used with inexact approximations, such as the quasi-atomistic (QA) approximation or the blended ghost force correction (BGFC) approximation on each coarse level, combined with adaptive mesh refinements based on the gradient-based a posteriori error estimator. For crystalline defects, such as vacancies, micro-cracks, and dislocations, sublinear complexity is observed numerically when the adaptive BGFC method is employed. For systems with more than ten million atoms, this strategy achieves a fivefold acceleration in terms of CPU time.
Convolution layers are prevalent in many classes of deep neural networks, including Convolutional Neural Networks (CNNs) which provide state-of-the-art results for tasks like image recognition, neural machine translation and speech recognition. The computationally expensive nature of a convolution operation has led to the proliferation of implementations including matrix-matrix multiplication formulation, and direct convolution primarily targeting GPUs. In this paper, we introduce direct convolution kernels for x86 architectures, in particular for Xeon and XeonPhi systems, which are implemented via a dynamic compilation approach. Our JIT-based implementation shows close to theoretical peak performance, depending on the setting and the CPU architecture at hand. We additionally demonstrate how these JIT-optimized kernels can be integrated into a lightweight multi-node graph execution model. This illustrates that single- and multi-node runs yield high efficiencies and high image-throughputs when executing state-of-the-art image recognition tasks on CPUs.
Efficient numerical solvers for sparse linear systems are crucial in science and engineering. One of the fastest methods for solving large-scale sparse linear systems is algebraic multigrid (AMG). The main challenge in the construction of AMG algorithms is the selection of the prolongation operator -- a problem-dependent sparse matrix which governs the multiscale hierarchy of the solver and is critical to its efficiency. Over many years, numerous methods have been developed for this task, and yet there is no known single right answer except in very special cases. Here we propose a framework for learning AMG prolongation operators for linear systems with sparse symmetric positive (semi-) definite matrices. We train a single graph neural network to learn a mapping from an entire class of such matrices to prolongation operators, using an efficient unsupervised loss function. Experiments on a broad class of problems demonstrate improved convergence rates compared to classical AMG, demonstrating the potential utility of neural networks for developing sparse system solvers.
In recent contributions, algebraic multigrid methods have been designed and studied from the viewpoint of spectral complementarity. In this note we focus on specific applications and, more precisely, on large linear systems arising from the approximation of the weighted Laplacian with various boundary conditions. We adapt the multigrid idea to this specific setting, and we present and critically discuss extensive numerical experiments showing the potential of the considered approach.
For embedded boundary electromagnetics using the Dey-Mittra algorithm, a special grad-div matrix constructed in this work allows use of multigrid methods for efficient inversion of Maxwell's curl-curl matrix. Efficient curl-curl