COFFEE: an Optimizing Compiler for Finite Element Local Assembly


Published by: Fabio Luporini

Publication date: 2014

Research field: Informatics Engineering

Paper language: English

Authors: Fabio Luporini - Ana Lucia Varbanescu - Florian Rathgeber

Mathematical Software; Computational Engineering, Finance, and Science; Performance


Abstract

The numerical solution of partial differential equations using the finite element method is one of the key applications of high performance computing. Local assembly is its characteristic operation. This entails the execution of a problem-specific kernel to numerically evaluate an integral for each element in the discretized problem domain. Since the domain size can be huge, executing efficient kernels is fundamental. Their optimization is, however, a challenging issue. Even though affine loop nests are generally present, the short trip counts and the complexity of mathematical expressions make it hard to determine a single or unique sequence of successful transformations. Therefore, we present the design and systematic evaluation of COFFEE, a domain-specific compiler for local assembly kernels. COFFEE manipulates abstract syntax trees generated from a high-level domain-specific language for PDEs by introducing domain-aware composable optimizations aimed at improving instruction-level parallelism, especially SIMD vectorization, and register locality. It then generates C code including vector intrinsics. Experiments using a range of finite-element forms of increasing complexity show that significant performance improvement is achieved.
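
To make the notion of a local assembly kernel concrete, the following is a minimal sketch in Python, not code from the paper or output of COFFEE. It assumes a mass-matrix integral on linear (P1) triangles with a 3-point quadrature rule; the function names and the choice of integral are illustrative assumptions. The second variant hoists a sub-expression that does not depend on the innermost loop index, a generic example of the kind of rewrite that can matter in loop nests with such short trip counts.

```python
# Illustrative sketch only: a hand-written local assembly kernel for the
# P1 mass matrix on a triangle. All names here are hypothetical.

# 3-point quadrature on the reference triangle (exact for quadratics).
QUAD_POINTS = [(1 / 6, 1 / 6), (2 / 3, 1 / 6), (1 / 6, 2 / 3)]
QUAD_WEIGHTS = [1 / 6, 1 / 6, 1 / 6]


def p1_basis(x, y):
    """Values of the three linear basis functions at reference point (x, y)."""
    return [1.0 - x - y, x, y]


def jacobian_det(coords):
    """Absolute determinant of the affine map from the reference triangle."""
    (x0, y0), (x1, y1), (x2, y2) = coords
    return abs((x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0))


def local_mass_naive(coords):
    """Evaluate A[j][k] = integral of phi_j * phi_k over one element."""
    det_j = jacobian_det(coords)
    A = [[0.0] * 3 for _ in range(3)]
    for (qx, qy), w in zip(QUAD_POINTS, QUAD_WEIGHTS):
        phi = p1_basis(qx, qy)
        for j in range(3):
            for k in range(3):
                A[j][k] += w * det_j * phi[j] * phi[k]
    return A


def local_mass_hoisted(coords):
    """Same integral, with the k-invariant factor computed once per j."""
    det_j = jacobian_det(coords)
    A = [[0.0] * 3 for _ in range(3)]
    for (qx, qy), w in zip(QUAD_POINTS, QUAD_WEIGHTS):
        phi = p1_basis(qx, qy)
        for j in range(3):
            t = w * det_j * phi[j]  # does not depend on k
            for k in range(3):
                A[j][k] += t * phi[k]
    return A


def assemble_local_matrices(cells):
    """Run the kernel once per element, as local assembly does over a mesh."""
    return [local_mass_hoisted(coords) for coords in cells]
```

Calling `assemble_local_matrices([[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]])` runs the kernel on a single element. On a real mesh the same kernel executes once per element, which is why small savings in the inner loops add up.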


61 - Fabio Luporini , David A. Ham , Paul H. J. Kelly 2016

We present an algorithm for the optimization of a class of finite element integration loop nests. This algorithm, which exploits fundamental mathematical properties of finite element operators, is proven to achieve a locally optimal operation count. In specified circumstances the optimum achieved is global. Extensive numerical experiments demonstrate significant performance improvements over the state of the art in finite element code generation in almost all cases. This validates the effectiveness of the algorithm presented here, and illustrates its limitations.

Mathematical Software

A local-global generalized multiscale finite element method for highly heterogeneous stochastic groundwater flow problems

118 - Yiran Wang , Eric Chung , Shubin Fu 2021

In this paper, we propose a local-global multiscale method for highly heterogeneous stochastic groundwater flow problems under the framework of reduced basis method and the generalized multiscale finite element method (GMsFEM). Due to incomplete characterization of the medium properties of the groundwater flow problems, random variables are used to parameterize the uncertainty. As a result, solving the problem repeatedly is required to obtain statistical quantities. Besides, the medium properties are usually highly heterogeneous, which will result in a large linear system that needs to be solved. Therefore, it is intrinsically inevitable to seek a computationally efficient model reduction method to overcome the difficulty. We will explore the combination of the reduced basis method and the GMsFEM. In particular, we will use residual-driven basis functions, which are key ingredients in GMsFEM. This local-global multiscale method is more efficient than applying the GMsFEM or reduced basis method individually. We first construct parameter-independent multiscale basis functions that include both local and global information of the permeability fields, and then use these basis functions to construct several global snapshots and global basis functions for fast online computation with different parameter inputs. We provide rigorous analysis of the proposed method and extensive numerical examples to demonstrate the accuracy and efficiency of the local-global multiscale method.

Numerical Analysis; Computational Engineering, Finance, and Science

Code generation for productive portable scalable finite element simulation in Firedrake

69 - Jack D. Betteridge , Patrick E. Farrell , David A. Ham 2021

Creating scalable, high performance PDE-based simulations requires a suitable combination of discretizations, differential operators, preconditioners and solvers. The required combination changes with the application and with the available hardware, yet software development time is a severely limited resource for most scientists and engineers. Here we demonstrate that generating simulation code from a high-level Python interface provides an effective mechanism for creating high performance simulations from very few lines of user code. We demonstrate that moving from one supercomputer to another can require significant algorithmic changes to achieve scalable performance, but that the code generation approach enables these algorithmic changes to be achieved with minimal development effort.

Mathematical Software

Finite Element Integration with Quadrature on the GPU

81 - Matthew G. Knepley , Karl Rupp , Andy R. Terrel 2016

We present a novel, quadrature-based finite element integration method for low-order elements on GPUs, using a pattern we call thread transposition to avoid reductions while vectorizing aggressively. On the NVIDIA GTX580, which has a nominal single precision peak flop rate of 1.5 TF/s and a memory bandwidth of 192 GB/s, we achieve close to 300 GF/s for element integration on first-order discretization of the Laplacian operator with variable coefficients in two dimensions, and over 400 GF/s in three dimensions. From our performance model we find that this corresponds to 90% of our measured achievable bandwidth peak of 310 GF/s. Further experimental results also match the predicted performance when used with double precision (120 GF/s in two dimensions, 150 GF/s in three dimensions). Results obtained for the linear elasticity equations (220 GF/s and 70 GF/s in two dimensions, 180 GF/s and 60 GF/s in three dimensions) also demonstrate the applicability of our method to vector-valued partial differential equations.

Mathematical Software

An Open-Source, Industrial-Strength Optimizing Compiler for Quantum Programs

58 - Robert S. Smith , Eric C. Peterson , Mark G. Skilbeck 2020

Quilc is an open-source, optimizing compiler for gate-based quantum programs written in Quil or QASM, two popular quantum programming languages. The compiler was designed with attention toward NISQ-era quantum computers, specifically recognizing that each quantum gate has a non-negligible and often irrecoverable cost toward a program's successful execution. Quilc's primary goal is to make authoring quantum software a simpler exercise by making architectural details less burdensome to the author. Using Quilc allows one to write programs faster while usually not compromising, and indeed sometimes improving, their execution fidelity on a given hardware architecture. In this paper, we describe many of the principles behind Quilc's design, and demonstrate the compiler with various examples.

Quantum Physics; Programming Languages
