Architecture and performance of Devito, a system for automated stencil computation


Published by Fabio Luporini

Publication date: 2018

Research field: Informatics Engineering

Paper language: English

Authors: Fabio Luporini - Michael Lange - Mathias Louboutin

Mathematical Software


Stencil computations are a key part of many high-performance computing applications, such as image processing, convolutional neural networks, and finite-difference solvers for partial differential equations. Devito is a framework capable of generating highly optimized code given symbolic equations expressed in Python, specialized in, but not limited to, affine (stencil) codes. The lowering process, from mathematical equations down to C++ code, is performed by the Devito compiler through a series of intermediate representations. Several performance optimizations are introduced, including advanced common sub-expression elimination, tiling, and parallelization. Some of these are obtained through well-established stencil optimizers, integrated in the back-end of the Devito compiler. The architecture of the Devito compiler, as well as the performance optimizations that are applied when generating code, are presented. The effectiveness of such performance optimizations is demonstrated using operators drawn from seismic imaging applications.
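To make the term "stencil" concrete, here is a minimal sketch in plain Python of a 3-point finite-difference update on a 1D grid. It is only an illustration of the computational pattern: the function name `diffuse_step` and the parameter `alpha` are hypothetical, and nothing here is Devito's API or its generated code.

```python
def diffuse_step(u, alpha):
    """Apply one explicit 3-point stencil update to the interior of u.

    The first and last entries are treated as fixed boundary values.
    `alpha` is an illustrative diffusion coefficient (an assumption).
    """
    new_u = list(u)
    for i in range(1, len(u) - 1):
        # Each output point depends only on a fixed neighborhood of inputs:
        # this fixed access pattern is what makes the loop a stencil.
        new_u[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new_u


u0 = [0.0, 0.0, 1.0, 0.0, 0.0]
u1 = diffuse_step(u0, alpha=0.25)
print(u1)  # [0.0, 0.25, 0.5, 0.25, 0.0]
```

A stencil compiler would typically start from the equation rather than from a hand-written loop like this one, and then apply optimizations such as tiling and parallelization to the generated loop nest.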


Michael Lange, Navjot Kukreja, Mathias Louboutin (2016)

Domain-specific languages (DSLs) have been used in a variety of fields to express complex scientific problems in a concise manner and provide automated performance optimization for a range of computational architectures. As such, DSLs provide a powerful mechanism to speed up scientific Python computation that goes beyond traditional vectorization and pre-compilation approaches, while allowing domain scientists to build applications within the comforts of the Python software ecosystem. In this paper we present Devito, a new finite difference DSL that provides optimized stencil computation from high-level problem specifications based on symbolic Python expressions. We demonstrate Devito's symbolic API and performance advantages over traditional Python acceleration methods before highlighting its use in the scientific context of seismic inversion problems.

Mathematical Software; Performance

An Efficient Vectorization Scheme for Stencil Computation

Kun Li, Liang Yuan, Yunquan Zhang (2021)

Stencil computation is one of the most important kernels in various scientific and engineering applications. A variety of work has focused on vectorization and tiling techniques, aiming at exploiting the in-core data parallelism and data locality respectively. In this paper, the downsides of existing vectorization schemes are analyzed. Briefly, they either incur data alignment conflicts or hurt the data locality when integrated with tiling. Then we propose a novel transpose layout to preserve the data locality for tiling and reduce the data reorganization overhead for vectorization simultaneously. To further improve the data reuse at the register level, a time loop unroll-and-jam strategy is designed to perform multistep stencil computation along the time dimension. Experimental results on the AVX-2 and AVX-512 CPUs show that our approach obtains a competitive performance.
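As a rough illustration of the tiling idea mentioned above, the following plain-Python sketch applies a 2D 5-point averaging stencil once with a straightforward loop nest and once with the interior split into square tiles. It shows generic spatial loop tiling only; it is not the transpose layout or the unroll-and-jam strategy proposed in the paper, and the function names and the `tile` parameter are hypothetical.

```python
def jacobi_untiled(grid):
    """One sweep of a 5-point averaging stencil over the interior points."""
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return out


def jacobi_tiled(grid, tile=4):
    """Same sweep, but visiting the interior tile by tile.

    In compiled code, tiling aims to keep a block's working set in cache;
    here it only demonstrates the reordered iteration space.
    """
    n, m = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for ii in range(1, n - 1, tile):          # loop over tile origins
        for jj in range(1, m - 1, tile):
            for i in range(ii, min(ii + tile, n - 1)):      # loop inside a tile
                for j in range(jj, min(jj + tile, m - 1)):
                    out[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                        + grid[i][j - 1] + grid[i][j + 1])
    return out


grid = [[float((3 * i + 5 * j) % 7) for j in range(10)] for i in range(9)]
same = jacobi_untiled(grid) == jacobi_tiled(grid, tile=4)
print(same)  # True
```

Because every output point reads only from the input grid, the tiles can be visited in any order without changing the result, which is why the two versions agree exactly.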

Distributed, Parallel, and Cluster Computing

Migration in the Stencil Pluralist Cloud Architecture

Tai Liu, Zain Tariq, Barath Raghavan (2021)

A debate in the research community has buzzed in the background for years: should large-scale Internet services be centralized or decentralized? Now-common centralized cloud and web services have downsides -- user lock-in and loss of privacy and data control -- that are increasingly apparent. However, their decentralized counterparts have struggled to gain adoption, suffer from their own problems of scalability and trust, and eventually may result in the exact same lock-in they intended to prevent. In this paper, we explore the design of a pluralist cloud architecture, Stencil, one that can serve as a narrow waist for user-facing services such as social media. We aim to enable pluralism via a unifying set of abstractions that support migration from one service to a competing service. We find that migrating linked data introduces many challenges in both source and destination services as links are severed. We show how Stencil enables correct and efficient data migration between services, how it supports the deployment of new services, and how Stencil could be incrementally deployed.

Networking and Internet Architecture; Computers and Society; Social and Information Networks

Performance of Devito on HPC-Optimised ARM Processors

Hermes Senger, Jaime F. de Souza, Edson S. Gomi (2019)

We evaluate the performance of Devito, a domain-specific language (DSL) for finite differences, on Arm ThunderX2 processors. Experiments with two common seismic computational kernels demonstrate that Arm processors can deliver competitive performance compared to Intel Xeon processors.

Performance

OpenSBLI: A framework for the automated derivation and parallel execution of finite difference solvers on a range of computer architectures

Christian T. Jacobs, Satya P. Jammy, Neil D. Sandham (2016)

Exascale computing will feature novel and potentially disruptive hardware architectures. Exploiting these to their full potential is non-trivial. Numerical modelling frameworks involving finite difference methods are currently limited by the static nature of the hand-coded discretisation schemes and may have to be rewritten repeatedly to run efficiently on new hardware. In contrast, OpenSBLI uses code generation to derive the model's code from a high-level specification. Users focus on the equations to solve, whilst not concerning themselves with the detailed implementation. Source-to-source translation is used to tailor the code and enable its execution on a variety of hardware.
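The general idea of deriving loop code from a high-level specification can be sketched in a few lines of plain Python. The toy generator below takes a 1D stencil given as a mapping from grid offsets to weights and emits a C loop as a string. The function `generate_c_loop`, its arguments, and the emitted variable names are all hypothetical; this is not OpenSBLI's or Devito's interface, only an assumption-laden illustration of the approach.

```python
def generate_c_loop(out_name, in_name, weights):
    """Emit C source for a 1D stencil described by {offset: weight}.

    The loop bounds are derived from the largest offset so that every
    array access stays inside [0, n).
    """
    radius = max(abs(off) for off in weights)
    terms = []
    for off in sorted(weights):
        # Render offset 0 as "i", -1 as "i-1", +1 as "i+1", and so on.
        index = "i" if off == 0 else f"i{off:+d}"
        terms.append(f"{weights[off]!r}*{in_name}[{index}]")
    body = " + ".join(terms)
    return (
        f"for (int i = {radius}; i < n - {radius}; i++) {{\n"
        f"  {out_name}[i] = {body};\n"
        f"}}"
    )


# Second-derivative stencil with weights (1, -2, 1), grid spacing omitted.
code = generate_c_loop("out", "u", {-1: 1.0, 0: -2.0, 1: 1.0})
print(code)
```

Changing the weights dictionary changes the generated loop without touching any hand-written C, which is the property that lets such frameworks retarget the same specification to different hardware back-ends.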

Mathematical Software; Symbolic Computation; Software Engineering
