DD-$\alpha$AMG on QPACE 3

Published by: Tilo Wettig

Publication date: 2017

Research field: Informatics Engineering

Language: English

Authors: Peter Georg, Daniel Richtmann, Tilo Wettig

Subjects: High Energy Physics - Lattice; Distributed, Parallel, and Cluster Computing; Computational Physics

Abstract

We describe our experience porting the Regensburg implementation of the DD-$\alpha$AMG solver from QPACE 2 to QPACE 3. We first review how the code was ported from the first-generation Intel Xeon Phi processor (Knights Corner) to its successor (Knights Landing). We then describe the modifications in the communication library necessitated by the switch from InfiniBand to Omni-Path. Finally, we present the performance of the code on a single processor as well as the scaling on many nodes, where in both cases the speedup factor is close to the theoretical expectations.

Read also

Lattice QCD Applications on QPACE

Y. Nakamura, A. Nobile, D. Pleiter, 2011

QPACE is a novel massively parallel architecture optimized for lattice QCD simulations. A single QPACE node is based on the IBM PowerXCell 8i processor. The nodes are interconnected by a custom 3-dimensional torus network implemented on an FPGA. The compute power of the processor is provided by 8 Synergistic Processing Units. Making efficient use of these accelerator cores in scientific applications is challenging. In this paper we describe our strategies for porting applications to the QPACE architecture and report on performance numbers.

Subjects: High Energy Physics - Lattice

Solving the Dirac equation on QPACE

Andrea Nobile, 2011

We discuss the implementation and optimization challenges for a Wilson-Dirac solver with Clover term on QPACE, a parallel machine based on Cell processors and a torus network. We choose the mixed-precision Schwarz preconditioned FGCR algorithm in order to circumvent network bandwidth and latency constraints, to make efficient use of the multicore parallelism and on-chip memory, and to achieve flexibility in the choice of lattice sizes. We present benchmarks on up to 256 QPACE nodes showing an aggregate sustained performance of about 10 TFlops for the complete solver and very good scaling.

Subjects: High Energy Physics - Lattice

QPACE -- a QCD parallel computer based on Cell processors

H. Baier, H. Boettiger, M. Drochner, 2009

QPACE is a novel parallel computer which has been developed to be primarily used for lattice QCD simulations. The compute power is provided by the IBM PowerXCell 8i processor, an enhanced version of the Cell processor that is used in the PlayStation 3. The QPACE nodes are interconnected by a custom, application-optimized 3-dimensional torus network implemented on an FPGA. To achieve the very high packaging density of 26 TFlops per rack, a new water cooling concept has been developed and successfully realized. In this paper we give an overview of the architecture and highlight some important technical details of the system. Furthermore, we provide initial performance results and report on the installation of 8 QPACE racks providing an aggregate peak performance of 200 TFlops.

Subjects: High Energy Physics - Lattice; Hardware Architecture

Status of the QPACE Project

H. Baier, H. Boettiger, M. Drochner, 2008

We give an overview of the QPACE project, which is pursuing the development of a massively parallel, scalable supercomputer for LQCD. The machine is a three-dimensional torus of identical processing nodes, based on the PowerXCell 8i processor. The nodes are connected by an FPGA-based, application-optimized network processor attached to the PowerXCell 8i processor. We present a performance analysis of lattice QCD codes on QPACE and corresponding hardware benchmarks.

Subjects: High Energy Physics - Lattice

Gauge Field Generation on Large-Scale GPU-Enabled Systems

Frank Winter, 2012

Over the past years GPUs have been successfully applied to the task of inverting the fermion matrix in lattice QCD calculations. Even strong scaling to capability-level supercomputers, corresponding to O(100) GPUs or more, has been achieved. However, strong scaling a whole gauge field generation algorithm to this regime requires significantly more functionality than just having the matrix inverter utilizing the GPUs and has not yet been accomplished. This contribution extends QDP-JIT, the migration of SciDAC QDP++ to GPU-enabled parallel systems, to help to strong scale the whole Hybrid Monte-Carlo to this regime. Initial results are shown for gauge field generation with Chroma simulating pure Wilson fermions on OLCF TitanDev.

Subjects: High Energy Physics - Lattice; Distributed, Parallel, and Cluster Computing; Computational Physics
