DD-$\alpha$AMG on QPACE 3

Added by Tilo Wettig

Publication date 2017

Field Informatics Engineering

Language English

Authors Peter Georg - Daniel Richtmann - Tilo Wettig

Abstract

We describe our experience porting the Regensburg implementation of the DD-$\alpha$AMG solver from QPACE 2 to QPACE 3. We first review how the code was ported from the first-generation Intel Xeon Phi processor (Knights Corner) to its successor (Knights Landing). We then describe the modifications in the communication library necessitated by the switch from InfiniBand to Omni-Path. Finally, we present the performance of the code on a single processor as well as the scaling on many nodes, where in both cases the speedup factor is close to the theoretical expectations.

Lattice QCD Applications on QPACE

Y. Nakamura, A. Nobile, D. Pleiter 2011

QPACE is a novel massively parallel architecture optimized for lattice QCD simulations. A single QPACE node is based on the IBM PowerXCell 8i processor. The nodes are interconnected by a custom 3-dimensional torus network implemented on an FPGA. The compute power of the processor is provided by 8 Synergistic Processing Units. Making efficient use of these accelerator cores in scientific applications is challenging. In this paper we describe our strategies for porting applications to the QPACE architecture and report on performance numbers.

High Energy Physics - Lattice

Solving the Dirac equation on QPACE

Andrea Nobile 2011

We discuss the implementation and optimization challenges for a Wilson-Dirac solver with Clover term on QPACE, a parallel machine based on Cell processors and a torus network. We choose the mixed-precision Schwarz preconditioned FGCR algorithm in order to circumvent network bandwidth and latency constraints, to make efficient use of the multicore parallelism and on-chip memory, and to achieve flexibility in the choice of lattice sizes. We present benchmarks on up to 256 QPACE nodes showing an aggregate sustained performance of about 10 TFlops for the complete solver and very good scaling.

High Energy Physics - Lattice

QPACE -- a QCD parallel computer based on Cell processors

H. Baier, H. Boettiger, M. Drochner 2009

QPACE is a novel parallel computer which has been developed to be primarily used for lattice QCD simulations. The compute power is provided by the IBM PowerXCell 8i processor, an enhanced version of the Cell processor that is used in the PlayStation 3. The QPACE nodes are interconnected by a custom, application-optimized 3-dimensional torus network implemented on an FPGA. To achieve the very high packaging density of 26 TFlops per rack, a new water cooling concept has been developed and successfully realized. In this paper we give an overview of the architecture and highlight some important technical details of the system. Furthermore, we provide initial performance results and report on the installation of 8 QPACE racks providing an aggregate peak performance of 200 TFlops.

High Energy Physics - Lattice; Hardware Architecture

Status of the QPACE Project

H. Baier, H. Boettiger, M. Drochner 2008

We give an overview of the QPACE project, which is pursuing the development of a massively parallel, scalable supercomputer for LQCD. The machine is a three-dimensional torus of identical processing nodes, based on the PowerXCell 8i processor. The nodes are connected by an FPGA-based, application-optimized network processor attached to the PowerXCell 8i processor. We present a performance analysis of lattice QCD codes on QPACE and corresponding hardware benchmarks.

High Energy Physics - Lattice

Gauge Field Generation on Large-Scale GPU-Enabled Systems

Frank Winter 2012

Over the past years, GPUs have been successfully applied to the task of inverting the fermion matrix in lattice QCD calculations. Even strong scaling to capability-level supercomputers, corresponding to O(100) GPUs or more, has been achieved. However, strong scaling a whole gauge field generation algorithm to this regime requires significantly more functionality than just having the matrix inverter utilize the GPUs, and has not yet been accomplished. This contribution extends QDP-JIT, the migration of SciDAC QDP++ to GPU-enabled parallel systems, to help strong scale the whole Hybrid Monte Carlo algorithm to this regime. Initial results are shown for gauge field generation with Chroma simulating pure Wilson fermions on OLCF TitanDev.

High Energy Physics - Lattice; Distributed, Parallel, and Cluster Computing; Computational Physics
