LFRic: Meeting the challenges of scalability and performance portability in Weather and Climate models


Published by Chris Maynard

Publication date: 2018

Research field: Informatics Engineering

Paper language: English

Authors: S.V. Adams, R.W. Ford, M. Hambley

Distributed, Parallel, and Cluster Computing


This paper describes LFRic: the new weather and climate modelling system being developed by the UK Met Office to replace the existing Unified Model in preparation for exascale computing in the 2020s. LFRic uses the GungHo dynamical core and runs on a semi-structured cubed-sphere mesh. The design of the supporting infrastructure follows object-oriented principles to facilitate modularity and the use of external libraries where possible. In particular, a 'separation of concerns' between the science code and parallel code is imposed to promote performance portability. An application called PSyclone, developed at the STFC Hartree Centre, can generate the parallel code, enabling deployment of a single-source science code onto different machine architectures. This paper provides an overview of the scientific requirement, the design of the software infrastructure, and examples of PSyclone usage. Preliminary performance results show strong scaling and an indication that hybrid MPI/OpenMP performs better than pure MPI.
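
The separation of concerns described in the abstract can be illustrated with a minimal Python sketch. It is not PSyclone's API and not LFRic code: the names `kernel`, `psy_serial`, and `psy_threaded` are hypothetical, and the arithmetic inside `kernel` is an arbitrary placeholder. The point is only that the science routine is written once, with no knowledge of parallelism, while interchangeable wrapper layers decide how it is applied across the field.

```python
from concurrent.futures import ThreadPoolExecutor


def kernel(cell_value: float) -> float:
    """Hypothetical 'science code': updates one cell, unaware of parallelism."""
    return 0.5 * cell_value + 1.0  # placeholder arithmetic, not a real scheme


def psy_serial(field):
    """Hypothetical parallel-layer variant 1: a plain serial loop over cells."""
    return [kernel(v) for v in field]


def psy_threaded(field, workers=4):
    """Hypothetical parallel-layer variant 2: the same kernel on a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order
        return list(pool.map(kernel, field))


field = [float(i) for i in range(8)]
serial = psy_serial(field)
threaded = psy_threaded(field)
```

Both wrappers call the same unmodified `kernel`, so switching the execution strategy does not touch the science code. In this sketch the wrappers are written by hand; the abstract describes PSyclone generating the parallel code instead.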


94 - E. Calore, A. Gabbana, J. Kraus 2017

An increasingly large number of HPC systems rely on heterogeneous architectures combining traditional multi-core CPUs with power-efficient accelerators. Designing efficient applications for these systems has been troublesome in the past, as accelerators could usually be programmed only with specific programming languages, threatening maintainability, portability and correctness. Several new programming environments try to tackle this problem. Among them, OpenACC offers a high-level approach based on compiler directive clauses to mark regions of existing C, C++ or Fortran codes to run on accelerators. This approach directly addresses code portability, leaving to compilers the support of each different accelerator, but one has to carefully assess the relative costs of portable approaches versus computing efficiency. In this paper we address precisely this issue, using as a test-bench a massively parallel Lattice Boltzmann algorithm. We first describe our multi-node implementation and optimization of the algorithm, using OpenACC and MPI. We then benchmark the code on a variety of processors, including traditional CPUs and GPUs, and make accurate performance comparisons with other GPU implementations of the same algorithm using CUDA and OpenCL. We also assess the performance impact associated with portable programming, and the actual portability and performance-portability of OpenACC-based applications across several state-of-the-art architectures.

Distributed, Parallel, and Cluster Computing

An astronomical institute's perspective on meeting the challenges of the climate crisis

56 - Knud Jahnke, Christian Fendt, Morgan Fouesneau 2020

Analysing greenhouse gas emissions of an astronomical institute is a first step in reducing its environmental impact. Here, we break down the emissions of the Max Planck Institute for Astronomy in Heidelberg and propose measures for reductions.

Instrumentation and Methods for Astrophysics; Physics and Society

The Impact of Distance on Performance and Scalability of Distributed Database Systems in Hybrid Clouds

167 - Yaser Mansouri, M. Ali Babar 2020

The increasing need for managing big data has led to the emergence of advanced database management systems. There have been increased efforts aimed at evaluating the performance and scalability of NoSQL and relational databases hosted by either private or public cloud datacenters. However, there has been little work on evaluating the performance and scalability of these databases in hybrid clouds, where the distance between private and public cloud datacenters can be one of the key factors affecting their performance. Hence, in this paper, we present a detailed evaluation of throughput, scalability, and VM size vs. VM number for six modern databases in a hybrid cloud, consisting of a private cloud in Adelaide and Azure-based datacenters in the Sydney, Mumbai, and Virginia regions. First, the results show that as the distance between private and public clouds increases, the throughput of most databases decreases. Second, MongoDB obtains the best throughput, followed by MySQL Cluster, whilst Cassandra exhibits the most fluctuation in throughput. Third, vertical scalability improves the throughput of databases more than horizontal scalability. Fourth, exploiting bigger VMs rather than more VMs with fewer cores can increase throughput for Cassandra, Riak, and Redis.

Distributed, Parallel, and Cluster Computing

Achieving near native runtime performance and cross-platform performance portability for random number generation through SYCL interoperability

174 - Vincent R. Pascuzzi, Mehdi Goli 2021

High-performance computing (HPC) is a major driver accelerating scientific research and discovery, from quantum simulations to medical therapeutics. The growing number of new HPC systems coming online are being furnished with various hardware components, engineered by competing industry entities, each having their own architectures and platforms to be supported. While the increasing availability of these resources is in many cases pivotal to successful science, even the largest collaborations lack the computational expertise required for maximal exploitation of current hardware capabilities. The need to maintain multiple platform-specific codebases further complicates matters, potentially adding a constraint on the number of machines that can be utilized. Fortunately, numerous programming models are under development that aim to facilitate software solutions for heterogeneous computing. In this paper, we leverage the SYCL programming model to demonstrate cross-platform performance portability across heterogeneous resources. We detail our NVIDIA and AMD random number generator extensions to the oneMKL open-source interfaces library. Performance portability is measured relative to platform-specific baseline applications executed on four major hardware platforms using two different compilers supporting SYCL. The utility of our extensions is exemplified in a real-world setting via a high-energy physics simulation application. We show that the performance of implementations that capitalize on SYCL interoperability is on par with native implementations, attesting to the cross-platform performance portability of a SYCL-based approach to scientific codes.

Distributed, Parallel, and Cluster Computing; Mathematical Software; High Energy Physics - Experiment

A General Framework for Scalability and Performance Analysis of DHT Routing Systems

344 - Joseph S. Kong, Jesse S. A. Bridgewater, Vwani P. Roychowdhury 2006

In recent years, many DHT-based P2P systems have been proposed and analyzed, and certain deployments have reached a global scale with nearly one million nodes. One is thus faced with the question of which particular DHT system to choose, and whether some are inherently more robust and scalable. Toward developing such a comparative framework, we present the reachable component method (RCM) for analyzing the performance of different DHT routing systems subject to random failures. We apply RCM to five DHT systems and obtain analytical expressions that characterize their routability as a continuous function of system size and node failure probability. An important consequence is that in the large-network limit, the routability of certain DHT systems goes to zero for any non-zero probability of node failure. These DHT routing algorithms are therefore unscalable, while some others, including Kademlia, which powers the popular eDonkey P2P system, are found to be scalable.

Distributed, Parallel, and Cluster Computing
