Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Adaptive parallelism with RMI: Idle high-performance computing resources can be completely avoided

507 0 0.0 ( 0 )

Download Cite

Added by Bernd Hartke

Publication date 2018

fields Informatics Engineering Physics

and research's language is English

Authors Florian Spenke - Karsten Balzer - Sascha Frick

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In practice, standard scheduling of parallel computing jobs almost always leaves significant portions of the available hardware unused, even with many jobs still waiting in the queue. The simple reason is that the resource requests of these waiting jobs are fixed and do not match the available, unused resources. However, with alternative but existing and well-established techniques it is possible to achieve a fully automated, adaptive parallelism that does not need pre-set, fixed resources. Here, we demonstrate that such an adaptively parallel program can indeed fill in all such scheduling gaps, even in real-life situations on large supercomputers.

rate research

Harvesting Idle Resources in Serverless Computing via Reinforcement Learning

430 - Hanfei Yu , Hao Wang , Jian Li 2021

Serverless computing has become a new cloud computing paradigm that promises to deliver high cost-efficiency and simplified cloud deployment with automated resource scaling at a fine granularity. Users decouple a cloud application into chained functions and preset each serverless functions memory and CPU demands at megabyte-level and core-level, respectively. Serverless platforms then automatically scale the number of functions to accommodate the workloads. However, the complexities of chained functions make it non-trivial to accurately determine the resource demands of each function for users, leading to either resource over-provision or under-provision for individual functions. This paper presents FaaSRM, a new resource manager (RM) for serverless platforms that maximizes resource efficiency by dynamically harvesting idle resources from functions over-supplied to functions under-supplied. FaaSRM monitors each functions resource utilization in real-time, detects over-provisioning and under-provisioning, and applies deep reinforcement learning to harvest idle resources safely using a safeguard mechanism and accelerate functions efficiently. We have implemented and deployed a FaaSRM prototype in a 13-node Apache OpenWhisk cluster. Experimental results on the OpenWhisk cluster show that FaaSRM reduces the execution time of 98% of function invocations by 35.81% compared to the baseline RMs by harvesting idle resources from 38.8% of the invocations and accelerating 39.2% of the invocations.

Distributed Parallel and Cluster Computing Machine Learning

Supporting High-Performance and High-Throughput Computing for Experimental Science

93 - E. A. Huerta , Roland Haas , Shantenu Jha 2018

The advent of experimental science facilities-instruments and observatories, such as the Large Hadron Collider, the Laser Interferometer Gravitational Wave Observatory, and the upcoming Large Synoptic Survey Telescope-has brought about challenging, large-scale computational and data processing requirements. Traditionally, the computing infrastructure to support these facilitys requirements were organized into separate infrastructure that supported their high-throughput needs and those that supported their high-performance computing needs. We argue that to enable and accelerate scientific discovery at the scale and sophistication that is now needed, this separation between high-performance computing and high-throughput computing must be bridged and an integrated, unified infrastructure provided. In this paper, we discuss several case studies where such infrastructure has been implemented. These case studies span different science domains, software systems, and application requirements as well as levels of sustainability. A further aim of this paper is to provide a basis to determine the common characteristics and requirements of such infrastructure, as well as to begin a discussion of how best to support the computing requirements of existing and future experimental science facilities.

Distributed Parallel and Cluster Computing High Energy Astrophysical Phenomena General Relativity and Quantum Cosmology

The ANTAREX Domain Specific Language for High Performance Computing

75 - Cristina Silvano , Giovanni Agosta , Andrea Bartolini 2019

The ANTAREX project relies on a Domain Specific Language (DSL) based on Aspect Oriented Programming (AOP) concepts to allow applications to enforce extra functional properties such as energy-efficiency and performance and to optimize Quality of Service (QoS) in an adaptive way. The DSL approach allows the definition of energy-efficiency, performance, and adaptivity strategies as well as their enforcement at runtime through application autotuning and resource and power management. In this paper, we present an overview of the key outcome of the project, the ANTAREX DSL, and some of its capabilities through a number of examples, including how the DSL is applied in the context of the project use cases.

Distributed Parallel and Cluster Computing

A high-performance interactive computing framework for engineering applications

66 - Jovana Knev{z}evic n Technische Universitat Munchen 2018

To harness the potential of advanced computing technologies, efficient (real time) analysis of large amounts of data is as essential as are front-line simulations. In order to optimise this process, experts need to be supported by appropriate tools that allow to interactively guide both the computation and data exploration of the underlying simulation code. The main challenge is to seamlessly feed the user requirements back into the simulation. State-of-the-art attempts to achieve this, have resulted in the insertion of so-called check- and break-points at fixed places in the code. Depending on the size of the problem, this can still compromise the benefits of such an attempt, thus, preventing the experience of real interactive computing. To leverage the concept for a broader scope of applications, it is essential that a user receives an immediate response from the simulation to his or her changes. Our generic integration framework, targeted to the needs of the computational engineering domain, supports distributed computations as well as on-the-fly visualisation in order to reduce latency and enable a high degree of interactivity with only minor code modifications. Namely, the regular course of the simulation coupled to our framework is interrupted in small, cyclic intervals followed by a check for updates. When new data is received, the simulation restarts automatically with the updated settings (boundary conditions, simulation parameters, etc.). To obtain rapid, albeit approximate feedback from the simulation in case of perpetual user interaction, a multi-hierarchical approach is advantageous. Within several different engineering test cases, we will demonstrate the flexibility and the effectiveness of our approach.

Distributed Parallel and Cluster Computing

RFaaS: RDMA-Enabled FaaS Platform for Serverless High-Performance Computing

78 - Marcin Copik , Konstantin Taranov , Alexandru Calotoiu 2021

The rigid MPI programming model and batch scheduling dominate high-performance computing. While clouds brought new levels of elasticity into the world of computing, supercomputers still suffer from low resource utilization rates. To enhance supercomputing clusters with the benefits of serverless computing, a modern cloud programming paradigm for pay-as-you-go execution of stateless functions, we present rFaaS, the first RDMA-aware Function-as-a-Service (FaaS) platform. With hot invocations and decentralized function placement, we overcome the major performance limitations of FaaS systems and provide low-latency remote invocations in multi-tenant environments. We evaluate the new serverless system through a series of microbenchmarks and show that remote functions execute with negligible performance overheads. We demonstrate how serverless computing can bring elastic resource management into MPI-based high-performance applications. Overall, our results show that MPI applications can benefit from modern cloud programming paradigms to guarantee high performance at lower resource costs.

Distributed Parallel and Cluster Computing

comments

Fetching comments

Syrian International University for Science and Technology

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Adaptive parallelism with RMI: Idle high-performance computing resources can be completely avoided

Ask ChatGPT about the research

No Arabic abstract

Read More