Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

TerraWatt: Sustaining Sustainable Computing of Containers in Containers

74 0 0.0 ( 0 )

Download Cite

Added by Barath Raghavan

Publication date 2021

fields Informatics Engineering

and research's language is English

Authors Jennifer Switzer - Rob McGuinness - Pat Pannuto

Distributed Parallel and Cluster Computing Networking and Internet Architecture

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Each day the world inches closer to a climate catastrophe and a sustainability revolution. To avoid the former and achieve the latter we must transform our use of energy. Surprisingly, todays growing problem is that there is too much wind and solar power generation at the wrong times and in the wrong places. We argue for the construction of TerraWatt: a geographically-distributed, large-scale, zero-carbon compute infrastructure using renewable energy and older hardware. Delivering zero-carbon compute for general cloud workloads is challenging due to spatiotemporal power variability. We describe the systems challenges in using intermittent renewable power at scale to fuel such an older, decentralized compute infrastructure.

rate research

Containers for portable, productive and performant scientific computing

81 - Jack S. Hale , Lizao Li , Chris N. Richardson 2016

Containers are an emerging technology that hold promise for improving productivity and code portability in scientific computing. We examine Linux container technology for the distribution of a non-trivial scientific computing software stack and its execution on a spectrum of platforms from laptop computers through to high performance computing (HPC) systems. We show on a workstation and a leadership-class HPC system that when deployed appropriately there are no performance penalties running scientific programs inside containers. For Python code run on large parallel computers, the run time is reduced inside a container due to faster library imports. The software distribution approach and data that we present will help developers and users decide on whether container technology is appropriate for them. We also provide guidance for the vendors of HPC systems that rely on proprietary libraries for performance on what they can do to make containers work seamlessly and without performance penalty.

Distributed Parallel and Cluster Computing Mathematical Software Software Engineering

Minimizing privilege for building HPC containers

66 - Reid Priedhorsky 2021

HPC centers face increasing demand for software flexibility, and there is growing consensus that Linux containers are a promising solution. However, existing container build solutions require root privileges and cannot be used directly on HPC resources. This limitation is compounded as supercomputer diversity expands and HPC architectures become more dissimilar from commodity computing resources. Our analysis suggests this problem can best be solved with low-privilege containers. We detail relevant Linux kernel features, propose a new taxonomy of container privilege, and compare two open-source implementations: mostly-unprivileged rootless Podman and fully-unprivileged Charliecloud. We demonstrate that low-privilege container build on HPC resources works now and will continue to improve, giving normal users a better workflow to securely and correctly build containers. Minimizing privilege in this way can improve HPC user and developer productivity as well as reduce support workload for exascale applications.

Distributed Parallel and Cluster Computing

Resource Management Schemes for Cloud-Native Platforms with Computing Containers of Docker and Kubernetes

93 - Ying Mao , Yuqi Fu , Suwen Gu 2020

Businesses have made increasing adoption and incorporation of cloud technology into internal processes in the last decade. The cloud-based deployment provides on-demand availability without active management. More recently, the concept of cloud-native application has been proposed and represents an invaluable step toward helping organizations develop software faster and update it more frequently to achieve dramatic business outcomes. Cloud-native is an approach to build and run applications that exploit the cloud computing delivery models advantages. It is more about how applications are created and deployed than where. The container-based virtualization technology, such as Docker and Kubernetes, serves as the foundation for cloud-native applications. This paper investigates the performance of two popular computational-intensive applications, big data, and deep learning, in a cloud-native environment. We analyze the system overhead and resource usage for these applications. Through extensive experiments, we show that the completion time reduces by up to 79.4% by changing the default setting and increases by up to 96.7% due to different resource management schemes on two platforms. Additionally, the resource release is delayed by up to 116.7% across different systems. Our work can guide developers, administrators, and researchers to better design and deploy their applications by selecting and configuring a hosting platform.

Distributed Parallel and Cluster Computing Performance

A Comparative Study of Containers and Virtual Machines in Big Data Environment

181 - Qi Zhang , Ling Liu , Calton Pu 2018

Container technique is gaining increasing attention in recent years and has become an alternative to traditional virtual machines. Some of the primary motivations for the enterprise to adopt the container technology include its convenience to encapsulate and deploy applications, lightweight operations, as well as efficiency and flexibility in resources sharing. However, there still lacks an in-depth and systematic comparison study on how big data applications, such as Spark jobs, perform between a container environment and a virtual machine environment. In this paper, by running various Spark applications with different configurations, we evaluate the two environments from many interesting aspects, such as how convenient the execution environment can be set up, what are makespans of different workloads running in each setup, how efficient the hardware resources, such as CPU and memory, are utilized, and how well each environment can scale. The results show that compared with virtual machines, containers provide a more easy-to-deploy and scalable environment for big data workloads. The research work in this paper can help practitioners and researchers to make more informed decisions on tuning their cloud environment and configuring the big data applications, so as to achieve better performance and higher resources utilization.

Distributed Parallel and Cluster Computing Performance

Differentiate Quality of Experience Scheduling for Deep Learning Applications with Docker Containers in the Cloud

479 - Ying Mao , Weifeng Yan , Yun Song 2020

With the prevalence of big-data-driven applications, such as face recognition on smartphones and tailored recommendations from Google Ads, we are on the road to a lifestyle with significantly more intelligence than ever before. For example, Aipoly Vision [1] is an object and color recognizer that helps the blind, visually impaired, and color blind understand their surroundings. At the back end side of their intelligence, various neural networks powered models are running to enable quick responses to users. Supporting those models requires lots of cloud-based computational resources, e.g. CPUs and GPUs. The cloud providers charge their clients by the amount of resources that they occupied. From clients perspective, they have to balance the budget and quality of experiences (e.g. response time). The budget leans on individual business owners and the required Quality of Experience (QoE) depends on usage scenarios of different applications, for instance, an autonomous vehicle requires realtime response, but, unlocking your smartphone can tolerate delays. However, cloud providers fail to offer a QoE based option to their clients. In this paper, we propose DQoES, a differentiate quality of experience scheduler for deep learning applications. DQoES accepts clients specification on targeted QoEs, and dynamically adjust resources to approach their targets. Through extensive, cloud-based experiments, DQoES demonstrates that it can schedule multiple concurrent jobs with respect to various QoEs and achieve up to 8x times more satisfied models compared to the existing system.

Distributed Parallel and Cluster Computing Performance

comments

Fetching comments

Helwan

Additional details More universities

TerraWatt: Sustaining Sustainable Computing of Containers in Containers

Ask ChatGPT about the research

No Arabic abstract

Read More