Do you want to publish a course? Click here

Using Genetic Algorithms to Benchmark the Cloud

64   0   0.0 ( 0 )
 Added by Sekou Remy
 Publication date 2015
and research's language is English




Ask ChatGPT about the research

This paper presents a novel application of Genetic Algorithms(GAs) to quantify the performance of Platform as a Service (PaaS), a cloud service model that plays a critical role in both industry and academia. While Cloud benchmarks are not new, in this novel concept, the authors use a GA to take advantage of the elasticity in Cloud services in a graceful manner that was not previously possible. Using Google App Engine, Heroku, and Python Anywhere with three distinct classes of client computers running our GA codebase, we quantified the completion time for application of the GA to search for the parameters of controllers for dynamical systems. Our results show statistically significant differences in PaaS performance by vendor, and also that the performance of the PaaS performance is dependent upon the client that uses it. Results also show the effectiveness of our GA in determining the level of service of PaaS providers, and for determining if the level of service of one PaaS vendor is repeatable with another. Such a concept could then increase the appeal of PaaS Cloud services by making them more financially appealing.



rate research

Read More

Distributed digital infrastructures for computation and analytics are now evolving towards an interconnected ecosystem allowing complex applications to be executed from IoT Edge devices to the HPC Cloud (aka the Computing Continuum, the Digital Continuum, or the Transcontinuum). Understanding end-to-end performance in such a complex continuum is challenging. This breaks down to reconciling many, typically contradicting application requirements and constraints with low-level infrastructure design choices. One important challenge is to accurately reproduce relevant behaviors of a given application workflow and representative settings of the physical infrastructure underlying this complex continuum. We introduce a rigorous methodology for such a process and validate it through E2Clab. It is the first platform to support the complete experimental cycle across the Computing Continuum: deployment, analysis, optimization. Preliminary results with real-life use cases show that E2Clab allows one to understand and improve performance, by correlating it to the parameter settings, the resource usage and the specifics of the underlying infrastructure.
Several fundamental changes in technology indicate domain-specific hardware and software co-design is the only path left. In this context, architecture, system, data management, and machine learning communities pay greater attention to innovative big data and AI algorithms, architecture, and systems. Unfortunately, complexity, diversity, frequently-changed workloads, and rapid evolution of big data and AI systems raise great challenges. First, the traditional benchmarking methodology that creates a new benchmark or proxy for every possible workload is not scalable, or even impossible for Big Data and AI benchmarking. Second, it is prohibitively expensive to tailor the architecture to characteristics of one or more application or even a domain of applications. We consider each big data and AI workload as a pipeline of one or more classes of units of computation performed on different initial or intermediate data inputs, each class of which we call a data motif. On the basis of our previous work that identifies eight data motifs taking up most of the run time of a wide variety of big data and AI workloads, we propose a scalable benchmarking methodology that uses the combination of one or more data motifs---to represent diversity of big data and AI workloads. Following this methodology, we present a unified big data and AI benchmark suite---BigDataBench 4.0, publicly available from~url{http://prof.ict.ac.cn/BigDataBench}. This unified benchmark suite sheds new light on domain-specific hardware and software co-design: tailoring the system and architecture to characteristics of the unified eight data motifs other than one or more application case by case. Also, for the first time, we comprehensively characterize the CPU pipeline efficiency using the benchmarks of seven workload types in BigDataBench 4.0.
Combinatorial algorithms such as those that arise in graph analysis, modeling of discrete systems, bioinformatics, and chemistry, are often hard to parallelize. The Combinatorial BLAS library implements key computational primitives for rapid development of combinatorial algorithms in distributed-memory systems. During the decade since its first introduction, the Combinatorial BLAS library has evolved and expanded significantly. This paper details many of the key technical features of Combinatorial BLAS version 2.0, such as communication avoidance, hierarchical parallelism via in-node multithreading, accelerator support via GPU kernels, generalized semiring support, implementations of key data structures and functions, and scalable distributed I/O operations for human-readable files. Our paper also presents several rules of thumb for choosing the right data structures and functions in Combinatorial BLAS 2.0, under various common application scenarios.
A memory leak in an application deployed on the cloud can affect the availability and reliability of the application. Therefore, identifying and ultimately resolve it quickly is highly important. However, in the production environment running on the cloud, memory leak detection is a challenge without the knowledge of the application or its internal object allocation details. This paper addresses this challenge of detection of memory leaks in cloud-based infrastructure without having any internal knowledge by introducing two novel machine learning-based algorithms: Linear Backward Regression (LBR) and Precog and, their two variants: Linear Backward Regression with Change Points Detection (LBRCPD) and Precog with Maximum Filteration (PrecogMF). These algorithms only use one metric i.e the systems memory utilization on which the application is deployed for detection of a memory leak. The developed algorithms accuracy was tested on 60 virtual machines manually labeled memory utilization data and it was found that the proposed PrecogMF algorithm achieves the highest accuracy score of 85%. The same algorithm also achieves this by decreasing the overall compute time by 80% when compared to LBRs compute time. The paper also presents the different memory leak patterns found in the various memory leak applications and are further classified into different classes based on their visual representation.
Cloud GPU servers have become the de facto way for deep learning practitioners to train complex models on large-scale datasets. However, it is challenging to determine the appropriate cluster configuration---e.g., server type and number---for different training workloads while balancing the trade-offs in training time, cost, and model accuracy. Adding to the complexity is the potential to reduce the monetary cost by using cheaper, but revocable, transient GPU servers. In this work, we analyze distributed training performance under diverse cluster configurations using CM-DARE, a cloud-based measurement and training framework. Our empirical datasets include measurements from three GPU types, six geographic regions, twenty convolutional neural networks, and thousands of Google Cloud servers. We also demonstrate the feasibility of predicting training speed and overhead using regression-based models. Finally, we discuss potential use cases of our performance modeling such as detecting and mitigating performance bottlenecks.
comments
Fetching comments Fetching comments
Sign in to be able to follow your search criteria
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا