ACTS in Need: Automatic Configuration Tuning with Scalability Guarantees

114 0 0.0 ( 0 )

Download Cite

Added by Yuqing Zhu

Publication date 2017

fields Informatics Engineering

and research's language is English

Authors Yuqing Zhu - Jianxun Liu - Mengying Guo

Distributed Parallel and Cluster Computing

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

To support the variety of Big Data use cases, many Big Data related systems expose a large number of user-specifiable configuration parameters. Highlighted in our experiments, a MySQL deployment with well-tuned configuration parameters achieves a peak throughput as 12 times much as one with the default setting. However, finding the best setting for the tens or hundreds of configuration parameters is mission impossible for ordinary users. Worse still, many Big Data applications require the support of multiple systems co-deployed in the same cluster. As these co-deployed systems can interact to affect the overall performance, they must be tuned together. Automatic configuration tuning with scalability guarantees (ACTS) is in need to help system users. Solutions to ACTS must scale to various systems, workloads, deployments, parameters and resource limits. Proposing and implementing an ACTS solution, we demonstrate that ACTS can benefit users not only in improving system performance and resource utilization, but also in saving costs and enabling fairer benchmarking.

rate research

Trevor: Automatic configuration and scaling of stream processing pipelines

272 - Manu Bansal , Eyal Cidon , Arjun Balasingam 2018

Operating a distributed data stream processing workload efficiently at scale is hard. The operator of the workload must parallelize and lay out tasks of the workload with resources that match the requirement of target data rate. The challenge is that neither the operator nor the programmer is typically aware of the scaling behavior of the workload as a function of resources. An operator manually searches for a safe operating point that can handle predicted peak load and deploys with ample headroom for absorbing unpredictable spikes. Such empirical, static over-provisioning is wasteful of both compute and human resources. We show that precise performance models can be automatically learned for distributed stream processing systems that can predict the execution performance of a job even before deployment. Further, those models can be used to optimally schedule logically specified jobs onto available physical hardware. Finally, those models and the derived execution schedules can be refined online to dynamically adapt to unpredictable changes in the runtime environment or auto-scale with variations in job load.

Distributed Parallel and Cluster Computing

BestConfig: Tapping the Performance Potential of Systems via Automatic Configuration Tuning

77 - Yuqing Zhu , Jianxun Liu , Mengying Guo 2017

An ever increasing number of configuration parameters are provided to system users. But many users have used one configuration setting across different workloads, leaving untapped the performance potential of systems. A good configuration setting can greatly improve the performance of a deployed system under certain workloads. But with tens or hundreds of parameters, it becomes a highly costly task to decide which configuration setting leads to the best performance. While such task requires the strong expertise in both the system and the application, users commonly lack such expertise. To help users tap the performance potential of systems, we present BestConfig, a system for automatically finding a best configuration setting within a resource limit for a deployed system under a given application workload. BestConfig is designed with an extensible architecture to automate the configuration tuning for general systems. To tune system configurations within a resource limit, we propose the divide-and-diverge sampling method and the recursive bound-and-search algorithm. BestConfig can improve the throughput of Tomcat by 75%, that of Cassandra by 63%, that of MySQL by 430%, and reduce the running time of Hive join job by about 50% and that of Spark join job by about 80%, solely by configuration adjustment.

Performance Databases Distributed Parallel and Cluster Computing

A Collaborative Filtering Approach for the Automatic Tuning of Compiler Optimisations

104 - Stefano Cereda , Gianluca Palermo , Paolo Cremonesi 2020

Selecting the right compiler optimisations has a severe impact on programs performance. Still, the available optimisations keep increasing, and their effect depends on the specific program, making the task human intractable. Researchers proposed several techniques to search in the space of compiler optimisations. Some approaches focus on finding better search algorithms, while others try to speed up the search by leveraging previously collected knowledge. The possibility to effectively reuse previous compilation results inspired us toward the investigation of techniques derived from the Recommender Systems field. The proposed approach exploits previously collected knowledge and improves its characterisation over time. Differently from current state-of-the-art solutions, our approach is not based on performance counters but relies on Reaction Matching, an algorithm able to characterise programs looking at how they react to different optimisation sets. The proposed approach has been validated using two widely used benchmark suites, cBench and PolyBench, including 54 different programs. Our solution, on average, extracted 90% of the available performance improvement 10 iterations before current state-of-the-art solutions, which corresponds to 40% fewer compilations and performance tests to perform.

Distributed Parallel and Cluster Computing Performance

Cross-Chain Payment Protocols with Success Guarantees

372 - Rob van Glabbeek , Vincent Gramoli , Pierre Tholoniat 2019

In this paper, we consider the problem of cross-chain payment whereby customers of different escrows -- implemented by a bank or a blockchain smart contract -- successfully transfer digital assets without trusting each other. Prior to this work, cross-chain payment problems did not require this success or any form of progress. We introduce a new specification formalism called Asynchronous Networks of Timed Automata (ANTA) to formalise such protocols. We present the first cross-chain payment protocol that ensures termination in a bounded amount of time and works correctly in the presence of clock skew. We then demonstrate that it is impossible to solve this problem without assuming synchrony, in the sense that each message is guaranteed to arrive within a known amount of time. We also offer a protocol that solves an eventually terminating variant of this cross-chain payment problem without synchrony, and even in the presence of Byzantine failures.

Distributed Parallel and Cluster Computing Logic in Computer Science

Service Placement with Provable Guarantees in Heterogeneous Edge Computing Systems

75 - Stephen Pasteris , Shiqiang Wang , Mark Herbster 2019

Mobile edge computing (MEC) is a promising technique for providing low-latency access to services at the network edge. The services are hosted at various types of edge nodes with both computation and communication capabilities. Due to the heterogeneity of edge node characteristics and user locations, the performance of MEC varies depending on where the service is hosted. In this paper, we consider such a heterogeneous MEC system, and focus on the problem of placing multiple services in the system to maximize the total reward. We show that the problem is NP-hard via reduction from the set cover problem, and propose a deterministic approximation algorithm to solve the problem, which has an approximation ratio that is not worse than $left(1-e^{-1}right)/4$. The proposed algorithm is based on two sub-routines that are suitable for small and arbitrarily sized services, respectively. The algorithm is designed using a novel way of partitioning each edge node into multiple slots, where each slot contains one service. The approximation guarantee is obtained via a specialization of the method of conditional expectations, which uses a randomized procedure as an intermediate step. In addition to theoretical guarantees, simulation results also show that the proposed algorithm outperforms other state-of-the-art approaches.

Distributed Parallel and Cluster Computing Optimization and Control