Bellamy: Reusing Performance Models for Distributed Dataflow Jobs Across Contexts

Published by Dominik Scheinert in 2021 in Informatics Engineering. Language: English.

Abstract in English

Distributed dataflow systems enable the use of clusters for scalable data analytics. However, selecting appropriate cluster resources for a processing job is often not straightforward. Performance models trained on historical executions of a concrete job are helpful in such situations, yet they are usually bound to a specific job execution context (e.g. node type, softwar
