Systems for processing big data, such as Hadoop, Spark, and massively parallel databases, need to run workloads on behalf of multiple tenants simultaneously. The abundant disk-based storage in these systems is usually complemented by a smaller, but much faster, cache. Cache is a precious resource: tenants who get to use it can see a two-orders-of-magnitude performance improvement. Cache is also a limited and hence shared resource: unlike a CPU core, which can be used by only one tenant at a time, a cached data item can be accessed by multiple tenants simultaneously. Cache therefore has to be shared across tenants by a multi-tenancy-aware policy, where each tenant has its own priorities and workload characteristics. In this paper, we develop cache allocation strategies that speed up the overall workload while being fair to each tenant. We build a novel fairness model targeted at the shared-resource setting that not only incorporates the standard concepts of Pareto efficiency and sharing incentive, but also defines envy freeness via the notion of the core from cooperative game theory. Our cache management platform, ROBUS, uses randomization over small time batches, and we develop a proportionally fair allocation mechanism that satisfies the core property in expectation. We show that this algorithm and related fair algorithms can be approximated to arbitrary precision in polynomial time. We evaluate these algorithms on a ROBUS prototype implemented on Spark, with the RDD store used as the cache. Our evaluation on a synthetically generated, industry-standard workload shows that our algorithms provide a speedup close to performance-optimal algorithms while guaranteeing fairness across tenants.
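As a rough illustration of what a proportionally fair cache lottery could look like, the sketch below enumerates the cache states that fit a toy capacity and then searches for a distribution over those states that maximizes the sum of the tenants' log expected utilities. Everything here is assumed for illustration: the item names, sizes, utility values, and the exponentiated-gradient solver are our own simplifications, not the ROBUS implementation or its API.

    from itertools import combinations
    import math

    def feasible_cache_states(items, sizes, capacity):
        """Enumerate every subset of items that fits in the cache (toy scale only)."""
        states = []
        for r in range(len(items) + 1):
            for subset in combinations(items, r):
                if sum(sizes[i] for i in subset) <= capacity:
                    states.append(frozenset(subset))
        return states

    def proportional_fair_lottery(states, tenant_utils, iters=2000, eta=0.05):
        """Find a distribution p over cache states that (approximately) maximizes
        sum_t log E_p[utility_t], i.e., a proportionally fair lottery, via
        exponentiated-gradient ascent on the probability simplex."""
        p = [1.0 / len(states)] * len(states)
        for _ in range(iters):
            # Expected utility of each tenant under the current lottery.
            exp_u = {t: sum(p[j] * u[states[j]] for j in range(len(states)))
                     for t, u in tenant_utils.items()}
            # d/dp_j of sum_t log E_p[u_t] is sum_t u_t(state_j) / E_p[u_t].
            grads = [sum(u[states[j]] / exp_u[t] for t, u in tenant_utils.items())
                     for j in range(len(states))]
            p = [pj * math.exp(eta * g) for pj, g in zip(p, grads)]
            total = sum(p)
            p = [pj / total for pj in p]
        return p

    # Hypothetical example: three unit-size items, a cache of size two, and three
    # tenants that each care about one distinct item (plus a small baseline
    # utility so every log term stays finite).
    items = ["A", "B", "C"]
    sizes = {"A": 1, "B": 1, "C": 1}
    states = feasible_cache_states(items, sizes, capacity=2)
    tenant_utils = {
        tenant: {s: (10.0 if hot in s else 0.0) + 0.1 for s in states}
        for tenant, hot in {"t1": "A", "t2": "B", "t3": "C"}.items()
    }
    lottery = proportional_fair_lottery(states, tenant_utils)
    for s, prob in sorted(zip(states, lottery), key=lambda x: -x[1]):
        print(set(s) or "{}", round(prob, 3))

With three unit-size items, a cache of size two, and three tenants that each want a different item, the lottery converges to roughly equal probability on the three two-item states, so each tenant is served about two-thirds of the time; no single deterministic allocation of size two could treat all three tenants this evenly, which is why randomizing the allocation over small time batches helps.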
With widespread advances in machine learning, many large enterprises are beginning to incorporate machine learning models across their products. These models are typically trained on shared, multi-tenant GPU clusters. Similar to existing …
The in-memory cache is an important component of data-access performance in a cloud. Since tenants may have different performance goals for data access, depending on the nature of their tasks, effectively managing the memory cache is a cr…
Web application performance is heavily reliant on the hit rate of memory-based caches. Current DRAM-based web caches statically partition their memory across the multiple applications sharing the cache. This causes underutilization of memory, which negatively …
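To see why a static split can hurt, the following toy simulation (our own hypothetical workload, not taken from the paper) runs a busy application with a large working set and a mostly idle application against a 50/50 static partition and against a single shared LRU pool of the same total size; the shared pool reuses the idle application's spare slots and ends up with a noticeably higher overall hit rate.

    import random
    from collections import OrderedDict

    class LRUCache:
        """Minimal LRU cache that only tracks hits and misses."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.store = OrderedDict()
            self.hits = self.misses = 0

        def access(self, key):
            if key in self.store:
                self.store.move_to_end(key)
                self.hits += 1
            else:
                self.misses += 1
                self.store[key] = True
                if len(self.store) > self.capacity:
                    self.store.popitem(last=False)

    def workload(app, n_requests, working_set):
        """Generate n_requests uniform-random requests over working_set distinct items."""
        return [(app, random.randrange(working_set)) for _ in range(n_requests)]

    random.seed(0)
    # A busy app with a large working set and an idle app with a tiny one.
    requests = workload("busy", 20000, 400) + workload("idle", 2000, 20)
    random.shuffle(requests)

    # Static 50/50 partition: each app gets 100 slots regardless of demand.
    static = {"busy": LRUCache(100), "idle": LRUCache(100)}
    # Shared pool: both apps compete for the same 200 slots.
    shared = LRUCache(200)

    for app, item in requests:
        static[app].access((app, item))
        shared.access((app, item))

    static_hits = sum(c.hits for c in static.values())
    print("static partition hit rate:", static_hits / len(requests))
    print("shared pool hit rate:     ", shared.hits / len(requests))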
Container technologies have been evolving rapidly in the cloud-native era. Kubernetes, a production-grade container orchestration platform, has proven successful at managing containerized applications in on-premises datacenters. However, …
In recent years, machine intelligence (MI) applications have emerged as a major driver for the computing industry. Optimizing these workloads is important but complicated. As memory demands grow and data-movement overheads increasingly limit performance, …