ﻻ يوجد ملخص باللغة العربية
In modern server CPUs, last-level cache (LLC) is a critical hardware resource that exerts significant influence on the performance of the workloads, and how to manage LLC is a key to the performance isolation and QoS in the cloud with multi-tenancy. In this paper, we argue that besides CPU cores, high-speed network I/O is also important for LLC management. This is because of an Intel architectural innovation -- Data Direct I/O (DDIO) -- that directly injects the inbound I/O traffic to (part of) the LLC instead of the main memory. We summarize two problems caused by DDIO and show that (1) the default DDIO configuration may not always achieve optimal performance, (2) DDIO can decrease the performance of non-I/O workloads which share LLC with it by as high as 32%. We then present IOCA, the first LLC management mechanism for network-centric platforms that treats the I/O as the first-class citizen. IOCA monitors and analyzes the performance of the cores, LLC, and DDIO using CPUs hardware performance counters, and adaptively adjusts the number of LLC ways for DDIO or the tenants that demand more LLC capacity. In addition, IOCA dynamically chooses the tenants that share its LLC resource with DDIO, to minimize the performance interference by both the tenants and the I/O. Our experiments with multiple microbenchmarks and real-world applications in two major end-host network models demonstrate that IOCA can effectively reduce the performance degradation caused by DDIO, with minimal overhead.
The current mobile applications have rapidly growing memory footprints, posing a great challenge for memory system design. Insufficient DRAM main memory will incur frequent data swaps between memory and storage, a process that hurts performance, cons
Blockchain has attracted a broad range of interests from start-ups, enterprises and governments to build next generation applications in a decentralized manner. Similar to cloud platforms, a single blockchain-based system may need to serve multiple t
Deep learning (DL) is becoming increasingly popular in several application domains and has made several new application features involving computer vision, speech recognition and synthesis, self-driving automobiles, drug design, etc. feasible and acc
There is an explosive growth in the size of the input and/or intermediate data used and generated by modern and emerging applications. Unfortunately, modern computing systems are not capable of handling large amounts of data efficiently. Major concep
Cutting-edge embedded system applications, such as self-driving cars and unmanned drone software, are reliant on integrated CPU/GPU platforms for their DNNs-driven workload, such as perception and other highly parallel components. In this work, we se