ﻻ يوجد ملخص باللغة العربية
As memory capacity has outstripped TLB coverage, large data applications suffer from frequent page table walks. We investigate two complementary techniques for addressing this cost: reducing the number of accesses required and reducing the latency of each access. The first approach is accomplished by opportunistically flattening the page table: merging two levels of traditional 4KB page table nodes into a single 2MB node, thereby reducing the tables depth and the number of indirections required to search it. The second is accomplished by biasing the cache replacement algorithm to keep page table entries during periods of high TLB miss rates, as these periods also see high data miss rates and are therefore more likely to benefit from having the smaller page table in the cache than to suffer from increased data cache misses. We evaluate these approaches for both native and virtualized systems and across a range of realistic memory fragmentation scenarios, describe the limited changes needed in our kernel implementation and hardware design, identify and address challenges related to self-referencing page tables and kernel memory allocation, and compare results across server and mobile systems using both academic and industrial simulators for robustness. We find that flattening does reduce the number of accesses required on a page walk (to 1.0), but its performance impact (+2.3%) is small due to Page Walker Caches (already 1.5 accesses). Prioritizing caching has a larger effect (+6.8%), and the combination improves performance by +9.2%. Flattening is more effective on virtualized systems (4.4 to 2.8 accesses, +7.1% performance), due to 2D page walks. By combining the two techniques we demonstrate a state-of-the-art +14.0% performance gain and -8.7% dynamic cache energy and -4.7% dynamic DRAM energy for virtualized execution with very simple hardware and software changes.
Asymptotic Causal Diamonds (ACDs) are a natural flat space analogue of AdS causal wedges, and it has been argued previously that they may be useful for understanding bulk locality in flat space holography. In this paper, we use ACD-inspired ideas to
Hybrid memory systems comprised of dynamic random access memory (DRAM) and non-volatile memory (NVM) have been proposed to exploit both the capacity advantage of NVM and the latency and dynamic energy advantages of DRAM. An important problem for such
Data movement between main memory and the processor is a significant contributor to the execution time and energy consumption of memory-intensive applications. This data movement bottleneck can be alleviated using Processing-in-Memory (PiM), which en
Rowhammer attacks that corrupt level-1 page tables to gain kernel privilege are the most detrimental to system security and hard to mitigate. However, recently proposed software-only mitigations are not effective against such kernel privilege escalat
Quantum interference on the kagome lattice generates electronic bands with narrow bandwidth, called flat bands. Crystal structures incorporating this lattice can host strong electron correlations with non-standard ingredients, but only if these bands