ﻻ يوجد ملخص باللغة العربية
It has become increasingly difficult to understand the complex interaction between modern applications and main memory, composed of DRAM chips. Manufacturers are now selling and proposing many different types of DRAM, with each DRAM type catering to different needs (e.g., high throughput, low power, high memory density). At the same time, the memory access patterns of prevalent and emerging workloads are rapidly diverging, as these applications manipulate larger data sets in very different ways. As a result, the combined DRAM-workload behavior is often difficult to intuitively determine today, which can hinder memory optimizations in both hardware and software. In this work, we identify important families of workloads, as well as prevalent types of DRAM chips, and rigorously analyze the combined DRAM--workload behavior. To this end, we perform a comprehensive experimental study of the interaction between nine different DRAM types and 115 modern applications and multiprogrammed workloads. We draw 12 key observations from our characterization, enabled in part by our development of new metrics that take into account contention between memory requests due to hardware design. Notably, we find that (1) newer DRAM types such as DDR4 and HMC often do not outperform older types such as DDR3, due to higher access latencies and, in the case of HMC, poor exploitation of locality; (2) there is no single DRAM type that can cater to all components of a heterogeneous system (e.g., GDDR5 significantly outperforms other memories for multimedia acceleration, while HMC significantly outperforms other memories for network acceleration); and (3) there is still a strong need to lower DRAM latency, but unfortunately the current design trend of commodity DRAM is toward higher latencies to obtain other benefits. We hope that the trends we identify can drive optimizations in both hardware and software design.
Over the past two decades, the storage capacity and access bandwidth of main memory have improved tremendously, by 128x and 20x, respectively. These improvements are mainly due to the continuous technology scaling of DRAM (dynamic random-access memor
The complexity and diversity of big data and AI workloads make understanding them difficult and challenging. This paper proposes a new approach to modelling and characterizing big data and AI workloads. We consider each big data and AI workload as a
In order to shed more light on how RowHammer affects modern and future devices at the circuit-level, we first present an experimental characterization of RowHammer on 1580 DRAM chips (408x DDR3, 652x DDR4, and 520x LPDDR4) from 300 DRAM modules (60x
This paper summarizes our work on experimental characterization and analysis of reduced-voltage operation in modern DRAM chips, which was published in SIGMETRICS 2017, and examines the works significance and future potential. We take a comprehensiv
DRAM is the prevalent main memory technology, but its long access latency can limit the performance of many workloads. Although prior works provide DRAM designs that reduce DRAM access latency, their reduced storage capacities hinder the performance