ﻻ يوجد ملخص باللغة العربية
This article summarizes the idea of refresh-access parallelism, which was published in HPCA 2014, and examines the works significance and future potential. The overarching objective of our HPCA 2014 paper is to reduce the significant negative performance impact of DRAM refresh with intelligent memory controller mechanisms. To mitigate the negative performance impact of DRAM refresh, our HPCA 2014 paper proposes two complementary mechanisms, DARP (Dynamic Access Refresh Parallelization) and SARP (Subarray Access Refresh Parallelization). The goal is to address the drawbacks of state-of-the-art per-bank refresh mechanism by building more efficient techniques to parallelize refreshes and accesses within DRAM. First, instead of issuing per-bank refreshes in a round-robin order, as it is done today, DARP issues per-bank refreshes to idle banks in an out-of-order manner. Furthermore, DARP proactively schedules refreshes during intervals when a batch of writes are draining to DRAM. Second, SARP exploits the existence of mostly-independent subarrays within a bank. With minor modifications to DRAM organization, it allows a bank to serve memory accesses to an idle subarray while another subarray is being refreshed. Our extensive evaluations on a wide variety of workloads and systems show that our mechanisms improve system performance (and energy efficiency) compared to three state-of-the-art refresh policies, and their performance bene ts increase as DRAM density increases.
DRAM-based memory is a critical factor that creates a bottleneck on the system performance since the processor speed largely outperforms the DRAM latency. In this thesis, we develop a low-cost mechanism, called ChargeCache, which enables faster acces
To employ a Convolutional Neural Network (CNN) in an energy-constrained embedded system, it is critical for the CNN implementation to be highly energy efficient. Many recent studies propose CNN accelerator architectures with custom computation units
This paper summarizes the idea of Subarray-Level Parallelism (SALP) in DRAM, which was published in ISCA 2012, and examines the works significance and future potential. Modern DRAMs have multiple banks to serve multiple memory requests in parallel. H
In this work, we propose FUSE, a novel GPU cache system that integrates spin-transfer torque magnetic random-access memory (STT-MRAM) into the on-chip L1D cache. FUSE can minimize the number of outgoing memory accesses over the interconnection networ
After a plethora of high-profile RowHammer attacks, CPU and DRAM vendors scrambled to deliver what was meant to be the definitive hardware solution against the RowHammer problem: Target Row Refresh (TRR). A common belief among practitioners is that,