This paper describes how 3D XPoint memory arrays can be used as in-memory computing accelerators. We first show that thresholded matrix-vector multiplication (TMVM), a fundamental computational kernel in many applications including machine learning, can be implemented within a 3D XPoint array without requiring data to leave the array for processing. Building on this TMVM implementation, we then describe a binary neural inference engine. We extend the core concept to address system scalability, by connecting multiple 3D XPoint arrays, and power integrity, by analyzing the parasitic effects of the metal lines on noise margins and on the accuracy of the implementation. We quantify how these parasitics limit the size and configuration of a 3D XPoint array, and estimate the maximum acceptable size of a 3D XPoint subarray.
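As a point of reference, TMVM reduces to comparing each row's dot product with the input vector against a threshold, which is exactly the operation a binary neural network layer performs during inference. The short Python sketch below illustrates that arithmetic in software; the function name, the per-row threshold vector, and the toy values are illustrative assumptions, not details taken from the 3D XPoint implementation.

```python
import numpy as np

def tmvm(weights, x, thresholds):
    """Thresholded matrix-vector multiplication (TMVM).

    weights    : (m, n) binary weight matrix (0/1 entries)
    x          : (n,)   binary input vector  (0/1 entries)
    thresholds : (m,)   per-row thresholds

    Returns a binary output vector: 1 where a row's dot product with x
    meets or exceeds its threshold, else 0.
    """
    dot = weights @ x                              # matrix-vector multiplication
    return (dot >= thresholds).astype(np.uint8)    # thresholding step

# Toy example of one binary neural inference layer (values are hypothetical).
W = np.array([[1, 0, 1, 1],
              [0, 1, 1, 0],
              [1, 1, 0, 1]], dtype=np.uint8)
x = np.array([1, 1, 0, 1], dtype=np.uint8)
t = np.array([2, 2, 3], dtype=np.uint8)

print(tmvm(W, x, t))   # -> [1 0 1]
```

The sketch only captures the functional behavior; in the in-memory setting described above, the multiply-and-threshold is evaluated inside the 3D XPoint array rather than by moving the data to a separate processor.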