
Benchmarking Physical Performance of Neural Inference Circuits

Published by Dmitri Nikonov
Publication date: 2019
Language: English





Numerous neural network circuits and architectures are presently under active research for application to artificial intelligence and machine learning. Their physical performance metrics (area, time, energy) are estimated. Various types of neural networks (artificial, cellular, spiking, and oscillator) are implemented with multiple CMOS and beyond-CMOS (spintronic, ferroelectric, resistive memory) devices. A consistent and transparent methodology is proposed and used to benchmark this comprehensive set of options across several application cases. Promising architecture/device combinations are identified.


Read also

We report the performance characteristics of a notional Convolutional Neural Network based on the previously proposed Multiply-Accumulate-Activate-Pool (MAAP) set, an MTJ-based spintronic circuit made to compute multiple neural functionalities in parallel. A study of image classification with the MNIST handwritten digits dataset using this network is provided via simulation. The effects of changing the weight representation precision, the severity of device process variation within the MAAP sets, and the computational redundancy are reported. The emulated network achieves between 90 and 95% image classification accuracy at a cost of ~100 nJ per image.
We present a DevIce-to-System Performance EvaLuation (DISPEL) workflow that integrates transistor and interconnect modeling, parasitic extraction, standard cell library characterization, logic synthesis, cell placement and routing, and timing analysis to evaluate system-level performance of new CMOS technologies. As the impact of parasitic resistances and capacitances continues to increase with dimensional downscaling, component-level optimization alone becomes insufficient, calling for a holistic assessment and optimization methodology across the boundaries between devices, interconnects, circuits, and systems. The physical implementation flow in DISPEL enables realistic analysis of complex wires and vias in VLSI systems and their impact on the chip power, speed, and area, which simple circuit simulations cannot capture. To demonstrate the use of DISPEL, a 32-bit commercial processor core is implemented using theoretical n-type MoS2 and p-type black phosphorus (BP) planar FETs at a projected 5-nm node, and the performance is benchmarked against Si FinFETs. While the superior gate control of the MoS2/BP FETs can theoretically provide a 51% reduction in the iso-frequency energy consumption, the actual performance can be greatly limited by the source/drain contact resistances. With the large amount of data generated by DISPEL, a neural network is trained to predict the key performance metrics of the 32-bit processor core using the characteristics of transistors and interconnects as the input features, without the need to go through the time-consuming physical implementation flow. The machine learning algorithms show great potential as a means of evaluating and optimizing new CMOS technologies and identifying the most significant technology design parameters.
Spintronic nanodevices have ultrafast nonlinear dynamic and recurrence behaviors on a nanosecond scale that promise to enable spintronic reservoir computing (RC) systems. Here, two physical RC systems, one based on a single magnetic skyrmion memristor (MSM) and one based on 24 spin-torque nano-oscillators (STNOs), are proposed and modeled to perform an image classification task and nonlinear dynamic system prediction, respectively. Based on our micromagnetic simulation results on the nonlinear responses of the MSM and STNOs under current-pulse stimulation, the handwritten digit recognition task demonstrates that an RC system using a single MSM achieves outstanding performance on image classification. In addition, complex unknown nonlinear dynamic problems can also be solved well by a physical RC system consisting of 24 STNOs, as confirmed on a second-order nonlinear dynamic system and NARMA10 tasks. The capability of both high accuracy and fast information processing promises to enable a type of brain-like chip based on spintronics for various artificial intelligence tasks.
The explosive growth of data and its related energy consumption is pushing the need to develop energy-efficient brain-inspired schemes and materials for data processing and storage. Here, we demonstrate experimentally that Co/Pt films can be used as artificial synapses by manipulating their magnetization state using circularly polarized ultrashort optical pulses at room temperature. We also show an efficient implementation of supervised perceptron learning on an opto-magnetic neural network built from such magnetic synapses. Importantly, we demonstrate that the optimization of synaptic weights can be achieved using a global feedback mechanism, such that the learning does not rely on external storage or additional optimization schemes. These results suggest there is high potential for realizing artificial neural networks using optically controlled magnetization in technologically relevant materials that can learn not only quickly but also energy-efficiently.
Deep learning models implemented in analog hardware are promising for computation- and energy-constrained systems such as edge computing devices. However, the analog nature of the device and its many associated noise sources will change the values of the weights in the trained deep learning models deployed on such devices. In this study, a systematic evaluation of the inference performance of popular trained deep learning models for image classification deployed on analog devices has been carried out, where additive white Gaussian noise has been added to the weights of the trained models during inference. It is observed that deeper models and models with more redundancy in design, such as VGG, are in general more robust to the noise. However, the performance is also affected by the design philosophy of the model, the detailed structure of the model, the exact machine learning task, and the datasets.
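The weight-noise evaluation described in the abstract above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the "trained model" is a toy linear classifier with identity weights, the data are synthetic, and the names `add_weight_noise` and `accuracy` are hypothetical helpers. It only shows the general idea of perturbing weights with additive Gaussian noise at inference time and measuring accuracy as the noise level grows.

```python
import numpy as np

def add_weight_noise(weights, sigma, rng):
    """Return a copy of `weights` with zero-mean Gaussian noise (std `sigma`) added."""
    return weights + rng.normal(0.0, sigma, size=weights.shape)

def accuracy(weights, x, y):
    """Accuracy of a linear classifier that predicts argmax over x @ weights."""
    return float(np.mean(np.argmax(x @ weights, axis=1) == y))

rng = np.random.default_rng(0)

# Toy stand-in for a trained model (assumption): identity weights that
# classify slightly noisy one-hot inputs.
n_classes = 4
weights = np.eye(n_classes)
y = rng.integers(0, n_classes, size=2000)
x = np.eye(n_classes)[y] + 0.1 * rng.normal(size=(2000, n_classes))

clean_acc = accuracy(weights, x, y)

# Sweep the noise level and record accuracy with perturbed weights.
results = {}
for sigma in (0.0, 0.1, 0.5, 1.0):
    noisy_weights = add_weight_noise(weights, sigma, rng)
    results[sigma] = accuracy(noisy_weights, x, y)
    print(f"sigma={sigma:.1f}  accuracy={results[sigma]:.3f}")
```

A single noise draw per level is used here for brevity; averaging over several draws per level would give a steadier estimate of how accuracy degrades.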