Multiplier-less Artificial Neurons Exploiting Error Resiliency for Energy-Efficient Neural Computing

Published by Syed Shakib Sarwar
Publication date: 2016
Research field: Informatics Engineering
Language: English




Large-scale artificial neural networks have shown significant promise in addressing a wide range of classification and recognition applications. However, their large computational requirements stretch the capabilities of computing platforms. The fundamental components of these neural networks are the neurons and their synapses. The core of a digital hardware neuron consists of a multiplier, an accumulator and an activation function. Multipliers consume most of the processing energy in digital neurons, and thereby in hardware implementations of artificial neural networks. We propose an approximate multiplier that uses computation sharing and exploits the error resilience of neural network applications to reduce energy consumption. We also propose a Multiplier-less Artificial Neuron (MAN) for an even larger reduction in energy consumption, and adapt the training process to ensure minimal degradation in accuracy. We evaluated the proposed design on 5 recognition applications. The results show 35% and 60% reductions in energy consumption for neuron sizes of 8 bits and 12 bits, respectively, with a maximum of ~2.83% loss in network accuracy, compared to a conventional neuron implementation. We also achieve 37% and 62% reductions in area for neuron sizes of 8 bits and 12 bits, respectively, under iso-speed conditions.
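The abstract does not spell out how the multiplier is removed, so the following is only a minimal sketch of one common way to build a multiplier-free weighted sum: each integer weight is approximated by a signed power of two, so every product becomes a bit shift followed by an add. The function names (`power_of_two_weight`, `shift_add_dot`, `exact_dot`) and the rounding rule are assumptions made for illustration, not the authors' design.

```python
from typing import List, Tuple


def power_of_two_weight(w: int) -> Tuple[int, int]:
    """Approximate integer weight w as sign * 2**shift (hypothetical helper).

    Picks the power of two closest to |w|; ties go to the smaller power.
    A zero weight is returned as (0, 0) and contributes nothing.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    mag = abs(w)
    lower = mag.bit_length() - 1        # largest shift with 2**shift <= mag
    low, high = 1 << lower, 1 << (lower + 1)
    shift = lower if (mag - low) <= (high - mag) else lower + 1
    return sign, shift


def shift_add_dot(inputs: List[int], weights: List[int]) -> int:
    """Weighted sum using only shifts and adds (no multiplications)."""
    acc = 0
    for x, w in zip(inputs, weights):
        sign, shift = power_of_two_weight(w)
        term = x << shift               # x * 2**shift via a shift
        if sign > 0:
            acc += term
        elif sign < 0:
            acc -= term
    return acc


def exact_dot(inputs: List[int], weights: List[int]) -> int:
    """Conventional multiply-accumulate, for comparison."""
    return sum(x * w for x, w in zip(inputs, weights))


if __name__ == "__main__":
    xs, ws = [3, 5, 2], [4, -2, 7]
    print("exact:", exact_dot(xs, ws))
    print("shift-add:", shift_add_dot(xs, ws))
```

The two results differ whenever a weight is not itself a power of two (here the weight 7 is rounded to 8). That gap is the kind of approximation error the abstract says the network tolerates, and adapting the training process is how the abstract says accuracy loss is kept small.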



Read also

Neuromorphic computing, inspired by the brain, promises extreme efficiency for certain classes of learning tasks, such as classification and pattern recognition. The performance and power consumption of neuromorphic computing depend heavily on the choice of the neuron architecture. Digital neurons (Dig-N) are conventionally known to be accurate and efficient at high speed, while suffering from high leakage currents from a large number of transistors in a large design. On the other hand, analog/mixed-signal neurons are prone to noise, variability and mismatch, but can lead to extremely low-power designs. In this work, we analyze, compare and contrast existing neuron architectures with a proposed mixed-signal neuron (MS-N) in terms of performance, power and noise, thereby demonstrating the applicability of the proposed mixed-signal neuron for achieving extreme energy-efficiency in neuromorphic computing. The proposed MS-N is implemented in 65 nm CMOS technology and exhibits > 100X better energy-efficiency across all frequencies over two traditional digital neurons synthesized in the same technology node. We also demonstrate that the inherent error-resiliency of a fully connected or even convolutional neural network (CNN) can handle the noise as well as the manufacturing non-idealities of the MS-N up to certain degrees. Notably, a system-level implementation on the MNIST dataset exhibits a worst-case increase in classification error of 2.1% when the integrated noise power in the bandwidth is ~0.1 µV², along with ±3σ of variation and mismatch introduced in the transistor parameters for the proposed neuron with 8-bit precision.
The spiking neural network (SNN) computes and communicates information through discrete binary events. It is considered more biologically plausible and more energy-efficient than artificial neural networks (ANN) in emerging neuromorphic hardware. However, due to the discontinuous and non-differentiable characteristics of spikes, training an SNN is a relatively challenging task. Recent work has made substantial progress toward strong performance by converting ANNs to SNNs. Due to the difference in information processing, the converted deep SNN usually suffers serious performance loss and large time delay. In this paper, we analyze the reasons for the performance loss and propose a novel bistable spiking neural network (BSNN) that addresses the problem of spikes of inactivated neurons (SIN) caused by phase lead and phase lag. Also, when ResNet structure-based ANNs are converted, the information of output neurons is incomplete due to the rapid transmission of the shortcut path. We design synchronous neurons (SN) to help efficiently improve performance. Experimental results show that the proposed method needs only 1/4-1/10 of the time steps compared to previous work to achieve nearly lossless conversion. We demonstrate state-of-the-art ANN-SNN conversion for VGG16, ResNet20, and ResNet34 on challenging datasets including CIFAR-10 (95.16% top-1), CIFAR-100 (78.12% top-1), and ImageNet (72.64% top-1).
Machine learning implements backpropagation via abundant training samples. We demonstrate a multi-stage learning system realized by a promising non-volatile memory device, the domain-wall magnetic tunnel junction (DW-MTJ). The system consists of unsupervised (clustering) as well as supervised sub-systems, and generalizes quickly (with few samples). We demonstrate interactions between physical properties of this device and optimal implementation of neuroscience-inspired plasticity learning rules, and highlight performance on a suite of tasks. Our energy analysis confirms the value of the approach, as the learning budget stays below 20 $\mu$J even for large tasks used typically in machine learning.
We simulated our nanomagnet reservoir computer (NMRC) design on benchmark tasks, demonstrating the NMRC's high memory content and expressibility. In support of the feasibility of this method, we fabricated a frustrated nanomagnet reservoir layer. Using this structure, we describe a low-power, low-area system with an area-energy-delay product $10^7$ times lower than conventional RC systems, which is therefore promising for size, weight, and power (SWaP) constrained applications.
Ewan Orr, Ben Martin, 2011
We investigate Turing's notion of an A-type artificial neural network. We study a refinement of Turing's original idea, motivated by work of Teuscher, Bull, Preen and Copeland. Our A-types can process binary data by accepting and outputting sequences of binary vectors; hence we can associate a function to an A-type, and we say the A-type represents the function. There are two modes of data processing: clamped and sequential. We describe an evolutionary algorithm, involving graph-theoretic manipulations of A-types, which searches for A-types representing a given function. The algorithm uses both mutation and crossover operators. We implemented the algorithm and applied it to three benchmark tasks. We found that the algorithm performed much better than a random search. For two out of the three tasks, the algorithm with crossover performed better than a mutation-only version.
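For readers unfamiliar with A-types, here is a minimal sketch of Turing's original construction, on which the refinement above builds: every node holds one bit, reads the bits of two source nodes, and is updated synchronously to their NAND. The input/output conventions, the clamped and sequential modes, and the evolutionary search described in the abstract are not modelled here; the names `step`, `run` and `wiring` are hypothetical.

```python
from typing import List, Tuple

Wiring = List[Tuple[int, int]]  # wiring[i] = indices of node i's two source nodes


def step(state: List[int], wiring: Wiring) -> List[int]:
    """One synchronous update: each node becomes NAND of its two sources."""
    return [1 - (state[a] & state[b]) for a, b in wiring]


def run(state: List[int], wiring: Wiring, steps: int) -> List[List[int]]:
    """Return the list of states visited, starting with the initial state."""
    history = [list(state)]
    for _ in range(steps):
        state = step(state, wiring)
        history.append(state)
    return history


if __name__ == "__main__":
    # Three nodes, each reading the other two (an arbitrary example wiring).
    wiring = [(1, 2), (0, 2), (0, 1)]
    for s in run([1, 1, 1], wiring, 4):
        print(s)
```

With this example wiring, the all-ones state alternates with the all-zeros state, while `[1, 1, 0]` is a fixed point. An evolutionary search of the kind the abstract describes would presumably mutate or recombine the `wiring` table, but that detail is an assumption here.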