Convolutional neural networks (CNNs) have achieved excellent performance on a variety of tasks, but deploying them at the edge is constrained by the high energy consumption of the convolution operation. Stochastic computing (SC) is an attractive paradigm that performs arithmetic operations with simple logic gates at low hardware cost. This paper presents an energy-efficient mixed-signal multiply-accumulate (MAC) engine based on SC. A parallel architecture is adopted to address the latency problem of SC. Simulation results show that the overall energy consumption of our design is 5.03 pJ per 26-input MAC operation in 28 nm CMOS technology.
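The idea that SC performs arithmetic with simple logic gates can be illustrated with a small software sketch. It assumes unipolar encoding, where a value in [0, 1] is the probability of a 1 in a bitstream, so that a bitwise AND of two independent streams multiplies the encoded values. The function names, the stream length, and the digital counting used for accumulation are illustrative assumptions; the sketch does not model the mixed-signal accumulation or the parallel architecture of the design described above.

```python
import random

def to_bitstream(p, length, rng):
    """Encode a probability p in [0, 1] as a unipolar stochastic bitstream."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def sc_multiply(stream_a, stream_b):
    """Multiply two independent unipolar streams with a bitwise AND."""
    return [a & b for a, b in zip(stream_a, stream_b)]

def sc_mac(inputs, weights, length=4096, seed=0):
    """Estimate sum(x_i * w_i) by counting ones over all AND-ed product streams.

    Counting is done digitally here purely for illustration (an assumption).
    """
    rng = random.Random(seed)
    total_ones = 0
    for x, w in zip(inputs, weights):
        product = sc_multiply(to_bitstream(x, length, rng),
                              to_bitstream(w, length, rng))
        total_ones += sum(product)
    return total_ones / length

# Hypothetical example values, all within [0, 1].
inputs = [0.5, 0.25, 0.75]
weights = [0.5, 0.8, 0.4]
exact = sum(x * w for x, w in zip(inputs, weights))
estimate = sc_mac(inputs, weights)
print(f"exact={exact:.3f} estimate={estimate:.3f}")
```

The estimate converges to the exact value only as the stream length grows, which is the latency cost of SC: halving the error requires roughly four times as many bits.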
Convolutional neural networks (CNNs) achieve excellent performance on tasks such as image recognition and natural language processing, at the cost of high power consumption. Stochastic computing (SC) is an attractive paradigm for low-power applications that performs arithmetic operations with simple logic and low hardware cost. However, conventional memory structures, designed and optimized for binary computing, incur extra data conversion costs that significantly decrease energy efficiency. This paper therefore proposes a new memory system for SC-based multiply-accumulate (MAC) engines used in CNNs that remains compatible with conventional memory systems. The overall energy consumption of the new computing structure is 0.91 pJ, a reduction of 82.1% compared with the conventional structure, and the energy efficiency reaches 164.8 TOPS/W.
This work presents the design and analysis of a mixed-signal neuron (MS-N) for convolutional neural networks (CNNs) and compares its performance with that of a digital neuron (Dig-N) in terms of operating frequency, power, and noise. The circuit-level implementation of the MS-N in 65 nm CMOS technology exhibits 2-3 orders of magnitude better energy efficiency than the Dig-N for neuromorphic computing applications, especially at low frequencies, owing to the high leakage currents from the many transistors in the Dig-N. The inherent error resiliency of CNNs is exploited to handle the thermal and flicker noise of the MS-N. A system-level analysis using a cohesive circuit-algorithmic framework on the MNIST and CIFAR-10 datasets demonstrates an increase of 3% in worst-case classification error for MNIST when the integrated noise power in the bandwidth is ~1 $\mu$V$^2$.
A neural network is essentially a high-dimensional, complex mapping model whose weights are adjusted to fit features. However, the spectral bias in network training means that an impractically large number of training epochs is needed to fit the high-frequency components of broadband signals. To improve the fitting efficiency for high-frequency components, the PhaseDNN was recently proposed, combining complex frequency band extraction and frequency shift techniques [Cai et al. SIAM J. SCI. COMPUT. 42, A3285 (2020)]. This paper is devoted to an alternative approach for fitting complex signals with high-frequency components. A parallel frequency function-deep neural network (PFF-DNN) is proposed that suppresses computational overhead while ensuring fitting accuracy by exploiting fast Fourier analysis of broadband signals and the spectral bias of neural networks. The effectiveness and efficiency of the proposed PFF-DNN method are verified through detailed numerical experiments on six typical broadband signals.
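The frequency band extraction and frequency shift steps mentioned above can be sketched as follows. This is only an illustration of the signal-processing idea: a broadband signal is split into bands in the Fourier domain, each band is shifted down so that it contains only low frequencies (the regime that spectral bias favors), and the shift is undone after processing. The network fitting itself is omitted, and the function names, the naive O(n²) DFT, the test signal, and the four-band partition are all assumptions rather than the method of either paper.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2), fine for a small demo)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def extract_and_shift(x, k_lo, k_hi):
    """Keep DFT bins k_lo..k_hi-1 and shift them down so the band starts at bin 0."""
    X = dft(x)
    shifted = [0j] * len(x)
    for k in range(k_lo, k_hi):
        shifted[k - k_lo] = X[k]
    return idft(shifted)

def shift_back(band, k_lo):
    """Undo the frequency shift by modulating with exp(+2*pi*i*k_lo*t/n)."""
    n = len(band)
    return [band[t] * cmath.exp(2j * math.pi * k_lo * t / n) for t in range(n)]

# Hypothetical broadband test signal: one low and one high frequency component.
n = 64
signal = [math.sin(2 * math.pi * 3 * t / n) + 0.5 * math.sin(2 * math.pi * 20 * t / n)
          for t in range(n)]

band_edges = [(0, 16), (16, 32), (32, 48), (48, 64)]  # assumed partition
shifted_bands = [extract_and_shift(signal, lo, hi) for lo, hi in band_edges]

# Each shifted band would be fitted by its own small network (omitted here).
restored = [shift_back(band, lo) for band, (lo, _) in zip(shifted_bands, band_edges)]
reconstructed = [sum(r[t] for r in restored) for t in range(n)]
max_error = max(abs(reconstructed[t] - signal[t]) for t in range(n))

# The component at bin 20 now sits at a low bin inside the second band.
spectrum = dft(shifted_bands[1])
peak_bin = max(range(n), key=lambda k: abs(spectrum[k]))
print(f"max reconstruction error = {max_error:.2e}, peak bin after shift = {peak_bin}")
```

Because every shifted band occupies only the lowest bins, the bands are independent fitting problems of similar difficulty, which is what makes a parallel treatment natural.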
Superparamagnetic tunnel junctions (SMTJs) have emerged as a competitive, realistic nanotechnology to support novel forms of stochastic computation in CMOS-compatible platforms. One of their applications is to generate random bitstreams suitable for use in stochastic computing implementations. We describe a method for digitally programmable bitstream generation based on pre-charge sense amplifiers. This generator is significantly more energy efficient than SMTJ-based bitstream generators that tune probabilities with spin currents, and a factor of two more efficient than related CMOS-based implementations. The true randomness of this bitstream generator allows us to use it as the fundamental unit of a novel neural network architecture. To take advantage of the potential savings, we codesign the algorithm with the circuit, rather than directly transcribing a classical neural network into hardware. The flexibility of the neural network mathematics allows us to adapt the network to the explicitly energy-efficient choices we make at the device level. The result is a convolutional neural network design operating at $\approx$ 150 nJ per inference with 97 % performance on MNIST, a factor of 1.4 to 7.7 improvement in energy efficiency over comparable proposals in the recent literature.
The availability of inexpensive devices now makes it possible to implement cognitive radio functionalities in large-scale networks such as the Internet of Things and future mobile cellular systems. In this paper, we focus on wideband spectrum sensing in the presence of oversampling, i.e., when the sampling frequency of a digital receiver is larger than the signal bandwidth, where signal detection must take into account the front-end impairments of low-cost devices. Based on the noise model of a software-defined radio dongle, we address the problem of robust signal detection in the presence of noise power uncertainty and non-flat noise power spectral density (PSD). In particular, we analyze the receiver operating characteristic of several detectors in the presence of such front-end impairments to assess the performance attainable in a real-world scenario. We propose new frequency-domain detectors, some of which are proven to outperform previously proposed spectrum sensing techniques such as eigenvalue-based tests. The study shows that the best performance is provided by a noise-uncertainty immune energy detector (ED) and, for the colored noise case, by tests that match the PSD of the receiver noise.
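For context, the classical energy detector that such work builds on can be sketched in a few lines. This is the textbook baseline under the assumptions of real, white Gaussian noise with a known power and a Gaussian approximation of the test statistic; it is not the noise-uncertainty immune detector or any of the frequency-domain detectors proposed above. The function names, sample count, and test signal are illustrative.

```python
import math
import random
from statistics import NormalDist

def energy_statistic(samples):
    """Average energy per sample, the energy detector's test statistic."""
    return sum(s * s for s in samples) / len(samples)

def ed_threshold(noise_power, n, pfa):
    """Threshold for a target false-alarm probability pfa.

    Uses the Gaussian approximation for real white Gaussian noise:
    under noise only, the statistic has mean noise_power and
    variance 2 * noise_power**2 / n.
    """
    q_inv = NormalDist().inv_cdf(1.0 - pfa)
    return noise_power * (1.0 + q_inv * math.sqrt(2.0 / n))

def detect(samples, noise_power, pfa=1e-4):
    """Return True if the measured energy exceeds the threshold."""
    return energy_statistic(samples) > ed_threshold(noise_power, len(samples), pfa)

# Hypothetical data: unit-power noise, with and without a unit-amplitude sinusoid.
rng = random.Random(1)
n = 2000
noise_only = [rng.gauss(0.0, 1.0) for _ in range(n)]
with_signal = [rng.gauss(0.0, 1.0) + math.sin(2 * math.pi * 0.05 * t)
               for t in range(n)]

print("noise only  :", detect(noise_only, noise_power=1.0))
print("with signal :", detect(with_signal, noise_power=1.0))
```

The threshold scales directly with `noise_power`, so any error in the assumed noise power shifts the operating point of the detector. This is why noise power uncertainty is a central concern for energy detection on low-cost receivers.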