No Arabic abstract
The emerging brain-inspired computing paradigm known as hyperdimensional computing (HDC) has been proven to provide a lightweight learning framework for various cognitive tasks compared to the widely used deep learning-based approaches. Spatio-temporal (ST) signal processing, which encompasses biosignals such as electromyography (EMG) and electroencephalography (EEG), is one family of applications that could benefit from an HDC-based learning framework. At the core of HDC lie manipulations and comparisons of large bit patterns, which are inherently ill-suited to conventional computing platforms based on the von-Neumann architecture. In this work, we propose an architecture for ST signal processing within the HDC framework using predominantly in-memory compute arrays. In particular, we introduce a methodology for the in-memory hyperdimensional encoding of ST data to be used together with an in-memory associative search module. We show that the in-memory HDC encoder for ST signals offers at least 1.80x energy efficiency gains, 3.36x area gains, as well as 9.74x throughput gains compared with a dedicated digital hardware implementation. At the same time it achieves a peak classification accuracy within 0.04% of that of the baseline HDC framework.
Hyperdimensional Computing (HDC) is an emerging computational framework that mimics important brain functions by operating over high-dimensional vectors, called hypervectors (HVs). In-memory computing implementations of HDC are desirable since they can significantly reduce data transfer overheads. All existing in-memory HDC platforms consider binary HVs where each dimension is represented with a single bit. However, utilizing multi-bit HVs allows HDC to achieve acceptable accuracies in lower dimensions which in turn leads to higher energy efficiencies. Thus, we propose a highly accurate and efficient multi-bit in-memory HDC inference platform called MIMHD. MIMHD supports multi-bit operations using ferroelectric field-effect transistor (FeFET) crossbar arrays for multiply-and-add and FeFET multi-bit content-addressable memories for associative search. We also introduce a novel hardware-aware retraining framework (HWART) that trains the HDC model to learn to work with MIMHD. For six popular datasets and 4000 dimension HVs, MIMHD using 3-bit (2-bit) precision HVs achieves (i) average accuracies of 92.6% (88.9%) which is 8.5% (4.8%) higher than binary implementations; (ii) 84.1x (78.6x) energy improvement over a GPU, and (iii) 38.4x (34.3x) speedup over a GPU, respectively. The 3-bit $times$ is 4.3x and 13x faster and more energy-efficient than binary HDC accelerators while achieving similar accuracies.
Emulating various facets of computing principles of the brain can potentially lead to the development of neuro-computers that are able to exhibit brain-like cognitive capabilities. In this letter, we propose a magnetoelectronic neuron that utilizes noise as a computing resource and is able to encode information over time through the independent control of external voltage signals. We extensively characterize the device operation using simulations and demonstrate its suitability for neuromorphic computing platforms performing temporal information encoding.
This work presents the design and analysis of a mixed-signal neuron (MS-N) for convolutional neural networks (CNN) and compares its performance with a digital neuron (Dig-N) in terms of operating frequency, power and noise. The circuit-level implementation of the MS-N in 65 nm CMOS technology exhibits 2-3 orders of magnitude better energy-efficiency over Dig-N for neuromorphic computing applications - especially at low frequencies due to the high leakage currents from many transistors in Dig-N. The inherent error-resiliency of CNN is exploited to handle the thermal and flicker noise of MS-N. A system-level analysis using a cohesive circuit-algorithmic framework on MNIST and CIFAR-10 datasets demonstrate an increase of 3% in worst-case classification error for MNIST when the integrated noise power in the bandwidth is ~ 1 {mu}V2.
One viable solution for continuous reduction in energy-per-operation is to rethink functionality to cope with uncertainty by adopting computational approaches that are inherently robust to uncertainty. It requires a novel look at data representations, associated operations, and circuits, and at materials and substrates that enable them. 3D integrated nanotechnologies combined with novel brain-inspired computational paradigms that support fast learning and fault tolerance could lead the way. Recognizing the very size of the brains circuits, hyperdimensional (HD) computing can model neural activity patterns with points in a HD space, that is, with hypervectors as large randomly generated patterns. At its very core, HD computing is about manipulating and comparing these patterns inside memory. Emerging nanotechnologies such as carbon nanotube field effect transistors (CNFETs) and resistive RAM (RRAM), and their monolithic 3D integration offer opportunities for hardware implementations of HD computing through tight integration of logic and memory, energy-efficient computation, and unique device characteristics. We experimentally demonstrate and characterize an end-to-end HD computing nanosystem built using monolithic 3D integration of CNFETs and RRAM. With our nanosystem, we experimentally demonstrate classification of 21 languages with measured accuracy of up to 98% on >20,000 sentences (6.4 million characters), training using one text sample (~100,000 characters) per language, and resilient operation (98% accuracy) despite 78% hardware errors in HD representation (outputs stuck at 0 or 1). By exploiting the unique properties of the underlying nanotechnologies, we show that HD computing, when implemented with monolithic 3D integration, can be up to 420X more energy-efficient while using 25X less area compared to traditional silicon CMOS implementations.
We propose a dedicated winner-take-all circuit to efficiently implement the intra-column competition between cells in Hierarchical Temporal Memory which is a crucial part of the algorithm. All inputs and outputs are charge-based for compatibility with standard CMOS. The circuit incorporates memristors for competitive advantage to emulate a column with a cell in a predictive state. The circuit can also detect columns bursting by passive averaging and comparison of the cell outputs. The proposed spintronic devices and circuit are thoroughly described and a series of simulations are used to predict the performance. The simulations indicate that the circuit can complete a nine-cell, nine-input competition operation in under 15 ns at a cost of about 25 pJ.