ﻻ يوجد ملخص باللغة العربية
Training deep learning networks involves continuous weight updates across the various layers of the deep network while using a backpropagation algorithm (BP). This results in expensive computation overheads during training. Consequently, most deep learning accelerators today employ pre-trained weights and focus only on improving the design of the inference phase. The recent trend is to build a complete deep learning accelerator by incorporating the training module. Such efforts require an ultra-fast chip architecture for executing the BP algorithm. In this article, we propose a novel photonics-based backpropagation accelerator for high performance deep learning training. We present the design for a convolutional neural network, BPLight-CNN, which incorporates the silicon photonics-based backpropagation accelerator. BPLight-CNN is a first-of-its-kind photonic and memristor-based CNN architecture for end-to-end training and prediction. We evaluate BPLight-CNN using a photonic CAD framework (IPKISS) on deep learning benchmark models including LeNet and VGG-Net. The proposed design achieves (i) at least 34x speedup, 34x improvement in computational efficiency, and 38.5x energy savings, during training; and (ii) 29x speedup, 31x improvement in computational efficiency, and 38.7x improvement in energy savings, during inference compared to the state-of-the-art designs. All these comparisons are done at a 16-bit resolution; and BPLight-CNN achieves these improvements at a cost of approximately 6% lower accuracy compared to the state-of-the-art.
Sparse neural networks can greatly facilitate the deployment of neural networks on resource-constrained platforms as they offer compact model sizes while retaining inference accuracy. Because of the sparsity in parameter matrices, sparse neural netwo
The increasing application of deep learning technology drives the need for an efficient parallel computing architecture for Convolutional Neural Networks (CNNs). A significant challenge faced when designing a many-core CNN accelerator is to handle th
The recent breakthroughs and prohibitive complexities of Deep Neural Networks (DNNs) have excited extensive interest in domain-specific DNN accelerators, among which optical DNN accelerators are particularly promising thanks to their unprecedented po
Memristor crossbars are circuits capable of performing analog matrix-vector multiplications, overcoming the fundamental energy efficiency limitations of digital logic. They have been shown to be effective in special-purpose accelerators for a limited
In this paper, we propose a very compact embedded CNN processor design based on a modified logarithmic computing method using very low bit-width representation. Our high-quality CNN processor can easily fit into edge devices. For Yolov2, our processi