ﻻ يوجد ملخص باللغة العربية
Sparsification is an efficient approach to accelerate CNN inference, but it is challenging to take advantage of sparsity in training procedure because the involved gradients are dynamically changed. Actually, an important observation shows that most of the activation gradients in back-propagation are very close to zero and only have a tiny impact on weight-updating. Hence, we consider pruning these very small gradients randomly to accelerate CNN training according to the statistical distribution of activation gradients. Meanwhile, we theoretically analyze the impact of pruning algorithm on the convergence. The proposed approach is evaluated on AlexNet and ResNet-{18, 34, 50, 101, 152} with CIFAR-{10, 100} and ImageNet datasets. Experimental results show that our training approach could substantially achieve up to $5.92 times$ speedups at back-propagation stage with negligible accuracy loss.
It is not easy to design and run Convolutional Neural Networks (CNNs) due to: 1) finding the optimal number of filters (i.e., the width) at each layer is tricky, given an architecture; and 2) the computational intensity of CNNs impedes the deployment
The increasing size of neural network models has been critical for improvements in their accuracy, but device memory is not growing at the same rate. This creates fundamental challenges for training neural networks within limited memory environments.
In this paper, we propose an adaptive pruning method. This method can cut off the channel and layer adaptively. The proportion of the layer and the channel to be cut is learned adaptively. The pruning method proposed in this paper can reduce half of
This work aims to enable on-device training of convolutional neural networks (CNNs) by reducing the computation cost at training time. CNN models are usually trained on high-performance computers and only the trained models are deployed to edge devic
We propose ResRep, a novel method for lossless channel pruning (a.k.a. filter pruning), which slims down a CNN by reducing the width (number of output channels) of convolutional layers. Inspired by the neurobiology research about the independence of