Filter pruning is widely used to reduce the computational cost of deep learning, enabling the deployment of Deep Neural Networks (DNNs) on resource-limited devices. The conventional Hard Filter Pruning (HFP) method zeroizes pruned filters and stops updating them, thereby shrinking the search space of the model. In contrast, Soft Filter Pruning (SFP) merely zeroizes pruned filters but keeps updating them in subsequent training epochs, thus preserving the capacity of the network. However, SFP and its variants converge much more slowly than HFP because of this larger search space. Our question is whether SFP-based methods and HFP can be combined to achieve better performance while also speeding up convergence. We first generalize SFP-based methods and HFP to analyze their characteristics. We then propose a Gradually Hard Filter Pruning (GHFP) method that smoothly switches from SFP-based pruning to HFP during training and pruning: the model keeps a large search space at first, and its capacity is reduced gradually to ensure a moderate convergence speed. Experimental results on CIFAR-10/100 show that our method achieves state-of-the-art performance.
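To make the SFP-to-HFP transition concrete, the sketch below gives one plausible reading of the idea in PyTorch-style Python. The names (`prune_step`, `hard_ratio_schedule`, `apply_freeze`), the L1-norm ranking, and the linear schedule are illustrative assumptions rather than the authors' implementation; only the mechanism (zeroize the lowest-ranked filters and permanently freeze a growing fraction of them) follows the description above.

```python
# Minimal sketch of a gradual SFP-to-HFP switch (names and schedule are assumptions).
import torch
import torch.nn as nn


def l1_filter_ranking(conv: nn.Conv2d) -> torch.Tensor:
    """Return filter indices sorted by ascending L1 norm (least important first)."""
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    return torch.argsort(norms)


def hard_ratio_schedule(epoch: int, total_epochs: int) -> float:
    """Illustrative schedule: linearly grow the hard-pruned fraction from 0 to 1."""
    return min(1.0, epoch / max(1, total_epochs - 1))


def prune_step(conv: nn.Conv2d, prune_rate: float, hard_ratio: float,
               frozen_mask: torch.Tensor) -> torch.Tensor:
    """Zeroize the lowest-ranked filters; mark a growing fraction as hard-pruned.

    prune_rate:  fraction of filters zeroized this epoch (soft + hard).
    hard_ratio:  fraction of those filters treated as hard-pruned
                 (0.0 -> pure SFP behaviour, 1.0 -> pure HFP behaviour).
    frozen_mask: boolean mask [out_channels] of filters already hard-pruned.
    """
    n_filters = conv.weight.shape[0]
    n_pruned = int(prune_rate * n_filters)
    pruned_idx = l1_filter_ranking(conv)[:n_pruned]

    with torch.no_grad():
        conv.weight[pruned_idx] = 0.0            # soft prune: zeroize, keep trainable
        n_hard = int(hard_ratio * n_pruned)
        frozen_mask[pruned_idx[:n_hard]] = True  # hard prune: also stop future updates

    return frozen_mask


def apply_freeze(conv: nn.Conv2d, frozen_mask: torch.Tensor) -> None:
    """Zero the gradients of hard-pruned filters so the optimizer never updates them."""
    if conv.weight.grad is not None:
        conv.weight.grad[frozen_mask] = 0.0


# Per-epoch usage (schematic):
#   ratio  = hard_ratio_schedule(epoch, total_epochs)
#   frozen = prune_step(conv, prune_rate=0.3, hard_ratio=ratio, frozen_mask=frozen)
#   ... forward / loss.backward() ...
#   apply_freeze(conv, frozen)   # call before optimizer.step()
```

In this sketch the only knob that changes over training is `hard_ratio`: early epochs behave like SFP (pruned filters may recover), while later epochs behave like HFP (pruned filters stay frozen), matching the gradual capacity reduction described above.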