
Demystifying Neural Network Filter Pruning

Added by Zhuwei Qin
Publication date: 2018
Language: English





Conventional filter pruning methods for Convolutional Neural Networks (CNNs), which rank filters by magnitude (e.g. the L1 norm), have proven highly effective at reducing computation load. Although effective, these methods are rarely analyzed from the perspective of filter functionality. In this work, we examine filter pruning and retraining through qualitative interpretation of filter functionality. We find that magnitude-based pruning fails to eliminate filters with repetitive functionality, and that the retraining phase in effect reconstructs the remaining filters to compensate for the functionality of wrongly pruned critical filters. With a proposed functionality-oriented pruning method, we further verify that, by precisely addressing filter functionality redundancy, a CNN can be pruned without a considerable accuracy drop and without any retraining phase.
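To make the magnitude criterion concrete, here is a minimal PyTorch sketch of L1-norm filter ranking and pruning for a single convolutional layer. The layer sizes and the keep_ratio parameter are illustrative assumptions; a full pipeline would also have to adjust the input channels of the following layer and, in the conventional setting analyzed above, retrain afterwards.

```python
# Minimal sketch of L1-norm (magnitude) filter ranking, the conventional
# criterion discussed above. Layer sizes and names are illustrative only.
import torch
import torch.nn as nn

def l1_filter_ranking(conv: nn.Conv2d):
    """Return filter indices sorted from smallest to largest L1 norm."""
    # conv.weight has shape (out_channels, in_channels, kH, kW);
    # each output filter gets one score: the sum of its absolute weights.
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    return torch.argsort(scores)          # ascending: candidates to prune first

def prune_smallest_filters(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Build a new Conv2d that keeps only the highest-L1-norm filters."""
    order = l1_filter_ranking(conv)
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    keep = torch.sort(order[-n_keep:]).values      # indices of filters to retain
    new_conv = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                         stride=conv.stride, padding=conv.padding,
                         bias=conv.bias is not None)
    new_conv.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        new_conv.bias.data = conv.bias.data[keep].clone()
    return new_conv

# Example: halve the filters of a single convolutional layer.
layer = nn.Conv2d(64, 128, kernel_size=3, padding=1)
pruned = prune_smallest_filters(layer, keep_ratio=0.5)
print(pruned.weight.shape)   # torch.Size([64, 64, 3, 3])
```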



Related research


Fanxu Meng, Hao Cheng, Ke Li (2020)
Pruning has become a very powerful and effective technique to compress and accelerate modern neural networks. Existing pruning methods can be grouped into two categories: filter pruning (FP) and weight pruning (WP). FP wins at hardware compatibility but loses at the compression ratio compared with WP. To combine the strengths of both methods, we propose to prune the filter in the filter. Specifically, we treat a filter $F \in \mathbb{R}^{C\times K\times K}$ as $K \times K$ stripes, i.e., $1\times 1$ filters in $\mathbb{R}^{C}$; by pruning the stripes instead of the whole filter, we achieve finer granularity than traditional FP while remaining hardware friendly. We term our method SWP (Stripe-Wise Pruning). SWP is implemented by introducing a novel learnable matrix called the Filter Skeleton, whose values reflect the shape of each filter. As some recent work has shown that the pruned architecture is more crucial than the inherited important weights, we argue that the architecture of a single filter, i.e., its shape, also matters. Through extensive experiments, we demonstrate that SWP is more effective than previous FP-based methods and achieves a state-of-the-art pruning ratio on the CIFAR-10 and ImageNet datasets without an obvious accuracy drop. Code is available at https://github.com/fxmeng/Pruning-Filter-in-Filter
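For readers who want to see the stripe view in code, the following is a simplified sketch of scoring each filter stripe-by-stripe through a learnable skeleton-style tensor. It is an illustrative re-implementation under assumed names, not the authors' released code (see the repository linked above).

```python
# Simplified sketch of stripe-wise scoring: each 3x3 filter gets one learnable
# score per spatial position (one per stripe). During training an L1 penalty
# on the skeleton would drive unneeded stripe scores toward zero.
import torch
import torch.nn as nn

class StripeScoredConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, padding=1):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, k, padding=padding, bias=False)
        # One learnable score per stripe, i.e. per (filter, spatial position).
        self.skeleton = nn.Parameter(torch.ones(out_ch, 1, k, k))

    def forward(self, x):
        # Scale each stripe of the weight by its skeleton score.
        w = self.conv.weight * self.skeleton
        return nn.functional.conv2d(x, w, padding=self.conv.padding)

    def stripe_sparsity(self, threshold=0.01):
        # Fraction of stripes whose score fell below the pruning threshold.
        return (self.skeleton.abs() < threshold).float().mean().item()

layer = StripeScoredConv(16, 32)
y = layer(torch.randn(1, 16, 8, 8))
print(y.shape, layer.stripe_sparsity())
```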
Filter pruning is widely used to reduce the computation of deep learning, enabling the deployment of Deep Neural Networks (DNNs) on resource-limited devices. The conventional Hard Filter Pruning (HFP) method zeroizes pruned filters and stops updating them, thus reducing the search space of the model. On the contrary, Soft Filter Pruning (SFP) simply zeroizes pruned filters but keeps updating them in the following training epochs, thus maintaining the capacity of the network. However, SFP and its variants converge much more slowly than HFP due to their larger search space. Our question is whether SFP-based methods and HFP can be combined to achieve better performance and faster convergence. Firstly, we generalize SFP-based methods and HFP to analyze their characteristics. Then we propose a Gradually Hard Filter Pruning (GHFP) method that smoothly switches from SFP-based methods to HFP during training and pruning, thus maintaining a large search space at first and gradually reducing the capacity of the model to ensure a moderate convergence speed. Experimental results on CIFAR-10/100 show that our method achieves state-of-the-art performance.
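The difference between soft and hard pruning, and the gradual switch between them, can be sketched as follows. The L1 criterion, the hard_fraction schedule parameter, and all function names are assumptions for illustration rather than the paper's implementation.

```python
# Illustrative contrast of soft vs. hard filter pruning. Soft pruning zeroizes
# the selected filters but lets them keep training; hard pruning additionally
# freezes them by masking their gradients. Raising hard_fraction over epochs
# gives the gradual SFP-to-HFP switch described above.
import torch
import torch.nn as nn

def prune_step(conv: nn.Conv2d, prune_ratio: float, hard_fraction: float):
    """Zeroize the lowest-L1 filters; freeze a growing share of them ("hard")."""
    with torch.no_grad():
        scores = conv.weight.abs().sum(dim=(1, 2, 3))
        n_prune = int(prune_ratio * conv.out_channels)
        pruned = torch.argsort(scores)[:n_prune]        # filters to zeroize
        conv.weight[pruned] = 0.0                       # soft pruning: zeroize
    n_hard = int(hard_fraction * n_prune)               # share switched to hard
    frozen = pruned[:n_hard]
    mask = torch.ones(conv.out_channels, 1, 1, 1)
    mask[frozen] = 0.0                                  # block future updates
    # Call handle.remove() before re-applying the step in a later epoch.
    handle = conv.weight.register_hook(lambda g: g * mask)
    return handle

conv = nn.Conv2d(16, 32, 3, padding=1)
# Early epochs: hard_fraction near 0 (pure SFP); later epochs: near 1 (HFP).
h = prune_step(conv, prune_ratio=0.3, hard_fraction=0.5)
```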
This paper proposes a reliable neural network pruning algorithm by setting up a scientific control. Existing pruning methods have developed various hypotheses to approximate the importance of filters to the network and then execute filter pruning accordingly. To increase the reliability of the results, we prefer a more rigorous research design that includes a scientific control group as an essential part, minimizing the effect of all factors except the association between the filter and the expected network output. Acting as the control group, knockoff features are generated to mimic the feature maps produced by the network filters, but they are conditionally independent of the example label given the real feature maps. We theoretically suggest that the knockoff condition can be approximately preserved given the information propagation of network layers. Besides the real feature map on an intermediate layer, the corresponding knockoff feature is brought in as another auxiliary input signal for the subsequent layers. Redundant filters can be discovered in the adversarial process of the different features. Through experiments, we demonstrate the superiority of the proposed algorithm over state-of-the-art methods. For example, our method can reduce 57.8% of the parameters and 60.2% of the FLOPs of ResNet-101 with only a 0.01% top-1 accuracy loss on ImageNet. The code is available at https://github.com/huawei-noah/Pruning/tree/master/SCOP_NeurIPS2020.
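A heavily simplified sketch of the control-group idea: each channel's real feature map competes with its knockoff counterpart through a pair of learnable per-channel scales, and channels where the knockoff scale dominates are flagged as redundant. All names and the exact parameterization are assumptions; the released SCOP code linked above is the authoritative reference.

```python
# Each channel's real feature competes with a knockoff counterpart through two
# learnable scales tied together by a softmax. Channels where the knockoff
# scale wins carry little label information and are candidates for pruning.
import torch
import torch.nn as nn

class RealVsKnockoffGate(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # One scale per channel for the real path and one for the knockoff path.
        self.real_logit = nn.Parameter(torch.zeros(channels))
        self.knock_logit = nn.Parameter(torch.zeros(channels))

    def forward(self, real_feat, knockoff_feat):
        # Softmax over the two logits makes the scales compete per channel.
        scales = torch.softmax(torch.stack([self.real_logit, self.knock_logit]), dim=0)
        a = scales[0].view(1, -1, 1, 1)   # weight on the real feature map
        b = scales[1].view(1, -1, 1, 1)   # weight on the knockoff feature map
        return a * real_feat + b * knockoff_feat

    def redundant_channels(self):
        # At initialization the scales are equal; training decides the winners.
        return (self.knock_logit >= self.real_logit).nonzero(as_tuple=True)[0]

gate = RealVsKnockoffGate(channels=8)
real = torch.randn(2, 8, 4, 4)
knock = torch.randn(2, 8, 4, 4)   # in SCOP this would mimic the real features
out = gate(real, knock)
print(out.shape, gate.redundant_channels())
```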
Yun Li, Weiqun Wu, Zechun Liu (2020)
In this paper, we propose a simple and effective network pruning framework that introduces novel weight-dependent gates to prune filters adaptively. We argue that the pruning decision should depend on the convolutional weights; in other words, it should be a learnable function of the filter weights. We therefore construct weight-dependent gates (W-Gates) to learn information from the filter weights and obtain binary filter gates that prune or keep filters automatically. To prune the network under a hardware constraint, we train a Latency Predict Net (LPNet) to estimate the hardware latency of candidate pruned networks. Based on the proposed LPNet, we can optimize W-Gates and the pruning ratio of each layer under a latency constraint. The whole framework is differentiable and can be optimized by gradient-based methods to achieve a compact network with a better trade-off between accuracy and efficiency. We have demonstrated the effectiveness of our method on ResNet34, ResNet50, and MobileNetV2, achieving up to 1.33/1.28/1.1 higher Top-1 accuracy with lower hardware latency on ImageNet. Compared with state-of-the-art pruning methods, our method achieves superior performance.
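The core idea of a gate that is a learnable function of the filter weights can be sketched as follows. The small MLP, the straight-through estimator, and all names are illustrative assumptions, and the LPNet latency term is omitted.

```python
# Illustrative weight-dependent gate: a tiny learned function maps each
# filter's weights to a binary keep/prune decision, made differentiable with a
# straight-through estimator so the whole thing can be trained by gradients.
import torch
import torch.nn as nn

class WeightDependentGates(nn.Module):
    def __init__(self, conv: nn.Conv2d, hidden=16):
        super().__init__()
        self.conv = conv
        in_dim = conv.in_channels * conv.kernel_size[0] * conv.kernel_size[1]
        # Small MLP that reads each filter's flattened weights and emits a logit.
        self.gate_fn = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, 1))

    def gates(self):
        w = self.conv.weight.flatten(start_dim=1)          # (out_ch, in_dim)
        logits = self.gate_fn(w).squeeze(-1)               # one logit per filter
        hard = (logits > 0).float()                        # binary keep/prune
        soft = torch.sigmoid(logits)
        # Straight-through estimator: forward uses hard gates, backward uses sigmoid.
        return hard + soft - soft.detach()

    def forward(self, x):
        g = self.gates().view(1, -1, 1, 1)
        return self.conv(x) * g                            # gate the filter outputs

layer = WeightDependentGates(nn.Conv2d(16, 32, 3, padding=1))
y = layer(torch.randn(1, 16, 8, 8))
print(y.shape, layer.gates().sum().item(), "filters kept")
```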
We present a provable, sampling-based approach for generating compact Convolutional Neural Networks (CNNs) by identifying and removing redundant filters from an over-parameterized network. Our algorithm uses a small batch of input data points to assign a saliency score to each filter and constructs an importance sampling distribution in which filters that highly affect the output are sampled with correspondingly high probability. In contrast to existing filter pruning approaches, our method is simultaneously data-informed, exhibits provable guarantees on the size and performance of the pruned network, and is widely applicable to varying network architectures and data sets. Our analytical bounds bridge the notions of compressibility and importance of network structures, giving rise to a fully automated procedure for identifying and preserving the filters in the layers that are essential to the network's performance. Our experimental evaluations on popular architectures and data sets show that our algorithm consistently generates sparser and more efficient models than those constructed by existing filter pruning approaches.
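As a rough illustration of the data-informed sampling step, the sketch below scores each filter on a small batch and samples the filters to keep from the resulting distribution. The particular saliency measure (mean absolute activation) and all names are assumptions, not the paper's provable construction.

```python
# Score each filter on a small batch, normalize the scores into an
# importance-sampling distribution, and sample the filters to keep with
# probability proportional to their saliency.
import torch
import torch.nn as nn

def sample_filters(conv: nn.Conv2d, batch: torch.Tensor, n_keep: int):
    with torch.no_grad():
        activations = conv(batch)                          # (B, out_ch, H, W)
        saliency = activations.abs().mean(dim=(0, 2, 3))   # one score per filter
    probs = saliency / saliency.sum()                      # sampling distribution
    keep = torch.multinomial(probs, n_keep, replacement=False)
    return keep.sort().values, probs

conv = nn.Conv2d(3, 64, 3, padding=1)
small_batch = torch.randn(8, 3, 32, 32)                    # a few data points suffice
kept, probs = sample_filters(conv, small_batch, n_keep=32)
print(kept.shape, probs.sum().item())
```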