Auto-ML pruning methods aim to search automatically for a pruning strategy that reduces the computational complexity of deep Convolutional Neural Networks (deep CNNs). However, previous work has found that the results of many Auto-ML pruning methods cannot even surpass those of the uniform pruning method. In this paper, we first analyze the reason for the ineffectiveness of Auto-ML pruning. We then propose a stage-wise pruning (SP) method to solve this problem. As with most previous Auto-ML pruning methods, SP trains a super-net that provides proxy performance for sub-nets and searches for the sub-net with the best proxy performance. Unlike previous works, we split a deep CNN into several stages and use a full-net, in which no layer is pruned, to supervise both the training and the searching of sub-nets. Remarkably, the proxy performance of sub-nets trained with SP is closer to their actual performance than in most previous Auto-ML pruning works. As a result, SP achieves state-of-the-art results on both CIFAR-10 and ImageNet under the mobile setting.
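The stage-wise supervision described above can be illustrated with a toy sketch. This is one possible reading of the abstract, not the paper's implementation. The following are assumptions made for illustration: each stage is a tiny two-layer block, a sub-net keeps the first `width` hidden channels, the proxy is the mean squared error against the full-net's stage output, and the names `make_stage`, `run_stage`, `stage_proxy_error`, and `search_stage_widths` are hypothetical. No training is performed; the sketch only shows how an unpruned full-net could supply both the input and the target for each stage during the search.

```python
import random


def make_stage(n_in, n_hidden, n_out, rng):
    """Build a toy stage: two dense layers stored as nested lists (assumed layout)."""
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
    return w1, w2


def run_stage(stage, x, width=None):
    """Run one stage; if `width` is given, keep only the first `width` hidden channels."""
    w1, w2 = stage
    k = len(w1) if width is None else width
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1[:k]]
    return [sum(w * h for w, h in zip(row[:k], hidden)) for row in w2]


def stage_proxy_error(stage, width, stage_inputs):
    """Mean squared error between the pruned stage and the unpruned stage."""
    total = 0.0
    for x in stage_inputs:
        full = run_stage(stage, x)
        sub = run_stage(stage, x, width)
        total += sum((f - s) ** 2 for f, s in zip(full, sub)) / len(full)
    return total / len(stage_inputs)


def search_stage_widths(stages, inputs, candidate_widths, max_error):
    """For each stage, pick the smallest candidate width within `max_error`."""
    chosen = []
    current = inputs
    for stage in stages:
        pick = len(stage[0])  # fall back to the unpruned width
        for width in sorted(candidate_widths):
            if stage_proxy_error(stage, width, current) <= max_error:
                pick = width
                break
        chosen.append(pick)
        # The next stage is fed the full-net's features, not the sub-net's,
        # so errors from earlier pruned stages do not accumulate.
        current = [run_stage(stage, x) for x in current]
    return chosen


if __name__ == "__main__":
    rng = random.Random(0)
    stages = [make_stage(4, 8, 4, rng) for _ in range(3)]
    inputs = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(5)]
    print(search_stage_widths(stages, inputs, [2, 4, 8], max_error=0.5))
```

Because each stage is evaluated against the full-net's features independently, the search space in this sketch grows with the number of candidates per stage rather than with their product across stages. Whether this matches the paper's exact formulation is not stated in the abstract.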
Previous AutoML pruning works utilized individual layer features to automatically prune filters. We analyze the correlation for two layers from different blocks which have a short-cut structure. It is found that, in one block, the deeper layer has ma
In recent years, deep neural networks have achieved great success in the field of computer vision. However, it is still a big challenge to deploy these deep models on resource-constrained embedded devices such as mobile robots, smart phones and so on
To apply deep CNNs to mobile terminals and portable devices, many scholars have recently worked on compressing and accelerating deep convolutional neural networks. Based on this, we propose a novel uniform channel pruning (UCP) method to prune de
In the traditional deep compression framework, iteratively performing network pruning and quantization can reduce the model size and computation cost to meet the deployment requirements. However, such a step-wise application of pruning and quantizati
Network compression has been widely studied since it is able to reduce the memory and computation cost during inference. However, previous methods seldom deal with complicated structures like residual connections, group/depth-wise convolution and fea