Previous AutoML pruning works used the features of individual layers to prune filters automatically. We analyze the correlation between two layers from different blocks that have a shortcut structure. We find that, within one block, the deeper layer contains many redundant filters that can be represented by filters in the preceding layer, so pruning needs to take information from other layers into account. In this paper, we propose a graph pruning approach that views any deep model as a topology graph. Graph PruningNet, which is based on the graph convolution network, is designed to automatically extract neighboring information for each node. To extract features from various topologies, Graph PruningNet is connected to the Pruned Network by an individual fully connected layer for each node, and the two are jointly trained from scratch on a training dataset. Thus, we can obtain reasonable weights for a sub-network of any size. We then search for the best configuration of the Pruned Network using reinforcement learning. Unlike previous work, we use the node features from the well-trained Graph PruningNet, instead of hand-crafted features, as the states in reinforcement learning. Compared with other AutoML pruning works, our method achieves state-of-the-art results under the same conditions on ImageNet-2012. The code will be released on GitHub.
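To make the idea of "extracting neighboring information for each node" concrete, the following is a minimal sketch of one generic graph-convolution step over a small layer-topology graph. It is not the Graph PruningNet architecture itself. The four-node residual block, the two-dimensional node features, the identity weight matrix, and all function names are hypothetical choices made purely for illustration.

```python
# Illustrative sketch only: one mean-aggregation graph-convolution step.
# The graph, features, and weights below are assumptions, not taken from the paper.

def normalized_adjacency(num_nodes, edges):
    """Build a row-normalized adjacency matrix with self-loops."""
    adj = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        adj[i][i] = 1.0  # self-loop so a node keeps its own features
    for u, v in edges:
        adj[u][v] = 1.0
        adj[v][u] = 1.0  # treat the topology as undirected
    for row in adj:
        total = sum(row)
        for j in range(num_nodes):
            row[j] /= total  # each row now averages over the neighborhood
    return adj


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def graph_conv(adj, features, weight):
    """Aggregate neighbor features, project them, and apply ReLU."""
    aggregated = matmul(adj, features)
    projected = matmul(aggregated, weight)
    return [[max(0.0, value) for value in row] for row in projected]


# Hypothetical residual block: 0 -> 1 -> 2 -> 3, plus a shortcut edge 0 -> 3.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
adj = normalized_adjacency(4, edges)

# Hypothetical per-layer features (two arbitrary descriptors per node).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]  # identity, so the output equals the neighborhood mean

hidden = graph_conv(adj, features, weight)
```

With the identity weight, each row of `hidden` is simply the mean of the features of a node and its neighbors, so the shortcut edge lets node 3 see node 0 directly. In a trained model the weight matrix would be learned, and several such steps would be stacked.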
Auto-ML pruning methods aim to search for a pruning strategy automatically to reduce the computational complexity of deep Convolutional Neural Networks (deep CNNs). However, some previous works found that the results of many Auto-ML pruning methods eve
In the traditional deep compression framework, iteratively performing network pruning and quantization can reduce the model size and computation cost to meet the deployment requirements. However, such a step-wise application of pruning and quantizati
Network compression has been widely studied since it is able to reduce the memory and computation cost during inference. However, previous methods seldom deal with complicated structures like residual connections, group/depth-wise convolution and fea
We study the neural network (NN) compression problem, viewing the tension between the compression ratio and NN performance through the lens of rate-distortion theory. We choose a distortion metric that reflects the effect of NN compression on the mod
In recent years, deep neural networks have achieved great success in the field of computer vision. However, it is still a big challenge to deploy these deep models on resource-constrained embedded devices such as mobile robots, smart phones and so on