Channel pruning is a promising technique to compress the parameters of deep convolutional neural networks (DCNNs) and to speed up inference. This paper aims to address the long-standing inefficiency of channel pruning. Most channel pruning methods recover prediction accuracy by retraining the pruned model from the remaining parameters or from random initialization. This retraining process depends heavily on sufficient computational resources, training data, and human intervention (tuning the training strategy). In this paper, a highly efficient pruning method is proposed to significantly reduce the cost of pruning DCNNs. The main contributions of our method include: 1) pruning compensation, a fast and data-efficient substitute for retraining that minimizes the post-pruning reconstruction loss of features; 2) compensation-aware pruning (CaP), a novel pruning algorithm that removes redundant or less-weighted channels by minimizing the loss of information; and 3) binary structural search with a step constraint to minimize human intervention. On benchmarks including CIFAR-10/100 and ImageNet, our method achieves pruning performance competitive with state-of-the-art retraining-based methods and, more importantly, reduces processing time by 95% and data usage by 90%.
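The pruning-compensation idea can be illustrated with a small least-squares sketch. This is a minimal illustration, not the authors' implementation: it assumes a linear (fully connected or 1x1 convolution) layer and a batch of sampled input features, and the function name `compensate` is hypothetical. The weights over the surviving channels are re-solved in closed form to minimize the post-pruning feature reconstruction loss, with no retraining.

```python
import numpy as np

def compensate(X, W, keep):
    """Least-squares pruning compensation for a linear layer Y = X @ W.T.

    X    : (N, C_in) sampled input features
    W    : (C_out, C_in) original weights
    keep : indices of the input channels that survive pruning

    Returns new weights over the kept channels that minimize the
    post-pruning feature reconstruction loss ||X[:, keep] @ W_new.T - Y||^2.
    """
    Y = X @ W.T                       # original layer outputs
    X_kept = X[:, keep]               # inputs restricted to surviving channels
    # Closed-form least-squares solution, one solve for all output channels
    W_new, *_ = np.linalg.lstsq(X_kept, Y, rcond=None)
    return W_new.T                    # (C_out, len(keep))

# toy usage on random data: prune the last 16 of 64 input channels
rng = np.random.default_rng(0)
X = rng.normal(size=(512, 64))
W = rng.normal(size=(128, 64))
keep = np.arange(48)
W_new = compensate(X, W, keep)
err = np.linalg.norm(X[:, keep] @ W_new.T - X @ W.T)
print(f"reconstruction error after compensation: {err:.3f}")
```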
In this paper, we introduce a new channel pruning method to accelerate very deep convolutional neural networks. Given a trained CNN model, we propose an iterative two-step algorithm to effectively prune each layer, by a LASSO regression based channel selection and least square reconstruction.
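The two-step algorithm can be sketched on a simplified linear layer: a LASSO regression over per-channel scaling factors selects channels (coefficients driven exactly to zero mark channels to prune), then least squares reconstructs the original outputs from the survivors. The helper `lasso_channel_select` and the `alpha` value are illustrative assumptions; the actual method operates on convolutional feature maps sampled from the network.

```python
import numpy as np
from sklearn.linear_model import Lasso

def lasso_channel_select(X, W, alpha=0.02):
    """Two-step pruning sketch for a linear layer Y = X @ W.T.

    X : (N, C) sampled inputs, one column per input channel
    W : (C_out, C) original weights

    Step 1: LASSO over per-channel scaling factors beta; channels whose
    coefficient is driven exactly to zero are pruned.
    Step 2: least-squares reconstruction of the original outputs from
    the surviving channels.
    """
    Y = X @ W.T
    N, C = X.shape
    # Z[:, i] is the flattened contribution of channel i to the output,
    # so Y.ravel() == Z @ ones(C) for the unpruned layer
    Z = np.stack([np.outer(X[:, i], W[:, i]).ravel() for i in range(C)], axis=1)
    beta = Lasso(alpha=alpha, fit_intercept=False).fit(Z, Y.ravel()).coef_
    keep = np.flatnonzero(beta)           # surviving channels
    W_new, *_ = np.linalg.lstsq(X[:, keep], Y, rcond=None)
    return keep, W_new.T

# toy usage: make the last 8 of 32 channels nearly redundant
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 32))
W = rng.normal(size=(64, 32))
W[:, 24:] *= 0.05
keep, W_new = lasso_channel_select(X, W)
print(f"kept {keep.size}/32 channels")
```

Larger `alpha` trades reconstruction fidelity for sparsity; in this sketch the weak channels fall below the LASSO threshold and are dropped while the strong ones survive.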
Model compression aims to reduce the redundancy of deep networks to obtain compact models. Recently, channel pruning has become one of the predominant compression methods for deploying deep models on resource-constrained devices.
To apply deep CNNs on mobile terminals and portable devices, many researchers have recently worked on compressing and accelerating deep convolutional neural networks. Based on this, we propose a novel uniform channel pruning (UCP) method to prune deep CNNs.
The performance of Deep Neural Networks (DNNs) has kept improving in recent years with increasing network depth and width. To enable DNNs on edge devices such as mobile phones, researchers have proposed several network compression methods, including pruning and quantization.
To accelerate DNN inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition.
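As a generic instance of the low-rank idea (not any specific paper's pipeline), a dense weight matrix can be factorized by truncated SVD, replacing one matrix multiply of cost O(mn) with two thinner ones of cost O(r(m+n)):

```python
import numpy as np

def low_rank_approx(W, rank):
    """Approximate a weight matrix by a rank-r factorization via truncated SVD.

    A dense layer y = W x with W of shape (m, n) is replaced by two
    thinner layers y = U_r (V_r x), with U_r: (m, r) and V_r: (r, n).
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]      # fold singular values into the left factor
    V_r = Vt[:rank, :]
    return U_r, V_r

# toy usage: rank-32 approximation of a 256x512 weight matrix
W = np.random.default_rng(1).normal(size=(256, 512))
U_r, V_r = low_rank_approx(W, rank=32)
rel_err = np.linalg.norm(W - U_r @ V_r) / np.linalg.norm(W)
print(f"relative approximation error at rank 32: {rel_err:.3f}")
```

By the Eckart-Young theorem the truncated SVD is the best rank-r approximation in the Frobenius norm, which is the theoretical rationale the abstract alludes to.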