To deploy a well-trained CNN model on low-end edge devices, the model usually has to be compressed or pruned to meet a given computation budget (e.g., FLOPs). Current filter pruning methods mainly leverage feature maps to generate importance scores for filters and prune those with smaller scores, which ignores the variance across input batches and the resulting differences in sparse structure over filters. In this paper, we propose a data-agnostic filter pruning method that uses an auxiliary network, named the Dagger module, to induce pruning; it takes pretrained weights as input to learn the importance of each filter. In addition, to prune filters under a given FLOPs constraint, we leverage an explicit FLOPs-aware regularization that directly promotes pruning toward the target FLOPs. Extensive experimental results on the CIFAR-10 and ImageNet datasets indicate our superiority over other state-of-the-art filter pruning methods. For example, our 50%-FLOPs ResNet-50 achieves 76.1% Top-1 accuracy on ImageNet, surpassing many other filter pruning methods.
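As a concrete illustration of the two ideas in this abstract, here is a minimal sketch, not the authors' implementation: per-filter gates predicted from pretrained weights rather than feature maps, plus a FLOPs-aware regularizer that penalizes deviation from a target budget. The names DaggerGate, flops_loss, and per_filter_flops are hypothetical.

```python
import torch
import torch.nn as nn

class DaggerGate(nn.Module):
    """Auxiliary module mapping a conv layer's pretrained, flattened
    kernels to per-filter importance gates in [0, 1]."""
    def __init__(self, weight_dim):
        super().__init__()
        self.scorer = nn.Linear(weight_dim, 1)

    def forward(self, conv_weight):
        # conv_weight: (num_filters, in_ch * k * k), flattened pretrained kernels
        scores = self.scorer(conv_weight).squeeze(-1)   # (num_filters,)
        return torch.sigmoid(scores)                    # soft gates

def flops_loss(gates, per_filter_flops, target_flops):
    """Penalize deviation of the soft-gated FLOPs from the budget."""
    gated_flops = (gates * per_filter_flops).sum()
    return (gated_flops / target_flops - 1.0) ** 2
```

In such a scheme, filters whose learned gates fall below a threshold would be removed, and the FLOPs term steers how many gates can stay open.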
The existence of redundancy in Convolutional Neural Networks (CNNs) enables us to remove some filters/channels with acceptable performance drops. However, the training objective of CNNs usually tends to minimize an accuracy-related loss function with …
In this paper, we present an approach for Recurrent Iterative Gating called RIGNet. The core elements of RIGNet involve recurrent connections that control the flow of information in neural networks in a top-down manner, and different variants on the …
In this paper, we present a canonical structure for controlling information flow in neural networks with an efficient feedback routing mechanism based on a strategy of Distributed Iterative Gating (DIGNet). The structure of this mechanism derives from …
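Since the two abstracts above are truncated, the following is only a rough sketch of the general idea behind top-down recurrent gating as described there: higher-level features produce a gate that modulates lower-level input, iterated over several steps. IterativeGatedBlock and its structure are hypothetical, not the RIGNet/DIGNet architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IterativeGatedBlock(nn.Module):
    def __init__(self, channels, steps=3):
        super().__init__()
        self.steps = steps
        self.bottom_up = nn.Conv2d(channels, channels, 3, padding=1)
        # top-down path that turns higher-level features into a gate
        self.top_down = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        feat = F.relu(self.bottom_up(x))
        for _ in range(self.steps):
            gate = torch.sigmoid(self.top_down(feat))  # top-down gate in [0, 1]
            feat = F.relu(self.bottom_up(gate * x))    # re-process gated input
        return feat
```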
The performance of Deep Neural Networks (DNNs) has kept improving in recent years with increasing network depth and width. To enable DNNs on edge devices like mobile phones, researchers have proposed several network compression methods including pruning, quantization, …
To accelerate DNN inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition …
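The snippet below illustrates the general low-rank approximation idea mentioned here with a truncated SVD of a pretrained fully connected layer. It is a generic sketch, not the method of any particular paper above; low_rank_factorize is a hypothetical helper.

```python
import torch
import torch.nn as nn

def low_rank_factorize(linear: nn.Linear, rank: int) -> nn.Sequential:
    """Replace one (out, in) weight matrix with two rank-r factors,
    so inference costs r*(out+in) multiply-adds instead of out*in."""
    W = linear.weight.data                        # (out_features, in_features)
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * S[:rank]                  # fold singular values into U
    V_r = Vh[:rank, :]                            # (rank, in_features)
    first = nn.Linear(linear.in_features, rank, bias=False)
    second = nn.Linear(rank, linear.out_features, bias=linear.bias is not None)
    first.weight.data.copy_(V_r)
    second.weight.data.copy_(U_r)
    if linear.bias is not None:
        second.bias.data.copy_(linear.bias.data)
    return nn.Sequential(first, second)
```

Choosing the rank r trades accuracy for speed: smaller r discards more singular values, shrinking compute while increasing the approximation error of the original layer.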