Recent research on deep neural networks (DNNs) has primarily focused on improving model accuracy. Given a proper deep learning framework, it is generally possible to increase the depth or layer width to achieve a higher level of accuracy. However, the huge number of model parameters imposes greater computational and memory overhead and leads to parameter redundancy. In this paper, we address the parameter redundancy problem in DNNs by replacing conventional full projections with bilinear projections. For a fully-connected layer with $D$ input nodes and $D$ output nodes, applying a bilinear projection reduces the model space complexity from $\mathcal{O}(D^2)$ to $\mathcal{O}(2D)$, achieving a deep model with a sub-linear layer size. However, the structured projection has fewer degrees of freedom than the full projection, which can cause under-fitting. We therefore scale up the mapping size by increasing the number of output channels, which preserves and can even boost the model accuracy. This makes such deep models parameter-efficient and easy to deploy on mobile systems with memory limitations. Experiments on four benchmark datasets show that applying the proposed bilinear projection to deep neural networks can achieve even higher accuracies than conventional full DNNs, while significantly reducing the model size.
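As a rough illustration of the parameter saving described above, the sketch below implements a bilinear projection in NumPy by reshaping a $D$-dimensional input into a $d_1 \times d_2$ matrix and mapping it with two small factor matrices. The factor sizes ($d_1 = d_2 = 64$, so $D = 4096$) and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def bilinear_projection(x, U, V):
    """Map a D-dimensional vector with a bilinear projection.

    The input is reshaped into a d1 x d2 matrix X and transformed as
    U.T @ X @ V, so the layer stores d1*d1 + d2*d2 parameters instead
    of the (d1*d2)^2 parameters of a full D x D projection.
    """
    d1, d2 = U.shape[0], V.shape[0]
    X = x.reshape(d1, d2)
    return (U.T @ X @ V).reshape(-1)

rng = np.random.default_rng(0)
d1 = d2 = 64                      # D = d1 * d2 = 4096
U = rng.standard_normal((d1, d1))
V = rng.standard_normal((d2, d2))
x = rng.standard_normal(d1 * d2)

y = bilinear_projection(x, U, V)
print(y.shape)                    # (4096,)
print((d1 * d2) ** 2)             # full projection: 16,777,216 parameters
print(U.size + V.size)            # bilinear projection: 8,192 parameters (= 2D)
```

With $d_1 = d_2 = \sqrt{D}$, the parameter count is exactly $2D$, matching the $\mathcal{O}(2D)$ space complexity stated above; wider mappings can then be obtained by adding output channels, as the abstract describes.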
Channel pruning is a promising technique to compress the parameters of deep convolutional neural networks (DCNNs) and to speed up inference. This paper aims to address the long-standing inefficiency of channel pruning. Most channel pruning methods
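To make the setting concrete, the following NumPy sketch shows one common channel pruning criterion: ranking output channels by the L1 norm of their filters and dropping the smallest. This is a generic illustration under assumed names and shapes, not the specific method proposed in that paper.

```python
import numpy as np

def prune_channels(conv_weight, keep_ratio=0.5):
    """Magnitude-based channel pruning for a conv weight of shape
    (out_channels, in_channels, kH, kW): keep the channels whose filters
    have the largest L1 norms and return their indices so the following
    layer can drop the matching input channels."""
    l1 = np.abs(conv_weight).reshape(conv_weight.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(round(keep_ratio * conv_weight.shape[0])))
    keep = np.sort(np.argsort(l1)[-n_keep:])
    return conv_weight[keep], keep

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32, 3, 3))    # a 64-channel convolutional layer
W_pruned, kept = prune_channels(W, keep_ratio=0.5)
print(W_pruned.shape)                       # (32, 32, 3, 3)
```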
The performance of Deep Neural Networks (DNNs) has kept improving in recent years with increasing network depth and width. To enable DNNs on edge devices such as mobile phones, researchers have proposed several network compression methods, including pruning, quantization
To accelerate DNN inference, low-rank approximation has been widely adopted because of its solid theoretical rationale and efficient implementations. Several previous works attempted to directly approximate a pre-trained model by low-rank decomposition
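One standard way such a direct approximation can be done, sketched below under assumed shapes and names, is to factor a pre-trained fully-connected weight matrix with a truncated SVD so that a dense mapping x -> W x becomes two cheaper mappings x -> A (B x). This is a generic illustration, not necessarily the decomposition used in that work.

```python
import numpy as np

def low_rank_factorize(W, rank):
    """Approximate an m x n weight matrix W by A (m x r) and B (r x n)
    via truncated SVD, replacing m*n parameters with r*(m + n)."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]    # absorb singular values into the left factor
    B = Vt[:rank, :]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 1024))        # stand-in for a pre-trained dense layer
A, B = low_rank_factorize(W, rank=64)

x = rng.standard_normal(1024)
rel_err = np.linalg.norm(W @ x - A @ (B @ x)) / np.linalg.norm(W @ x)
print(W.size, A.size + B.size, rel_err)     # 524288 vs 98304 parameters, plus the approximation error
```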
Recently, the channel attention mechanism has been shown to offer great potential for improving the performance of deep convolutional neural networks (CNNs). However, most existing methods are dedicated to developing more sophisticated attention modules for
Parameters in deep neural networks that are trained on large-scale databases can generalize across multiple domains, a property referred to as transferability. Unfortunately, transferability is usually defined in terms of discrete states, and it differs with d