No Arabic abstract
In this paper, we present a novel approach for fine-tuning a decoder-side neural network in the context of image compression, such that the weight-updates are better compressible. At encoder side, we fine-tune a pre-trained artifact removal network on target data by using a compression objective applied on the weight-update. In particular, the compression objective encourages weight-updates which are sparse and closer to quantized values. This way, the final weight-update can be compressed more efficiently by pruning and quantization, and can be included into the encoded bitstream together with the image bitstream of a traditional codec. We show that this approach achieves reconstruction quality which is on-par or slightly superior to a traditional codec, at comparable bitrates. To our knowledge, this is the first attempt to combine image compression and neural networks weight update compression.
As the complexity of deep neural networks (DNNs) trend to grow to absorb the increasing sizes of data, memory and energy consumption has been receiving more and more attentions for industrial applications, especially on mobile devices. This paper presents a novel structure based on functional hashing to compress DNNs, namely FunHashNN. For each entry in a deep net, FunHashNN uses multiple low-cost hash functions to fetch values in the compression space, and then employs a small reconstruction network to recover that entry. The reconstruction network is plugged into the whole network and trained jointly. FunHashNN includes the recently proposed HashedNets as a degenerated case, and benefits from larger value capacity and less reconstruction loss. We further discuss extensions with dual space hashing and multi-hops. On several benchmark datasets, FunHashNN demonstrates high compression ratios with little loss on prediction accuracy.
We present an efficient coresets-based neural network compression algorithm that sparsifies the parameters of a trained fully-connected neural network in a manner that provably approximates the networks output. Our approach is based on an importance sampling scheme that judiciously defines a sampling distribution over the neural network parameters, and as a result, retains parameters of high importance while discarding redundant ones. We leverage a novel, empirical notion of sensitivity and extend traditional coreset constructions to the application of compressing parameters. Our theoretical analysis establishes guarantees on the size and accuracy of the resulting compressed network and gives rise to generalization bounds that may provide new insights into the generalization properties of neural networks. We demonstrate the practical effectiveness of our algorithm on a variety of neural network configurations and real-world data sets.
A deep neural network is a parametrization of a multilayer mapping of signals in terms of many alternatively arranged linear and nonlinear transformations. The linear transformations, which are generally used in the fully connected as well as convolutional layers, contain most of the variational parameters that are trained and stored. Compressing a deep neural network to reduce its number of variational parameters but not its prediction power is an important but challenging problem toward the establishment of an optimized scheme in training efficiently these parameters and in lowering the risk of overfitting. Here we show that this problem can be effectively solved by representing linear transformations with matrix product operators (MPOs), which is a tensor network originally proposed in physics to characterize the short-range entanglement in one-dimensional quantum states. We have tested this approach in five typical neural networks, including FC2, LeNet-5, VGG, ResNet, and DenseNet on two widely used data sets, namely, MNIST and CIFAR-10, and found that this MPO representation indeed sets up a faithful and efficient mapping between input and output signals, which can keep or even improve the prediction accuracy with a dramatically reduced number of parameters. Our method greatly simplifies the representations in deep learning, and opens a possible route toward establishing a framework of modern neural networks which might be simpler and cheaper, but more efficient.
We describe an adversarial learning approach to constrain convolutional neural network training for image registration, replacing heuristic smoothness measures of displacement fields often used in these tasks. Using minimally-invasive prostate cancer intervention as an example application, we demonstrate the feasibility of utilizing biomechanical simulations to regularize a weakly-supervised anatomical-label-driven registration network for aligning pre-procedural magnetic resonance (MR) and 3D intra-procedural transrectal ultrasound (TRUS) images. A discriminator network is optimized to distinguish the registration-predicted displacement fields from the motion data simulated by finite element analysis. During training, the registration network simultaneously aims to maximize similarity between anatomical labels that drives image alignment and to minimize an adversarial generator loss that measures divergence between the predicted- and simulated deformation. The end-to-end trained network enables efficient and fully-automated registration that only requires an MR and TRUS image pair as input, without anatomical labels or simulated data during inference. 108 pairs of labelled MR and TRUS images from 76 prostate cancer patients and 71,500 nonlinear finite-element simulations from 143 different patients were used for this study. We show that, with only gland segmentation as training labels, the proposed method can help predict physically plausible deformation without any other smoothness penalty. Based on cross-validation experiments using 834 pairs of independent validation landmarks, the proposed adversarial-regularized registration achieved a target registration error of 6.3 mm that is significantly lower than those from several other regularization methods.
Heavy-tailed distributions have been studied in statistics, random matrix theory, physics, and econometrics as models of correlated systems, among other domains. Further, heavy-tail distributed eigenvalues of the covariance matrix of the weight matrices in neural networks have been shown to empirically correlate with test set accuracy in several works (e.g. arXiv:1901.08276), but a formal relationship between heavy-tail distributed parameters and generalization bounds was yet to be demonstrated. In this work, the compression framework of arXiv:1802.05296 is utilized to show that matrices with heavy-tail distributed matrix elements can be compressed, resulting in networks with sparse weight matrices. Since the parameter count has been reduced to a sum of the non-zero elements of sparse matrices, the compression framework allows us to bound the generalization gap of the resulting compressed network with a non-vacuous generalization bound. Further, the action of these matrices on a vector is discussed, and how they may relate to compression and resilient classification is analyzed.