We study the neural network (NN) compression problem, viewing the tension between the compression ratio and NN performance through the lens of rate-distortion theory. We choose a distortion metric that reflects the effect of NN compression on the model output and then derive the tradeoff between rate (compression ratio) and distortion. In addition to characterizing theoretical limits of NN compression, this formulation shows that pruning, implicitly or explicitly, must be a part of a good compression algorithm. This observation bridges a gap between parts of the literature pertaining to NN and data compression, respectively, providing insight into the empirical success of pruning for NN compression. Finally, we propose a novel pruning strategy derived from our information-theoretic formulation and show that it outperforms the relevant baselines on the CIFAR-10 and ImageNet datasets.
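One way to build intuition for why a rate-distortion view leads to pruning is the classical reverse water-filling solution for independent Gaussian components under squared-error distortion. The sketch below is only an illustration under those textbook assumptions; it is not the distortion metric or the pruning strategy proposed in the abstract above, and the function name `reverse_water_filling` and the example variances are hypothetical. Components whose variance falls below the "water level" receive zero bits, which is the analogue of being pruned.

```python
import math


def reverse_water_filling(variances, total_distortion):
    """Textbook reverse water-filling for independent Gaussian components.

    Assumption (not from the abstract): squared-error distortion and
    independent Gaussian sources with the given variances.
    Returns (water_level, per-component distortions, per-component rates in bits).
    """
    if not 0 < total_distortion <= sum(variances):
        raise ValueError("total_distortion must lie in (0, sum(variances)]")

    # Bisect on the water level so that sum_i min(level, var_i) == total_distortion.
    lo, hi = 0.0, max(variances)
    for _ in range(100):
        level = (lo + hi) / 2
        if sum(min(level, v) for v in variances) < total_distortion:
            lo = level
        else:
            hi = level
    level = (lo + hi) / 2

    # Each component is distorted up to the water level, capped at its own variance.
    distortions = [min(level, v) for v in variances]
    # A component at or below the water level gets zero rate (it is "pruned").
    rates = [
        0.5 * math.log2(v / d) if v > d else 0.0
        for v, d in zip(variances, distortions)
    ]
    return level, distortions, rates


# Hypothetical example: four components with decreasing variance.
level, distortions, rates = reverse_water_filling([4.0, 1.0, 0.25, 0.01], 1.0)
pruned = [i for i, r in enumerate(rates) if r == 0.0]
print(f"water level = {level:.4f}, pruned components = {pruned}")
```

In this toy setting the two lowest-variance components are assigned no bits at all, so the optimal allocation discards them outright rather than coding them coarsely.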
Previous AutoML pruning works utilized individual layer features to automatically prune filters. We analyze the correlation for two layers from different blocks which have a short-cut structure. It is found that, in one block, the deeper layer has ma
A transmitter without channel state information (CSI) wishes to send a delay-limited Gaussian source over a slowly fading channel. The source is coded in superimposed layers, with each layer successively refining the description in the previous one.
The overall predictive uncertainty of a trained predictor can be decomposed into separate contributions due to epistemic and aleatoric uncertainty. Under a Bayesian formulation, assuming a well-specified model, the two contributions can be exactly ex
End-to-end optimization capability offers neural image compression (NIC) superior lossy compression performance. However, distinct models are required to be trained to reach different points in the rate-distortion (R-D) space. In this paper, we consi
In the context of lossy compression, Blau & Michaeli (2019) adopt a mathematical notion of perceptual quality and define the information rate-distortion-perception function, generalizing the classical rate-distortion tradeoff. We consider the notion