In this paper we apply a compressibility loss that enables learning highly compressible neural network weights. The loss was previously proposed as a measure of negated sparsity of a signal, yet in this paper we show that minimizing this loss also enforces the non-zero parts of the signal to have very low entropy, thus making the entire signal more compressible. For an optimization problem where the goal is to minimize the compressibility loss (the objective), we prove that at any critical point of the objective, the weight vector is a ternary signal and the corresponding value of the objective is the square root of the number of non-zero elements in the signal, thus directly related to sparsity. In the experiments, we train neural networks with the compressibility loss and show that the proposed method achieves weight sparsity and compression ratios comparable to the state-of-the-art.
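To make the setup concrete, below is a minimal training sketch, not the authors' released code, assuming the compressibility loss takes the form of the L1/L2 norm ratio of the flattened weights (the classic "negated sparsity" measure the abstract alludes to). The toy model, the random data, and the trade-off weight `lambda_c` are illustrative placeholders, not values from the paper.

```python
# Sketch: adding an assumed L1/L2-ratio compressibility penalty to a task loss.
import torch
import torch.nn as nn

def compressibility_loss(params, eps=1e-12):
    """||w||_1 / ||w||_2 over all given parameter tensors (assumed form of the loss)."""
    flat = torch.cat([p.reshape(-1) for p in params])
    return flat.abs().sum() / (flat.norm(p=2) + eps)

# Toy regression network and random data, for illustration only.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
task_criterion = nn.MSELoss()
x, y = torch.randn(64, 16), torch.randn(64, 1)

lambda_c = 1e-3  # hypothetical weight balancing task loss against compressibility
for step in range(100):
    optimizer.zero_grad()
    loss = task_criterion(model(x), y) + lambda_c * compressibility_loss(model.parameters())
    loss.backward()
    optimizer.step()
```

Under this assumed form, the abstract's characterization of critical points is intuitive: setting the gradient of the ratio to zero forces every non-zero weight to share the same magnitude, yielding a ternary signal whose objective value is the square root of the number of non-zero entries.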