ﻻ يوجد ملخص باللغة العربية
Deep neural networks (DNNs) have successfully learned useful data representations in various tasks, however, assessing the reliability of these representations remains a challenge. Deep Ensemble is widely considered the state-of-the-art method for uncertainty estimation, but it is very expensive to train and test. MC-Dropout is another alternative method, which is less expensive but lacks the diversity of predictions. To get more diverse predictions in less time, we introduce Randomized ReLU Activation (RRA) framework. Under the framework, we propose two strategies, MC-DropReLU and MC-RReLU, to estimate uncertainty. Instead of randomly dropping some neurons of the network as in MC-Dropout, the RRA framework adds randomness to the activation function module, making the outputs diverse. As far as we know, this is the first attempt to add randomness to the activation function module to generate predictive uncertainty. We analyze and compare the output diversity of MC-Dropout and our method from the variance perspective and obtain the relationship between the hyperparameters and output diversity in the two methods. Moreover, our method is simple to implement and does not need to modify the existing model. We experimentally validate the RRA framework on three widely used datasets, CIFAR10, CIFAR100, and TinyImageNet. The experiments demonstrate that our method has competitive performance but is more favorable in training time and memory requirements.
We consider the problem of uncertainty estimation in the context of (non-Bayesian) deep neural classification. In this context, all known methods are based on extracting uncertainty signals from a trained network optimized to solve the classification
We explore convergence of deep neural networks with the popular ReLU activation function, as the depth of the networks tends to infinity. To this end, we introduce the notion of activation domains and activation matrices of a ReLU network. By replaci
We study the expressive power of deep ReLU neural networks for approximating functions in dilated shift-invariant spaces, which are widely used in signal processing, image processing, communications and so on. Approximation error bounds are estimated
It has been widely assumed that a neural network cannot be recovered from its outputs, as the network depends on its parameters in a highly nonlinear way. Here, we prove that in fact it is often possible to identify the architecture, weights, and bia
The training of artificial neural networks (ANNs) with rectified linear unit (ReLU) activation via gradient descent (GD) type optimization schemes is nowadays a common industrially relevant procedure. Till this day in the scientific literature there