Deep Neural Networks (DNNs) have become increasingly popular in computer vision, natural language processing, and other areas. However, training and fine-tuning a deep learning model is computationally intensive and time-consuming. We propose a new method to improve the performance of nearly any model, including pre-trained models. The proposed method uses an ensemble approach in which the networks in the ensemble are constructed by reassigning model parameter values based on the probabilistic distribution of these parameters, calculated towards the end of the training process. For pre-trained models, this approach requires only an additional short training step (usually less than one epoch). We perform a variety of analyses using the MNIST dataset and validate the approach on several DNN architectures using models pre-trained on the ImageNet dataset.
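The abstract leaves the construction details open. As a minimal sketch of one plausible reading, the snippet below tracks a running mean and variance for each parameter over the final training iterations, then samples ensemble members from the resulting per-parameter Gaussian; the Gaussian assumption and all function names here are illustrative, not taken from the paper.

```python
import copy
import torch

def collect_parameter_stats(model, loader, optimizer, loss_fn, n_steps=100):
    """Run a short extra training phase (well under one epoch) and
    accumulate per-parameter running means and variances."""
    mean = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    sq_mean = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for step, (x, y) in enumerate(loader):
        if step == n_steps:
            break
        optimizer.zero_grad()
        loss_fn(model(x), y).backward()
        optimizer.step()
        with torch.no_grad():
            for n, p in model.named_parameters():
                mean[n] += p / n_steps
                sq_mean[n] += p * p / n_steps
    # Var[w] = E[w^2] - E[w]^2, clamped to guard against numerical negatives.
    var = {n: (sq_mean[n] - mean[n] ** 2).clamp_min(0.0) for n in mean}
    return mean, var

def sample_ensemble(model, mean, var, n_members=5, scale=1.0):
    """Construct ensemble members by reassigning each parameter tensor,
    drawing from the Gaussian fitted near the end of training."""
    members = []
    for _ in range(n_members):
        member = copy.deepcopy(model)
        with torch.no_grad():
            for n, p in member.named_parameters():
                p.copy_(mean[n] + scale * var[n].sqrt() * torch.randn_like(p))
        members.append(member.eval())
    return members

@torch.no_grad()
def ensemble_predict(members, x):
    """Average softmax predictions over the sampled networks."""
    return torch.stack([m(x).softmax(dim=-1) for m in members]).mean(dim=0)
```

Under this reading, a pre-trained model would pay only the cost of `collect_parameter_stats` (a fraction of an epoch); sampling and prediction reuse the frozen statistics.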
Compressing Deep Neural Network (DNN) models to alleviate storage and computation requirements is essential for practical applications, especially on resource-limited devices. Although capable of reducing a reasonable number of model parameters,
Deep neural networks (DNNs) generally take thousands of iterations to optimize via gradient descent and thus converge slowly. In addition, softmax, as a decision layer, may ignore the distribution information of the data during classification.
Over-parameterization of neural networks benefits optimization and generalization yet incurs practical costs. Pruning is adopted as a post-processing solution to this problem, which aims to remove unnecessary parameters in a neural network with
Ensembles of deep neural networks have achieved great success recently, but they do not offer a proper Bayesian justification. Moreover, while they allow for averaging of predictions over several hypotheses, they do not provide any guarantees for the
Many available formal verification methods have been shown to be instances of a unified Branch-and-Bound (BaB) formulation. We propose a novel machine learning framework that can be used for designing an effective branching strategy as well as for co