Do Neural Network Weights account for Classes Centers?

563 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ioannis Kansizoglou

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ioannis Kansizoglou - Loukas Bampis -

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The exploitation of Deep Neural Networks (DNNs) as descriptors in feature learning challenges enjoys apparent popularity over the past few years. The above tendency focuses on the development of effective loss functions that ensure both high feature discrimination among different classes, as well as low geodesic distance between the feature vectors of a given class. The vast majority of the contemporary works rely their formulation on an empirical assumption about the feature space of a networks last hidden layer, claiming that the weight vector of a class accounts for its geometrical center in the studied space. The paper at hand follows a theoretical approach and indicates that the aforementioned hypothesis is not exclusively met. This fact raises stability issues regarding the training procedure of a DNN, as shown in our experimental study. Consequently, a specific symmetry is proposed and studied both analytically and empirically that satisfies the above assumption, addressing the established convergence issues.

قيم البحث

140 - Caglar Aytekin , Francesco Cricri , Emre Aksu 2019

In this paper we apply a compressibility loss that enables learning highly compressible neural network weights. The loss was previously proposed as a measure of negated sparsity of a signal, yet in this paper we show that minimizing this loss also en forces the non-zero parts of the signal to have very low entropy, thus making the entire signal more compressible. For an optimization problem where the goal is to minimize the compressibility loss (the objective), we prove that at any critical point of the objective, the weight vector is a ternary signal and the corresponding value of the objective is the squared root of the number of non-zero elements in the signal, thus directly related to sparsity. In the experiments, we train neural networks with the compressibility loss and we show that the proposed method achieves weight sparsity and compression ratios comparable with the state-of-the-art.

التعلم الآلي التعلم الالي

ReCU: Reviving the Dead Weights in Binary Neural Networks

85 - Zihan Xu , Mingbao Lin , Jianzhuang Liu 2021

Binary neural networks (BNNs) have received increasing attention due to their superior reductions of computation and memory. Most existing works focus on either lessening the quantization error by minimizing the gap between the full-precision weights and their binarization or designing a gradient approximation to mitigate the gradient mismatch, while leaving the dead weights untouched. This leads to slow convergence when training BNNs. In this paper, for the first time, we explore the influence of dead weights which refer to a group of weights that are barely updated during the training of BNNs, and then introduce rectified clamp unit (ReCU) to revive the dead weights for updating. We prove that reviving the dead weights by ReCU can result in a smaller quantization error. Besides, we also take into account the information entropy of the weights, and then mathematically analyze why the weight standardization can benefit BNNs. We demonstrate the inherent contradiction between minimizing the quantization error and maximizing the information entropy, and then propose an adaptive exponential scheduler to identify the range of the dead weights. By considering the dead weights, our method offers not only faster BNN training, but also state-of-the-art performance on CIFAR-10 and ImageNet, compared with recent methods. Code can be available at https://github.com/z-hXu/ReCU.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط

Exponential discretization of weights of neural network connections in pre-trained neural networks

90 - Magomed Yu. Malsagov , Emil M. Khayrov , Maria M. Pushkareva 2020

To reduce random access memory (RAM) requirements and to increase speed of recognition algorithms we consider a weight discretization problem for trained neural networks. We show that an exponential discretization is preferable to a linear discretiza tion since it allows one to achieve the same accuracy when the number of bits is 1 or 2 less. The quality of the neural network VGG-16 is already satisfactory (top5 accuracy 69%) in the case of 3 bit exponential discretization. The ResNet50 neural network shows top5 accuracy 84% at 4 bits. Other neural networks perform fairly well at 5 bits (top5 accuracies of Xception, Inception-v3, and MobileNet-v2 top5 were 87%, 90%, and 77%, respectively). At less number of bits, the accuracy decreases rapidly.

الحوسبة العصبية والتطورية الرؤية الحاسوبية وتمييز الأنماط

Importance Estimation for Neural Network Pruning

166 - Pavlo Molchanov , Arun Mallya , Stephen Tyree 2019

Structural pruning of neural network parameters reduces computation, energy, and memory transfer costs during inference. We propose a novel method that estimates the contribution of a neuron (filter) to the final loss and iteratively removes those wi th smaller scores. We describe two variations of our method using the first and second-order Taylor expansions to approximate a filters contribution. Both methods scale consistently across any network layer without requiring per-layer sensitivity analysis and can be applied to any kind of layer, including skip connections. For modern networks trained on ImageNet, we measured experimentally a high (>93%) correlation between the contribution computed by our methods and a reliable estimate of the true importance. Pruning with the proposed methods leads to an improvement over state-of-the-art in terms of accuracy, FLOPs, and parameter reduction. On ResNet-101, we achieve a 40% FLOPS reduction by removing 30% of the parameters, with a loss of 0.02% in the top-1 accuracy on ImageNet. Code is available at https://github.com/NVlabs/Taylor_pruning.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Confounding Tradeoffs for Neural Network Quantization

123 - Sahaj Garg , Anirudh Jain , Joe Lou 2021

Many neural network quantization techniques have been developed to decrease the computational and memory footprint of deep learning. However, these methods are evaluated subject to confounding tradeoffs that may affect inference acceleration or resou rce complexity in exchange for higher accuracy. In this work, we articulate a variety of tradeoffs whose impact is often overlooked and empirically analyze their impact on uniform and mixed-precision post-training quantization, finding that these confounding tradeoffs may have a larger impact on quantized network accuracy than the actual quantization methods themselves. Because these tradeoffs constrain the attainable hardware acceleration for different use-cases, we encourage researchers to explicitly report these design choices through the structure of quantization cards. We expect quantization cards to help researchers compare methods more effectively and engineers determine the applicability of quantization techniques for their hardware.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط