
Deep Function Machines: Generalized Neural Networks for Topological Layer Expression

Posted by: William Guss
Publication date: 2016
Research language: English
Author: William H. Guss





In this paper we propose a generalization of deep neural networks called deep function machines (DFMs). DFMs act on vector spaces of arbitrary (possibly infinite) dimension and we show that a family of DFMs are invariant to the dimension of input data; that is, the parameterization of the model does not directly hinge on the quality of the input (e.g., high-resolution images). Using this generalization we provide a new theory of universal approximation of bounded non-linear operators between function spaces. We then suggest that DFMs provide an expressive framework for designing new neural network layer types with topological considerations in mind. Finally, we introduce a novel architecture, RippLeNet, for resolution invariant computer vision, which empirically achieves state-of-the-art invariance.
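To make the dimension-invariance idea concrete, the following is a minimal sketch (not the authors' implementation) of a layer whose weights form a continuous function w(s, t) rather than a fixed matrix: the same parameters can then be evaluated on inputs sampled at any resolution. The class name, the cosine basis parameterization, and the simple quadrature rule are all assumptions made for this sketch.

```python
# Minimal sketch of a "functional" layer: weights are a continuous function
# w(s, t), so the parameter count never depends on the input resolution.
import numpy as np

class FunctionalLayer:
    """Maps samples of a function x(t) to y(s) = tanh( ∫ w(s, t) x(t) dt )."""

    def __init__(self, n_basis=8, n_out=32, seed=0):
        rng = np.random.default_rng(seed)
        # Hypothetical parameterization: w(s, t) expanded in a cosine basis over t.
        self.coeffs = rng.normal(scale=0.1, size=(n_out, n_basis))
        self.n_basis = n_basis

    def weight_function(self, t_grid):
        # Evaluate w(s, t) on the (output index) x (input grid) product.
        basis = np.cos(np.pi * np.outer(np.arange(self.n_basis), t_grid))  # (n_basis, len(t))
        return self.coeffs @ basis                                          # (n_out, len(t))

    def forward(self, x, t_grid):
        # Approximate the integral with a Riemann sum on a uniform grid, so the
        # layer sees functions, not fixed-size vectors.
        dt = t_grid[1] - t_grid[0]
        pre = (self.weight_function(t_grid) * x).sum(axis=1) * dt
        return np.tanh(pre)

layer = FunctionalLayer()
for n in (28, 64, 256):                      # same layer applied at three resolutions
    t = np.linspace(0.0, 1.0, n)
    x = np.sin(2 * np.pi * t)                # toy input "signal"
    print(n, layer.forward(x, t)[:3])        # outputs agree up to discretization error
```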




Read also

Deep neural networks (DNNs) have achieved unprecedented performance on a wide range of complex tasks, rapidly outpacing our understanding of the nature of their solutions. This has caused a recent surge of interest in methods for rendering modern neural systems more interpretable. In this work, we propose to address the interpretability problem in modern DNNs using the rich history of problem descriptions, theories and experimental methods developed by cognitive psychologists to study the human mind. To explore the potential value of these tools, we chose a well-established analysis from developmental psychology that explains how children learn word labels for objects, and applied that analysis to DNNs. Using datasets of stimuli inspired by the original cognitive psychology experiments, we find that state-of-the-art one-shot learning models trained on ImageNet exhibit a similar bias to that observed in humans: they prefer to categorize objects according to shape rather than color. The magnitude of this shape bias varies greatly among architecturally identical, but differently seeded models, and even fluctuates within seeds throughout training, despite nearly equivalent classification performance. These results demonstrate the capability of tools from cognitive psychology for exposing hidden computational properties of DNNs, while concurrently providing us with a computational model for human word learning.
In this paper, we propose an adaptive pruning method that can prune both channels and layers; the proportion of layers and channels to prune is learned adaptively rather than fixed in advance. The proposed method can remove half of the parameters without any drop in accuracy, and the resulting accuracy can even exceed the baseline.
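For flavour only, the sketch below shows plain importance-based channel pruning of a convolutional layer; the keep_ratio argument stands in for the proportion that the method described above would learn adaptively, and none of the names come from that paper.

```python
# Illustrative channel pruning: rank output channels by filter L1 norm and keep the
# strongest fraction. The proportion is passed in here; in the described method it
# would itself be learned during training.
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float) -> nn.Conv2d:
    importance = conv.weight.detach().abs().sum(dim=(1, 2, 3))   # one score per output channel
    n_keep = max(1, int(round(keep_ratio * conv.out_channels)))
    keep = torch.topk(importance, n_keep).indices
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep].clone()
    return pruned

conv = nn.Conv2d(3, 64, kernel_size=3, padding=1)
smaller = prune_conv_channels(conv, keep_ratio=0.5)    # hypothetical learned proportion
print(conv.out_channels, "->", smaller.out_channels)   # 64 -> 32
```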
We investigate the topics of sensitivity and robustness in feedforward and convolutional neural networks. Combining energy landscape techniques developed in computational chemistry with tools drawn from formal methods, we produce empirical evidence indicating that networks corresponding to lower-lying minima in the optimization landscape of the learning objective tend to be more robust. The robustness estimate used is the inverse of a proposed sensitivity measure, which we define as the volume of an over-approximation of the reachable set of network outputs under all additive $l_{\infty}$-bounded perturbations on the input data. We present a novel loss function which includes a sensitivity term in addition to the traditional task-oriented and regularization terms. In our experiments on standard machine learning and computer vision datasets, we show that the proposed loss function leads to networks which reliably optimize the robustness measure as well as other related metrics of adversarial robustness without significant degradation in the classification error. Experimental results indicate that the proposed method outperforms state-of-the-art sensitivity-based learning approaches with regards to robustness to adversarial attacks. We also show that although the introduced framework does not explicitly enforce an adversarial loss, it achieves competitive overall performance relative to methods that do.
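As a rough, hedged illustration of adding a sensitivity term to the training objective (not the paper's reachable-set volume measure), the sketch below penalizes the input-gradient norm as a cheap proxy for how far the outputs can move under small $l_{\infty}$-bounded perturbations; the model, lambda_sens, and epsilon are illustrative choices.

```python
# Sketch: task loss + a sensitivity surrogate based on the gradient w.r.t. the input.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
criterion = nn.CrossEntropyLoss()
lambda_sens = 0.1          # hypothetical trade-off weight
epsilon = 8.0 / 255.0      # assumed l_inf perturbation budget

def sensitivity_regularized_loss(x, y):
    x = x.clone().requires_grad_(True)
    task_loss = criterion(model(x), y)
    # Input gradient of the task loss: a first-order proxy for output sensitivity
    # under additive l_inf-bounded input perturbations.
    grad_x, = torch.autograd.grad(task_loss, x, create_graph=True)
    sensitivity = epsilon * grad_x.abs().sum(dim=1).mean()
    return task_loss + lambda_sens * sensitivity

x = torch.randn(32, 784)
y = torch.randint(0, 10, (32,))
loss = sensitivity_regularized_loss(x, y)
loss.backward()            # gradients flow through both the task and sensitivity terms
```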
Deep neural networks have achieved state-of-the-art results in various vision and/or language tasks. Despite the use of large training datasets, most models are trained by iterating over single input-output pairs, discarding the remaining examples for the current prediction. In this work, we actively exploit the training data, using the information from nearest training examples to aid the prediction both during training and testing. Specifically, our approach uses the target of the most similar training example to initialize the memory state of an LSTM model, or to guide attention mechanisms. We apply this approach to image captioning and sentiment analysis, respectively through image and text retrieval. Results confirm the effectiveness of the proposed approach for the two tasks, on the widely used Flickr8 and IMDB datasets. Our code is publicly available at http://github.com/RitaRamo/retrieval-augmentation-nn.
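A minimal sketch of the retrieval-augmentation idea described above, assuming a toy decoder: the target of the nearest training example is encoded and used to initialize the LSTM state. The retrieval step itself, the encoder, and all dimensions here are placeholders, not taken from the released code.

```python
# Sketch: initialize an LSTM decoder's hidden state from the retrieved target.
import torch
import torch.nn as nn

class RetrievalInitLSTM(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.retrieved_encoder = nn.Linear(embed_dim, hidden_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, input_tokens, retrieved_target_tokens):
        # Mean-pool the retrieved target's embeddings into a single vector ...
        retrieved = self.embed(retrieved_target_tokens).mean(dim=1)
        # ... and map it to the decoder's initial hidden state.
        h0 = torch.tanh(self.retrieved_encoder(retrieved)).unsqueeze(0)
        c0 = torch.zeros_like(h0)
        outputs, _ = self.lstm(self.embed(input_tokens), (h0, c0))
        return self.out(outputs)

model = RetrievalInitLSTM()
tokens = torch.randint(0, 10000, (4, 12))        # current example
retrieved = torch.randint(0, 10000, (4, 12))     # target of the nearest training example
logits = model(tokens, retrieved)                # (4, 12, vocab_size)
```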
In this paper, we propose a novel adaptive kernel for the radial basis function (RBF) neural networks. The proposed kernel adaptively fuses the Euclidean and cosine distance measures to exploit the reciprocating properties of the two. The proposed framework dynamically adapts the weights of the participating kernels using the gradient descent method thereby alleviating the need for predetermined weights. The proposed method is shown to outperform the manual fusion of the kernels on three major problems of estimation namely nonlinear system identification, pattern classification and function approximation.
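The sketch below illustrates the general idea of fusing a Euclidean (Gaussian) kernel with a cosine-similarity kernel using a mixing weight learned by gradient descent; the exact kernel form, the sigmoid parameterization of the weight, and the finite-difference update are assumptions for the sketch, not the paper's definitions.

```python
# Sketch of an adaptively weighted fusion of Euclidean and cosine kernels.
import numpy as np

def fused_kernel(x, c, alpha, gamma=1.0):
    """k(x, c) = w * exp(-gamma * ||x - c||^2) + (1 - w) * cos(x, c), w = sigmoid(alpha)."""
    w = 1.0 / (1.0 + np.exp(-alpha))                        # keep the mixing weight in (0, 1)
    euclid = np.exp(-gamma * np.sum((x - c) ** 2, axis=1))
    cosine = (x @ c) / (np.linalg.norm(x, axis=1) * np.linalg.norm(c) + 1e-12)
    return w * euclid + (1.0 - w) * cosine

# Toy fit of the mixing weight alpha for a single-centre model by gradient descent
# (numerical gradient, since the sketch stays framework-free).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
centre = rng.normal(size=4)
y = np.exp(-np.sum((X - centre) ** 2, axis=1))              # synthetic target favouring the Euclidean part

alpha, lr, eps = 0.0, 0.5, 1e-5
for _ in range(200):
    loss = lambda a: np.mean((fused_kernel(X, centre, a) - y) ** 2)
    grad = (loss(alpha + eps) - loss(alpha - eps)) / (2 * eps)
    alpha -= lr * grad
print("learned Euclidean weight:", 1.0 / (1.0 + np.exp(-alpha)))
```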
