Cognitive Psychology for Deep Neural Networks: A Shape Bias Case Study

Published by: David Barrett

Publication date: 2017

Research field: Mathematical statistics, Informatics engineering

Paper language: English

Authors: Samuel Ritter, David G.T. Barrett, Adam Santoro

Deep neural networks (DNNs) have achieved unprecedented performance on a wide range of complex tasks, rapidly outpacing our understanding of the nature of their solutions. This has caused a recent surge of interest in methods for rendering modern neural systems more interpretable. In this work, we propose to address the interpretability problem in modern DNNs using the rich history of problem descriptions, theories and experimental methods developed by cognitive psychologists to study the human mind. To explore the potential value of these tools, we chose a well-established analysis from developmental psychology that explains how children learn word labels for objects, and applied that analysis to DNNs. Using datasets of stimuli inspired by the original cognitive psychology experiments, we find that state-of-the-art one-shot learning models trained on ImageNet exhibit a similar bias to that observed in humans: they prefer to categorize objects according to shape rather than color. The magnitude of this shape bias varies greatly among architecturally identical but differently seeded models, and even fluctuates within seeds throughout training, despite nearly equivalent classification performance. These results demonstrate the capability of tools from cognitive psychology for exposing hidden computational properties of DNNs, while concurrently providing us with a computational model for human word learning.
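One way to make the shape-versus-color preference concrete is a small scoring sketch. The abstract does not say how the preference is scored, so everything here is an assumption for illustration: each trial is a hypothetical triple of embedding vectors (probe, shape match, color match), similarity is taken to be cosine similarity, and the bias is the fraction of trials in which the probe lands closer to the shape match. The vectors are made-up toy values, not data from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def shape_bias(trials):
    """Fraction of trials where the probe is more similar to the
    shape match than to the color match (assumed scoring rule)."""
    shape_choices = 0
    for probe, shape_match, color_match in trials:
        if cosine_similarity(probe, shape_match) > cosine_similarity(probe, color_match):
            shape_choices += 1
    return shape_choices / len(trials)

# Toy embeddings (hypothetical): (probe, shape_match, color_match)
trials = [
    ([1.0, 0.0], [0.9, 0.1], [0.1, 0.9]),  # probe is closer to the shape match
    ([0.0, 1.0], [0.2, 0.8], [0.9, 0.1]),  # probe is closer to the shape match
    ([1.0, 1.0], [1.0, 0.0], [1.0, 0.9]),  # probe is closer to the color match
]

print(shape_bias(trials))
```

Under this rule a value near 1 would indicate a strong shape preference and a value near 0 a strong color preference; comparing the value across seeds or training checkpoints mirrors the kind of variation the abstract reports.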

120 - William H. Guss 2016

In this paper we propose a generalization of deep neural networks called deep function machines (DFMs). DFMs act on vector spaces of arbitrary (possibly infinite) dimension and we show that a family of DFMs are invariant to the dimension of input data; that is, the parameterization of the model does not directly hinge on the quality of the input (e.g., high-resolution images). Using this generalization we provide a new theory of universal approximation of bounded non-linear operators between function spaces. We then suggest that DFMs provide an expressive framework for designing new neural network layer types with topological considerations in mind. Finally, we introduce a novel architecture, RippLeNet, for resolution-invariant computer vision, which empirically achieves state-of-the-art invariance.

Machine learning, Computer vision and pattern recognition

Gaussian Process Deep Belief Networks: A Smooth Generative Model of Shape with Uncertainty Propagation

82 - Alessandro Di Martino , Erik Bodin , Carl Henrik Ek 2018

The shape of an object is an important characteristic for many vision problems such as segmentation, detection and tracking. Because shape is independent of appearance, it is possible to generalize to a large range of objects from only small amounts of data. However, shapes represented as silhouette images are challenging to model due to complicated likelihood functions leading to intractable posteriors. In this paper we present a generative model of shapes which provides a low-dimensional latent encoding which importantly resides on a smooth manifold with respect to the silhouette images. The proposed model propagates uncertainty in a principled manner, allowing it to learn from small amounts of data and providing predictions with associated uncertainty. We provide experiments that show how our proposed model provides favorable quantitative results compared with the state-of-the-art while simultaneously providing a representation that resides on a low-dimensional interpretable manifold.

Machine learning, Computer vision and pattern recognition

A Novel Adaptive Kernel for the RBF Neural Networks

65 - Shujaat Khan , Imran Naseem , Roberto Togneri 2019

In this paper, we propose a novel adaptive kernel for the radial basis function (RBF) neural networks. The proposed kernel adaptively fuses the Euclidean and cosine distance measures to exploit the reciprocating properties of the two. The proposed framework dynamically adapts the weights of the participating kernels using the gradient descent method, thereby alleviating the need for predetermined weights. The proposed method is shown to outperform the manual fusion of the kernels on three major problems of estimation, namely nonlinear system identification, pattern classification and function approximation.

Machine learning, Computer vision and pattern recognition
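The idea in the RBF abstract above, fusing a Euclidean kernel and a cosine kernel with weights learned by gradient descent, can be sketched roughly as follows. The abstract does not give the kernel's exact form, so this is only a minimal illustration under stated assumptions: the fused kernel is taken to be a weighted sum of a Gaussian kernel and a cosine-similarity kernel, the output-layer weights are held fixed for brevity, and the names `w_euc`, `w_cos`, and `train_fusion_weights`, along with the toy data, are hypothetical.

```python
import math

def gaussian_kernel(x, c, sigma=1.0):
    """Gaussian kernel on the squared Euclidean distance to center c."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, c))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def cosine_kernel(x, c, eps=1e-12):
    """Cosine similarity between x and center c (eps guards zero norms)."""
    dot = sum(a * b for a, b in zip(x, c))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_c = math.sqrt(sum(b * b for b in c))
    return dot / (norm_x * norm_c + eps)

def network_output(x, centers, out_weights, w_euc, w_cos):
    """RBF network output using the assumed weighted-sum fused kernel."""
    return sum(
        o * (w_euc * gaussian_kernel(x, c) + w_cos * cosine_kernel(x, c))
        for o, c in zip(out_weights, centers)
    )

def train_fusion_weights(samples, centers, out_weights, lr=0.2, epochs=2000):
    """Adapt only the two fusion weights by stochastic gradient descent
    on the squared error; output weights stay fixed in this sketch."""
    w_euc, w_cos = 0.5, 0.5  # start from an equal (manual) fusion
    for _ in range(epochs):
        for x, y in samples:
            g = sum(o * gaussian_kernel(x, c) for o, c in zip(out_weights, centers))
            k = sum(o * cosine_kernel(x, c) for o, c in zip(out_weights, centers))
            err = (w_euc * g + w_cos * k) - y
            # Gradients of 0.5 * err**2 with respect to each fusion weight
            w_euc -= lr * err * g
            w_cos -= lr * err * k
    return w_euc, w_cos

# Toy problem (hypothetical): targets generated with fusion weights 0.8 / 0.2
centers = [(1.0, 0.0), (0.0, 1.0)]
out_weights = [1.0, 0.5]
inputs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 0.5), (-1.0, 0.5)]
samples = [(x, network_output(x, centers, out_weights, 0.8, 0.2)) for x in inputs]

w_euc, w_cos = train_fusion_weights(samples, centers, out_weights)
print(round(w_euc, 3), round(w_cos, 3))
```

On this noise-free toy problem the loop should drive the fusion weights from the equal starting point back toward the 0.8 and 0.2 used to generate the targets, which is the sense in which learned weights remove the need to fix the mixture by hand.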

A Forward-Backward Approach for Visualizing Information Flow in Deep Networks

116 - Aditya Balu , Thanh V. Nguyen , Apurva Kokate 2017

We introduce a new, systematic framework for visualizing information flow in deep networks. Specifically, given any trained deep convolutional network model and a given test image, our method produces a compact support in the image domain that corresponds to a (high-resolution) feature that contributes to the given explanation. Our method is both computationally efficient and numerically robust. We present several preliminary numerical results that support the benefits of our framework over existing methods.

Machine learning, Computer vision and pattern recognition

Towards Robust Deep Neural Networks

153 - Timothy E. Wang , Yiming Gu , Dhagash Mehta 2018

We investigate the topics of sensitivity and robustness in feedforward and convolutional neural networks. Combining energy landscape techniques developed in computational chemistry with tools drawn from formal methods, we produce empirical evidence indicating that networks corresponding to lower-lying minima in the optimization landscape of the learning objective tend to be more robust. The robustness estimate used is the inverse of a proposed sensitivity measure, which we define as the volume of an over-approximation of the reachable set of network outputs under all additive $l_{\infty}$-bounded perturbations on the input data. We present a novel loss function which includes a sensitivity term in addition to the traditional task-oriented and regularization terms. In our experiments on standard machine learning and computer vision datasets, we show that the proposed loss function leads to networks which reliably optimize the robustness measure as well as other related metrics of adversarial robustness without significant degradation in the classification error. Experimental results indicate that the proposed method outperforms state-of-the-art sensitivity-based learning approaches with regard to robustness to adversarial attacks. We also show that although the introduced framework does not explicitly enforce an adversarial loss, it achieves competitive overall performance relative to methods that do.

Machine learning, Statistical mechanics