A Novel Representation of Neural Networks


Published by: Anthony Caterini

Publication date: 2016

Research field: Mathematical statistics, Informatics engineering

Paper language: English

Authors: Anthony Caterini - Dong Eui Chang

Machine learning, Artificial intelligence, Neural and evolutionary computing


Abstract

Deep Neural Networks (DNNs) have become very popular for prediction in many areas. Their strength is in representation with a high number of parameters that are commonly learned via gradient descent or similar optimization methods. However, the representation is non-standardized, and the gradient calculation methods are often performed using component-based approaches that break parameters down into scalar units, instead of considering the parameters as whole entities. In this work, these problems are addressed. Standard notation is used to represent DNNs in a compact framework. Gradients of DNN loss functions are calculated directly over the inner product space on which the parameters are defined. This framework is general and is applied to two common network types: the Multilayer Perceptron and the Deep Autoencoder.
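As a rough illustration of what calculating a gradient "directly over the inner product space" can mean, the sketch below treats the weight matrix of a single linear layer as one object. It is not the authors' formulation: the squared-error loss, the Frobenius inner product, and every name (`loss`, `grad_W`, `directional_derivative`) are assumptions made for this example. The gradient is taken to be the matrix G satisfying <G, U> = DL(W)[U] for every direction U, and a finite-difference estimate is used to check that identity.

```python
import numpy as np

def loss(W, x, y):
    """Squared-error loss of one linear layer: 0.5 * ||W x - y||^2 (assumed)."""
    r = W @ x - y
    return 0.5 * float(r @ r)

def grad_W(W, x, y):
    """Gradient as a whole matrix under the Frobenius inner product.

    It is the matrix G with <G, U> = DL(W)[U] for every direction U,
    which for this loss is the outer product (W x - y) x^T.
    """
    return np.outer(W @ x - y, x)

def directional_derivative(W, U, x, y, eps=1e-6):
    """Central finite-difference estimate of DL(W)[U]."""
    return (loss(W + eps * U, x, y) - loss(W - eps * U, x, y)) / (2 * eps)

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))   # layer parameters
U = rng.normal(size=(3, 4))   # arbitrary direction in parameter space
x = rng.normal(size=4)
y = rng.normal(size=3)

inner = float(np.sum(grad_W(W, x, y) * U))   # Frobenius inner product <G, U>
numeric = directional_derivative(W, U, x, y)
print(inner, numeric)  # the two values should agree closely
```

No individual scalar partial derivative is written out; the gradient is identified only through its inner product with a direction.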


97 - Anthony L. Caterini, Dong Eui Chang 2016

In this paper, a geometric framework for neural networks is proposed. This framework uses the inner product space structure underlying the parameter set to perform gradient descent not in a component-based form, but in a coordinate-free manner. Convolutional neural networks are described in this framework in a compact form, with the gradients of standard (and higher-order) loss functions calculated for each layer of the network. This approach can be applied to other network structures and provides a basis on which to create new networks.

Machine learning, Artificial intelligence, Neural and evolutionary computing

A Novel Adaptive Kernel for the RBF Neural Networks

65 - Shujaat Khan, Imran Naseem, Roberto Togneri 2019

In this paper, we propose a novel adaptive kernel for the radial basis function (RBF) neural networks. The proposed kernel adaptively fuses the Euclidean and cosine distance measures to exploit the reciprocating properties of the two. The proposed framework dynamically adapts the weights of the participating kernels using the gradient descent method, thereby alleviating the need for predetermined weights. The proposed method is shown to outperform the manual fusion of the kernels on three major estimation problems, namely nonlinear system identification, pattern classification, and function approximation.
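The fusion idea can be sketched in a few lines. This is a simplified illustration and not the paper's model: it assumes a single RBF center, a Gaussian kernel for the Euclidean part, a plain weighted sum of the two kernels, and synthetic targets. All names (`gaussian_kernel`, `cosine_kernel`, `fit_fusion_weights`) are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, c, sigma=1.0):
    """Euclidean-distance (Gaussian) kernel between input x and center c."""
    return float(np.exp(-np.sum((x - c) ** 2) / (2.0 * sigma ** 2)))

def cosine_kernel(x, c, eps=1e-12):
    """Cosine-similarity kernel; eps guards against zero-norm inputs."""
    return float(x @ c) / (np.linalg.norm(x) * np.linalg.norm(c) + eps)

def kernel_features(X, c):
    """One row per sample: [Gaussian kernel value, cosine kernel value]."""
    return np.array([[gaussian_kernel(x, c), cosine_kernel(x, c)] for x in X])

def fit_fusion_weights(K, t, lr=0.1, epochs=500):
    """Learn the two fusion weights by gradient descent on mean squared error."""
    a = np.full(2, 0.5)                  # start from an equal manual fusion
    for _ in range(epochs):
        err = K @ a - t                  # prediction error per sample
        a -= lr * (K.T @ err) / len(t)   # gradient step on the fusion weights
    return a

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
c = np.array([1.0, 1.0])                 # single assumed RBF center
K = kernel_features(X, c)
t = K @ np.array([0.8, 0.2])             # synthetic targets from known weights

a = fit_fusion_weights(K, t)
mse_before = float(np.mean((K @ np.full(2, 0.5) - t) ** 2))
mse_after = float(np.mean((K @ a - t) ** 2))
print(a, mse_before, mse_after)
```

Starting from equal weights mimics a fixed manual fusion, and the learned weights lower the error on this toy data.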

Machine learning, Computer vision and pattern recognition

Principled Training of Neural Networks with Direct Feedback Alignment

154 - Julien Launay, Iacopo Poli, Florent Krzakala 2019

The backpropagation algorithm has long been the canonical training method for neural networks. Modern paradigms are implicitly optimized for it, and numerous guidelines exist to ensure its proper use. Recently, synthetic gradient methods, where the error gradient is only roughly approximated, have garnered interest. These methods not only better portray how biological brains are learning, but also open new computational possibilities, such as updating layers asynchronously. Even so, they have failed to scale past simple tasks like MNIST or CIFAR-10. This is in part due to a lack of standards, leading to ill-suited models and practices forbidding such methods from performing to the best of their abilities. In this work, we focus on direct feedback alignment and present a set of best practices justified by observations of the alignment angles. We characterize a bottleneck effect that prevents alignment in narrow layers, and hypothesize it may explain why feedback alignment methods have yet to scale to large convolutional networks.
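A minimal sketch of the difference between backpropagation and direct feedback alignment, and of an alignment measurement, might look as follows. It assumes a one-hidden-layer tanh network with a squared-error loss. The names `hidden_update` and `cosine` are illustrative and not taken from the paper. With a single hidden layer, direct feedback alignment coincides with ordinary feedback alignment; in deeper networks each hidden layer would receive the output error through its own fixed random matrix.

```python
import numpy as np

def hidden_update(W1, x, e, feedback):
    """Update direction for W1 in a one-hidden-layer tanh network.

    feedback = W2.T reproduces backpropagation; a fixed random matrix B
    of the same shape gives the direct feedback alignment (DFA) update.
    """
    h = np.tanh(W1 @ x)
    delta = (feedback @ e) * (1.0 - h ** 2)   # projected error times tanh'
    return np.outer(delta, x)

def cosine(A, B):
    """Cosine of the angle between two update matrices (flattened)."""
    return float(np.sum(A * B) / (np.linalg.norm(A) * np.linalg.norm(B)))

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 5, 8, 3
W1 = rng.normal(size=(n_hid, n_in))
W2 = rng.normal(size=(n_out, n_hid))
B = rng.normal(size=(n_hid, n_out))      # fixed random feedback, never trained
x = rng.normal(size=n_in)
y = rng.normal(size=n_out)

e = W2 @ np.tanh(W1 @ x) - y             # output error for squared-error loss
g_bp = hidden_update(W1, x, e, W2.T)     # true gradient with respect to W1
g_dfa = hidden_update(W1, x, e, B)       # DFA surrogate for that gradient
angle_cos = cosine(g_dfa, g_bp)
print(angle_cos)  # a positive value means the DFA update points roughly downhill
```

Tracking this cosine per layer during training is one way to observe whether the random feedback is aligning with the true gradient.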

Machine learning, Neural and evolutionary computing

A Generalization of Convolutional Neural Networks to Graph-Structured Data

167 - Yotam Hechtlinger, Purvasha Chakravarti, Jining Qin 2017

This paper introduces a generalization of Convolutional Neural Networks (CNNs) from low-dimensional grid data, such as images, to graph-structured data. We propose a novel spatial convolution utilizing a random walk to uncover the relations within the input, analogous to the way the standard convolution uses the spatial neighborhood of a pixel on the grid. The convolution has an intuitive interpretation, is efficient and scalable, and can also be used on data with varying graph structure. Furthermore, this generalization can be applied to many standard regression or classification problems by learning the underlying graph. We empirically demonstrate the performance of the proposed CNN on MNIST, and challenge the state of the art on the Merck molecular activity data set.

Machine learning, Artificial intelligence, Computer vision and pattern recognition

Improving Deep Neural Networks with Probabilistic Maxout Units

834 - Jost Tobias Springenberg, Martin Riedmiller 2013

We present a probabilistic variant of the recently introduced maxout unit. The success of deep neural networks utilizing maxout can partly be attributed to favorable performance under dropout, when compared to rectified linear units. It also depends, however, on the fact that each maxout unit performs a pooling operation over a group of linear transformations and is thus partially invariant to changes in its input. Starting from this observation, we ask the question: can the desirable properties of maxout units be preserved while improving their invariance properties? We argue that our probabilistic maxout (probout) units successfully achieve this balance. We quantitatively verify this claim and report classification performance matching or exceeding the current state of the art on three challenging image classification benchmarks (CIFAR-10, CIFAR-100 and SVHN).
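The pooling described above can be sketched as follows. The maxout part follows directly from the abstract: a maximum over a group of linear transformations. The probabilistic part is an assumption, since the abstract does not state the sampling rule. Here one group member is sampled from a softmax over the group activations, scaled by a hypothetical parameter `lam`. Function names are illustrative.

```python
import numpy as np

def group_activations(x, W, b):
    """Linear pieces of one unit: W has shape (k, d), b has shape (k,)."""
    return W @ x + b

def maxout(z):
    """Maxout: deterministic max-pooling over the group of linear pieces."""
    return float(np.max(z))

def probout(z, lam, rng):
    """Assumed probabilistic variant: sample one piece with softmax(lam * z).

    Subtracting max(z) keeps the exponentials numerically stable.
    Larger lam concentrates the sampling on the maximum.
    """
    p = np.exp(lam * (z - np.max(z)))
    p /= p.sum()
    return float(z[rng.choice(len(z), p=p)])

rng = np.random.default_rng(0)
x = np.array([1.0, -2.0])
W = np.array([[0.5, 0.0], [1.0, -0.5], [-1.0, 0.0]])   # k = 3 pieces, d = 2
b = np.zeros(3)

z = group_activations(x, W, b)      # array([ 0.5,  2. , -1. ])
print(maxout(z))                    # 2.0
print(probout(z, lam=1.0, rng=rng)) # one of the three activations, chosen at random
```

In this sketch, a very large `lam` makes the sampled output coincide with the maxout output, while `lam = 0` picks a group member uniformly.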

Machine learning, Neural and evolutionary computing