Infinite-dimensional Folded-in-time Deep Neural Networks

Published by Florian Stelzer

Publication date: 2021

Research field: Informatics Engineering

Paper language: English

Author: Florian Stelzer

Abstract

The method recently introduced in arXiv:2011.10115 realizes a deep neural network with just a single nonlinear element and delayed feedback. It is applicable for the description of physically implemented neural networks. In this work, we present an infinite-dimensional generalization, which allows for a more rigorous mathematical analysis and a higher flexibility in choosing the weight functions. Precisely speaking, the weights are described by Lebesgue integrable functions instead of step functions. We also provide a functional back-propagation algorithm, which enables gradient descent training of the weights. In addition, with a slight modification, our concept realizes recurrent neural networks.

98 - Florian Stelzer 2020

Deep neural networks are among the most widely applied machine learning tools showing outstanding performance in a broad range of tasks. We present a method for folding a deep neural network of arbitrary size into a single neuron with multiple time-delayed feedback loops. This single-neuron deep neural network comprises only a single nonlinearity and appropriately adjusted modulations of the feedback signals. The network states emerge in time as a temporal unfolding of the neuron's dynamics. By adjusting the feedback-modulation within the loops, we adapt the network's connection weights. These connection weights are determined via a back-propagation algorithm, where both the delay-induced and local network connections must be taken into account. Our approach can fully represent standard Deep Neural Networks (DNN), encompasses sparse DNNs, and extends the DNN concept toward dynamical systems implementations. The new method, which we call Folded-in-time DNN (Fit-DNN), exhibits promising performance in a set of benchmark tasks.
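The folding idea described in this abstract can be illustrated with a minimal discrete-time sketch. It is not the authors' implementation: it assumes equal-width layers, no bias terms, and an idealized neuron without continuous-time dynamics, and the names `standard_forward` and `folded_forward` are hypothetical. Each node of each layer is assigned one time slot of a single signal, and a connection from node j to node n in the next layer becomes a feedback loop with a delay of N + (n - j) slots whose modulation equals the corresponding weight.

```python
import numpy as np


def standard_forward(x0, weights, f=np.tanh):
    """Reference: ordinary layer-by-layer forward pass (no biases)."""
    x = x0
    for W in weights:
        x = f(W @ x)
    return x


def folded_forward(x0, weights, f=np.tanh):
    """Same network, evaluated by one nonlinearity acting sequentially in time.

    Assumption for this sketch: all layers have N nodes, so time slot
    k = layer * N + node. The 2N - 1 feedback delays are N + d slots
    for d = -(N - 1), ..., N - 1.
    """
    N = len(x0)
    L = len(weights)
    signal = np.zeros((L + 1) * N)
    signal[:N] = x0  # the input occupies the first N time slots
    for k in range(N, (L + 1) * N):
        layer, n = divmod(k, N)
        activation = 0.0
        for d in range(-(N - 1), N):
            j = n - d  # source node reached through the loop with delay N + d
            if 0 <= j < N:
                modulation = weights[layer - 1][n, j]
                activation += modulation * signal[k - (N + d)]
        signal[k] = f(activation)
    return signal[-N:]  # the last N slots hold the output layer


rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 4)) for _ in range(3)]
x0 = rng.normal(size=4)
print(np.allclose(standard_forward(x0, weights), folded_forward(x0, weights)))
```

In this idealized setting both functions compute the same output; the abstract's point about local connections concerns the continuous-time system, which this sketch deliberately leaves out.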

Machine Learning, Neural and Evolutionary Computing

On the Expressive Power of Deep Polynomial Neural Networks

348 - Joe Kileel , Matthew Trager , Joan Bruna 2019

We study deep neural networks with polynomial activations, particularly their expressive power. For a fixed architecture and activation degree, a polynomial neural network defines an algebraic map from weights to polynomials. The image of this map is the functional space associated to the network, and it is an irreducible algebraic variety upon taking closure. This paper proposes the dimension of this variety as a precise measure of the expressive power of polynomial neural networks. We obtain several theoretical results regarding this dimension as a function of architecture, including an exact formula for high activation degrees, as well as upper and lower bounds on layer widths in order for deep polynomial networks to fill the ambient functional space. We also present computational evidence that it is profitable in terms of expressiveness for layer widths to increase monotonically and then decrease monotonically. Finally, we link our study to favorable optimization properties when training weights, and we draw intriguing connections with tensor and polynomial decompositions.

Machine Learning, Neural and Evolutionary Computing, Algebraic Geometry

Training Deep Convolutional Neural Networks with Resistive Cross-Point Devices

120 - Tayfun Gokmen , O. Murat Onen , Wilfried Haensch 2017

In a previous work we detailed the requirements for obtaining a maximal performance benefit by implementing fully connected deep neural networks (DNNs) in the form of arrays of resistive devices for deep learning. Here we extend this concept of Resistive Processing Unit (RPU) devices towards convolutional neural networks (CNNs). We show how to map the convolutional layers to RPU arrays such that the parallelism of the hardware can be fully utilized in all three cycles of the backpropagation algorithm. We find that the noise and bound limitations imposed by the analog nature of the computations performed on the arrays affect the training accuracy of the CNNs. Noise and bound management techniques are presented that mitigate these problems without introducing any additional complexity in the analog circuits and that can be addressed by the digital circuits. In addition, we discuss digitally programmable update management and device variability reduction techniques that can be used selectively for some of the layers in a CNN. We show that a combination of all those techniques enables a successful application of the RPU concept for training CNNs. The techniques discussed here are more general and can be applied beyond CNN architectures, and therefore enable the applicability of the RPU approach to a large class of neural network architectures.

Machine Learning, Neural and Evolutionary Computing

Compression strategies and space-conscious representations for deep neural networks

405 - Giosuè Cataldo Marinò , Gregorio Ghidoli , Marco Frasca 2020

Recent advances in deep learning have made available large, powerful convolutional neural networks (CNN) with state-of-the-art performance in several real-world applications. Unfortunately, these large-sized models have millions of parameters, thus they are not deployable on resource-limited platforms (e.g. where RAM is limited). Compression of CNNs thereby becomes a critical problem to achieve memory-efficient and possibly computationally faster model representations. In this paper, we investigate the impact of lossy compression of CNNs by weight pruning and quantization, and lossless weight matrix representations based on source coding. We tested several combinations of these techniques on four benchmark datasets for classification and regression problems, achieving compression rates up to $165$ times, while preserving or improving the model performance.
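As a rough illustration of the two lossy steps named in this abstract, the following sketch applies magnitude-based weight pruning and uniform quantization to a single weight matrix. It is a generic textbook variant, not the specific procedure evaluated in the paper; the function names `prune_by_magnitude` and `quantize_uniform` and their parameters are assumptions made for this example.

```python
import numpy as np


def prune_by_magnitude(W, sparsity):
    """Zero out the fraction `sparsity` of entries with smallest magnitude."""
    k = int(sparsity * W.size)
    pruned = W.copy()
    if k == 0:
        return pruned
    # Magnitude of the k-th smallest entry serves as the pruning threshold.
    threshold = np.sort(np.abs(W), axis=None)[k - 1]
    pruned[np.abs(W) <= threshold] = 0.0
    return pruned


def quantize_uniform(W, n_levels):
    """Map each weight to the nearest of `n_levels` evenly spaced values.

    Returns integer codes and the codebook; `levels[codes]` reconstructs
    the quantized matrix.
    """
    lo, hi = float(W.min()), float(W.max())
    levels = np.linspace(lo, hi, n_levels)
    if hi == lo:
        return np.zeros(W.shape, dtype=np.int64), levels
    codes = np.rint((W - lo) / (hi - lo) * (n_levels - 1)).astype(np.int64)
    return codes, levels


rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
P = prune_by_magnitude(W, sparsity=0.5)
codes, levels = quantize_uniform(W, n_levels=16)
print(np.count_nonzero(P), np.max(np.abs(levels[codes] - W)))
```

Storing small integer codes plus a short codebook is what makes the quantized matrix cheaper to hold in memory; a lossless source-coding stage, as mentioned in the abstract, would then operate on those codes.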

Machine Learning, Neural and Evolutionary Computing

Temporally Folded Convolutional Neural Networks for Sequence Forecasting

183 - Matthias Weissenbacher 2020

In this work we propose a novel approach to utilize convolutional neural networks for time series forecasting. The time direction of the sequential data with spatial dimensions $D=1,2$ is considered democratically as the input of a spatiotemporal $(D+1)$-dimensional convolutional neural network. The latter then reduces the data stream from $D+1$ to $D$ dimensions, followed by an incriminator cell which uses this information to forecast the subsequent time step. We empirically compare this strategy to convolutional LSTMs and LSTMs on their performance on the sequential MNIST and the JSB Chorales dataset, respectively. We conclude that temporally folded convolutional neural networks (TFCs) may outperform the conventional recurrent strategies.

Machine Learning, Computer Vision and Pattern Recognition