Finding Input Characterizations for Output Properties in ReLU Neural Networks

78 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Saket Dingliwal

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Saket Dingliwal - Divyansh Pareek - Jatin Arora

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Deep Neural Networks (DNNs) have emerged as a powerful mechanism and are being increasingly deployed in real-world safety-critical domains. Despite the widespread success, their complex architecture makes proving any formal guarantees about them difficult. Identifying how logical notions of high-level correctness relate to the complex low-level network architecture is a significant challenge. In this project, we extend the ideas presented in and introduce a way to bridge the gap between the architecture and the high-level specifications. Our key insight is that instead of directly proving the safety properties that are required, we first prove properties that relate closely to the structure of the neural net and use them to reason about the safety properties. We build theoretical foundations for our approach, and empirically evaluate the performance through various experiments, achieving promising results than the existing approach by identifying a larger region of input space that guarantees a certain property on the output.

قيم البحث

378 - Hanie Sedghi , Anima Anandkumar 2016

We consider the problem of training input-output recurrent neural networks (RNN) for sequence labeling tasks. We propose a novel spectral approach for learning the network parameters. It is based on decomposition of the cross-moment tensor between th e output and a non-linear transformation of the input, based on score functions. We guarantee consistent learning with polynomial sample and computational complexity under transparent conditions such as non-degeneracy of model parameters, polynomial activations for the neurons, and a Markovian evolution of the input sequence. We also extend our results to Bidirectional RNN which uses both previous and future information to output the label at each time point, and is employed in many NLP tasks such as POS tagging.

التعلم الآلي الحوسبة العصبية والتطورية التعلم الالي

Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations

109 - Behnam Neyshabur , Yuhuai Wu , Ruslan Salakhutdinov 2016

We investigate the parameter-space geometry of recurrent neural networks (RNNs), and develop an adaptation of path-SGD optimization method, attuned to this geometry, that can learn plain RNNs with ReLU activations. On several datasets that require ca pturing long-term dependency structure, we show that path-SGD can significantly improve trainability of ReLU RNNs compared to RNNs trained with SGD, even with various recently suggested initialization schemes.

التعلم الآلي الحوسبة العصبية والتطورية

Finding online neural update rules by learning to remember

71 - Karol Gregor 2020

We investigate learning of the online local update rules for neural activations (bodies) and weights (synapses) from scratch. We represent the states of each weight and activation by small vectors, and parameterize their updates using (meta-) neural networks. Different neuron types are represented by different embedding vectors which allows the same two functions to be used for all neurons. Instead of training directly for the objective using evolution or long term back-propagation, as is commonly done in similar systems, we motivate and study a different objective: That of remembering past snippets of experience. We explain how this objective relates to standard back-propagation training and other forms of learning. We train for this objective using short term back-propagation and analyze the performance as a function of both the different network types and the difficulty of the problem. We find that this analysis gives interesting insights onto what constitutes a learning rule. We also discuss how such system could form a natural substrate for addressing topics such as episodic memories, meta-learning and auxiliary objectives.

التعلم الآلي الحوسبة العصبية والتطورية التعلم الالي

Equilibrium Propagation for Complete Directed Neural Networks

137 - Matilde Tristany Farinha , Sergio Pequito , Pedro A. Santos 2020

Artificial neural networks, one of the most successful approaches to supervised learning, were originally inspired by their biological counterparts. However, the most successful learning algorithm for artificial neural networks, backpropagation, is c onsidered biologically implausible. We contribute to the topic of biologically plausible neuronal learning by building upon and extending the equilibrium propagation learning framework. Specifically, we introduce: a new neuronal dynamics and learning rule for arbitrary network architectures; a sparsity-inducing method able to prune irrelevant connections; a dynamical-systems characterization of the models, using Lyapunov theory.

التعلم الآلي الحوسبة العصبية والتطورية التعلم الالي

Efficient Algorithms for Learning Depth-2 Neural Networks with General ReLU Activations

192 - Pranjal Awasthi , Alex Tang , Aravindan Vijayaraghavan 2021

We present polynomial time and sample efficient algorithms for learning an unknown depth-2 feedforward neural network with general ReLU activations, under mild non-degeneracy assumptions. In particular, we consider learning an unknown network of the form $f(x) = {a}^{mathsf{T}}sigma({W}^mathsf{T}x+b)$, where $x$ is drawn from the Gaussian distribution, and $sigma(t) := max(t,0)$ is the ReLU activation. Prior works for learning networks with ReLU activations assume that the bias $b$ is zero. In order to deal with the presence of the bias terms, our proposed algorithm consists of robustly decomposing multiple higher order tensors arising from the Hermite expansion of the function $f(x)$. Using these ideas we also establish identifiability of the network parameters under minimal assumptions.

التعلم الآلي بنى وهياكل البيانات والخوارزميات التعلم الالي