On the Decision Boundaries of Neural Networks: A Tropical Geometry Perspective

Published by Adel Bibi

Publication date: 2020

Research field: Informatics Engineering, Mathematical Statistics

Paper language: English

Authors: Motasem Alfarra, Adel Bibi, Hasan Hammoud

Machine learning

This work tackles the problem of characterizing and understanding the decision boundaries of neural networks with piecewise linear activations. We use tropical geometry, a new development in the area of algebraic geometry, to characterize the decision boundaries of a simple network of the form (Affine, ReLU, Affine). Our main finding is that the decision boundaries are a subset of a tropical hypersurface, which is intimately related to a polytope formed by the convex hull of two zonotopes. The generators of these zonotopes are functions of the network parameters. This geometric characterization provides new perspectives on three tasks. (i) We propose a new tropical perspective on the lottery ticket hypothesis, where we view the effect of different initializations on the tropical geometric representation of a network's decision boundaries. (ii) Moreover, we propose new tropical-based optimization reformulations that directly influence the decision boundaries of the network for the task of network pruning. (iii) Finally, we discuss the reformulation of the generation of adversarial attacks in a tropical sense. We demonstrate that one can construct adversaries in a new tropical setting by perturbing a set of network parameters, which in turn perturbs a specific set of decision boundaries.
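To make the geometric objects in the abstract more concrete, the following is a minimal sketch of a planar zonotope (the Minkowski sum of line segments) and of the convex hull of two such zonotopes. The generator values and the function names `zonotope_vertices` and `hull_of_two_zonotopes` are illustrative assumptions; the sketch does not derive generators from network parameters as the paper does, and it enumerates all 2^n corner sums, so it is only suitable for a handful of generators.

```python
from itertools import product


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def zonotope_vertices(generators):
    """Vertices of {sum_i t_i * g_i : 0 <= t_i <= 1} for 2-D generators g_i."""
    candidates = []
    for choice in product((0, 1), repeat=len(generators)):
        x = sum(t * g[0] for t, g in zip(choice, generators))
        y = sum(t * g[1] for t, g in zip(choice, generators))
        candidates.append((x, y))
    return convex_hull(candidates)


def hull_of_two_zonotopes(gens_a, gens_b):
    """Convex hull of the union of two zonotopes (hypothetical helper)."""
    return convex_hull(zonotope_vertices(gens_a) + zonotope_vertices(gens_b))


# Illustrative generators only; they are not taken from any trained network.
hexagon = zonotope_vertices([(1, 0), (0, 1), (1, 1)])
combined = hull_of_two_zonotopes([(1, 0), (0, 1)], [(2, 0)])
```

With three pairwise non-parallel generators the zonotope is a hexagon, while the second call merges a unit square with a segment into a single quadrilateral.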

Liwen Zhang, Gregory Naitzat, 2018

We establish, for the first time, connections between feedforward neural networks with ReLU activation and tropical geometry: we show that the family of such neural networks is equivalent to the family of tropical rational maps. Among other things, we deduce that feedforward ReLU neural networks with one hidden layer can be characterized by zonotopes, which serve as building blocks for deeper networks; we relate decision boundaries of such neural networks to tropical hypersurfaces, a major object of study in tropical geometry; and we prove that linear regions of such neural networks correspond to vertices of polytopes associated with tropical rational functions. An insight from our tropical formulation is that a deeper network is exponentially more expressive than a shallow network.
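The flavour of the tropical rational statement can be illustrated on a one-hidden-layer network: its scalar output splits into a difference of two convex piecewise-linear functions, obtained by grouping hidden units by the sign of their output weight. The sketch below is a simplified illustration under assumed names (`network`, `convex_parts`) and arbitrary real-valued parameters; it is not the paper's construction, which works with tropical polynomials.

```python
def relu(t):
    return max(t, 0.0)


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def network(x, A, b, c, d):
    """Scalar output of an (Affine, ReLU, Affine) network: c . relu(A x + b) + d."""
    return sum(cj * relu(dot(aj, x) + bj) for aj, bj, cj in zip(A, b, c)) + d


def convex_parts(x, A, b, c, d):
    """Return (H(x), Q(x)) such that network(x) == H(x) - Q(x).

    H collects units with non-negative output weight (plus the bias d),
    Q collects units with negative output weight. Each is a non-negative
    combination of ReLUs of affine functions, hence convex.
    """
    H = d
    Q = 0.0
    for aj, bj, cj in zip(A, b, c):
        h = relu(dot(aj, x) + bj)
        if cj >= 0:
            H += cj * h
        else:
            Q += -cj * h
    return H, Q


# Arbitrary illustrative parameters (assumptions, not from the paper).
A = [[1.0, 2.0], [-1.0, 0.5], [0.0, -1.0]]
b = [0.5, -0.25, 1.0]
c = [2.0, -1.5, 0.5]
d = -1.0

H, Q = convex_parts([0.5, -1.0], A, b, c, d)
```

Because both parts are convex and piecewise linear, each can be written as a maximum of finitely many affine functions, which is the max-plus form that tropical geometry studies.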

Machine learning, Algebraic geometry

On Learning Over-parameterized Neural Networks: A Functional Approximation Perspective

Lili Su, Pengkun Yang, 2019

We consider training over-parameterized two-layer neural networks with Rectified Linear Unit (ReLU) activations using the gradient descent (GD) method. Inspired by a recent line of work, we study the evolutions of network prediction errors across GD iterations, which can be neatly described in a matrix form. When the network is sufficiently over-parameterized, these matrices individually approximate an integral operator which is determined by the feature vector distribution $\rho$ only. Consequently, the GD method can be viewed as approximately applying the powers of this integral operator on the underlying/target function $f^*$ that generates the responses/labels. We show that if $f^*$ admits a low-rank approximation with respect to the eigenspaces of this integral operator, then the empirical risk decreases to this low-rank approximation error at a linear rate which is determined by $f^*$ and $\rho$ only, i.e., the rate is independent of the sample size $n$. Furthermore, if $f^*$ has zero low-rank approximation error, then, as long as the width of the neural network is $\Omega(n \log n)$, the empirical risk decreases to $\Theta(1/\sqrt{n})$. To the best of our knowledge, this is the first result showing the sufficiency of nearly-linear network over-parameterization. We provide an application of our general results to the setting where $\rho$ is the uniform distribution on the spheres and $f^*$ is a polynomial. Throughout this paper, we consider the scenario where the input dimension $d$ is fixed.

Machine learning

On Hiding Neural Networks Inside Neural Networks

Chuan Guo, Ruihan Wu, Kilian Q. Weinberger, 2020

Modern neural networks often contain significantly more parameters than the size of their training data. We show that this excess capacity provides an opportunity for embedding secret machine learning models within a trained neural network. Our novel framework hides the existence of a secret neural network with arbitrary desired functionality within a carrier network. We prove theoretically that the secret network's detection is computationally infeasible and demonstrate empirically that the carrier network does not compromise the secret network's disguise. Our paper introduces a previously unknown steganographic technique that can be exploited by adversaries if left unchecked.

Machine learning

Towards Interpretable Neural Networks: An Exact Transformation to Multi-Class Multivariate Decision Trees

Duy T. Nguyen, Kathryn E. Kasmarik, Hussein A. Abbass, 2020

Artificial neural networks (ANNs) are commonly labelled as black-boxes, lacking interpretability. This hinders human understanding of ANNs' behaviors. A need exists to generate a meaningful sequential logic for the production of a specific output. Decision trees exhibit better interpretability and expressive power due to their representation language and the existence of efficient algorithms to generate rules. Growing a decision tree based on the available data could produce larger than necessary trees or trees that do not generalise well. In this paper, we introduce two novel multivariate decision tree (MDT) algorithms for rule extraction from an ANN: an Exact-Convertible Decision Tree (EC-DT) and an Extended C-Net algorithm to transform a neural network with Rectified Linear Unit activation functions into a representative tree which can be used to extract multivariate rules for reasoning. While the EC-DT translates the ANN in a layer-wise manner to represent exactly the decision boundaries implicitly learned by the hidden layers of the network, the Extended C-Net inherits the decompositional approach from EC-DT and combines it with a C5 tree learning algorithm to construct the decision rules. The results suggest that while EC-DT is superior in preserving the structure and the accuracy of the ANN, Extended C-Net generates the most compact and highly effective trees from the ANN. Both proposed MDT algorithms generate rules including combinations of multiple attributes for precise interpretation of decision-making processes.
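One reason an exact tree-style translation of a ReLU network is possible is that, once the on/off pattern of the hidden units is fixed, the network reduces to a single affine rule over the inputs. The sketch below illustrates that fact for one hidden layer; it is not the EC-DT or Extended C-Net algorithm, and the names `activation_pattern` and `leaf_rule` as well as the parameter values are assumptions made for illustration.

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def forward(x, W1, b1, w2, b2):
    """Scalar output of a one-hidden-layer ReLU network."""
    hidden = [max(dot(row, x) + bi, 0.0) for row, bi in zip(W1, b1)]
    return dot(w2, hidden) + b2


def activation_pattern(x, W1, b1):
    """Tuple of 0/1 flags: which hidden units are active at input x."""
    return tuple(1 if dot(row, x) + bi > 0 else 0 for row, bi in zip(W1, b1))


def leaf_rule(pattern, W1, b1, w2, b2):
    """Affine rule (weights, bias) the network follows wherever `pattern` holds."""
    n_inputs = len(W1[0])
    weights = [0.0] * n_inputs
    bias = b2
    for active, row, bi, out_w in zip(pattern, W1, b1, w2):
        if active:
            for k in range(n_inputs):
                weights[k] += out_w * row[k]
            bias += out_w * bi
    return weights, bias


# Arbitrary illustrative parameters (assumptions, not from the paper).
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0]]
b1 = [0.0, -1.0, 0.5]
w2 = [1.0, -2.0, 0.5]
b2 = 0.25

x = [1.0, 0.5]
pattern = activation_pattern(x, W1, b1)
weights, bias = leaf_rule(pattern, W1, b1, w2, b2)
rule_output = dot(weights, x) + bias
```

Each distinct activation pattern plays the role of a path of multivariate tests (the signs of the hidden pre-activations), and the affine rule plays the role of the leaf reached by that path.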

Machine learning

On sample complexity of neural networks

Alexander Usvyatsov, 2019

We consider functions defined by deep neural networks as definable objects in an o-minimal expansion of the real field, and derive an almost linear (in the number of weights) bound on the sample complexity of such networks.

Machine learning