Tighter risk certificates for neural networks

196 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Maria Perez-Ortiz

تاريخ النشر 2020

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Maria Perez-Ortiz - Omar Rivasplata - John Shawe-Taylor andn Csaba Szepesvari

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

This paper presents an empirical study regarding training probabilistic neural networks using training objectives derived from PAC-Bayes bounds. In the context of probabilistic neural networks, the output of training is a probability distribution over network weights. We present two training objectives, used here for the first time in connection with training neural networks. These two training objectives are derived from tight PAC-Bayes bounds. We also re-implement a previously used training objective based on a classical PAC-Bayes bound, to compare the properties of the predictors learned using the different training objectives. We compute risk certificates that are valid on any unseen examples for the learnt predictors. We further experiment with different types of priors on the weights (both data-free and data-dependent priors) and neural network architectures. Our experiments on MNIST and CIFAR-10 show that our training methods produce competitive test set errors and non-vacuous risk bounds with much tighter values than previous results in the literature, showing promise not only to guide the learning algorithm through bounding the risk but also for model selection. These observations suggest that the methods studied here might be good candidates for self-certified learning, in the sense of certifying the risk on any unseen data without the need for data-splitting protocols.

قيم البحث

91 - Christian Sprecher , Marc Fischer , Dimitar I. Dimitrov 2021

Existing neural network verifiers compute a proof that each input is handled correctly under a given perturbation by propagating a convex set of reachable values at each layer. This process is repeated independently for each input (e.g., image) and p erturbation (e.g., rotation), leading to an expensive overall proof effort when handling an entire dataset. In this work we introduce a new method for reducing this verification cost based on the key insight that convex sets obtained at intermediate layers can overlap across different inputs and perturbations. Leveraging this insight, we introduce the general concept of shared certificates, enabling proof effort reuse across multiple inputs and driving down overall verification costs. We validate our insight via an extensive experimental evaluation and demonstrate the effectiveness of shared certificates on a range of datasets and attack specifications including geometric, patch and $ell_infty$ input perturbations.

التعلم الآلي التشفير والأمن

Temporally Folded Convolutional Neural Networks for Sequence Forecasting

183 - Matthias Weissenbacher 2020

In this work we propose a novel approach to utilize convolutional neural networks for time series forecasting. The time direction of the sequential data with spatial dimensions $D=1,2$ is considered democratically as the input of a spatiotemporal $(D +1)$-dimensional convolutional neural network. Latter then reduces the data stream from $D +1 to D$ dimensions followed by an incriminator cell which uses this information to forecast the subsequent time step. We empirically compare this strategy to convolutional LSTMs and LSTMs on their performance on the sequential MNIST and the JSB chorals dataset, respectively. We conclude that temporally folded convolutional neural networks (TFCs) may outperform the conventional recurrent strategies.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Effective Version Space Reduction for Convolutional Neural Networks

123 - Jiayu Liu , Ioannis Chiotellis , Rudolph Triebel 2020

In active learning, sampling bias could pose a serious inconsistency problem and hinder the algorithm from finding the optimal hypothesis. However, many methods for neural networks are hypothesis space agnostic and do not address this problem. We exa mine active learning with convolutional neural networks through the principled lens of version space reduction. We identify the connection between two approaches---prior mass reduction and diameter reduction---and propose a new diameter-based querying method---the minimum Gibbs-vote disagreement. By estimating version space diameter and bias, we illustrate how version space of neural networks evolves and examine the realizability assumption. With experiments on MNIST, Fashion-MNIST, SVHN and STL-10 datasets, we demonstrate that diameter reduction methods reduce the version space more effectively and perform better than prior mass reduction and other baselines, and that the Gibbs vote disagreement is on par with the best query method.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Deep Polynomial Neural Networks

101 - Grigorios Chrysos , Stylianos Moschoglou , Giorgos Bouritsas 2020

Deep Convolutional Neural Networks (DCNNs) are currently the method of choice both for generative, as well as for discriminative learning in computer vision and machine learning. The success of DCNNs can be attributed to the careful selection of thei r building blocks (e.g., residual blocks, rectifiers, sophisticated normalization schemes, to mention but a few). In this paper, we propose $Pi$-Nets, a new class of function approximators based on polynomial expansions. $Pi$-Nets are polynomial neural networks, i.e., the output is a high-order polynomial of the input. The unknown parameters, which are naturally represented by high-order tensors, are estimated through a collective tensor factorization with factors sharing. We introduce three tensor decompositions that significantly reduce the number of parameters and show how they can be efficiently implemented by hierarchical neural networks. We empirically demonstrate that $Pi$-Nets are very expressive and they even produce good results without the use of non-linear activation functions in a large battery of tasks and signals, i.e., images, graphs, and audio. When used in conjunction with activation functions, $Pi$-Nets produce state-of-the-art results in three challenging tasks, i.e. image generation, face verification and 3D mesh representation learning. The source code is available at url{https://github.com/grigorisg9gr/polynomial_nets}.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي

Statistical Guarantees for the Robustness of Bayesian Neural Networks

357 - Luca Cardelli , Marta Kwiatkowska , Luca Laurenti 2019

We introduce a probabilistic robustness measure for Bayesian Neural Networks (BNNs), defined as the probability that, given a test point, there exists a point within a bounded set such that the BNN prediction differs between the two. Such a measure c an be used, for instance, to quantify the probability of the existence of adversarial examples. Building on statistical verification techniques for probabilistic models, we develop a framework that allows us to estimate probabilistic robustness for a BNN with statistical guarantees, i.e., with a priori error and confidence bounds. We provide experimental comparison for several approximate BNN inference techniques on image classification tasks associated to MNIST and a two-class subset of the GTSRB dataset. Our results enable quantification of uncertainty of BNN predictions in adversarial settings.

التعلم الآلي الرؤية الحاسوبية وتمييز الأنماط التعلم الالي