
FuncNN: An R Package to Fit Deep Neural Networks Using Generalized Input Spaces

Posted by Barinder Thind
Publication date: 2020
Paper language: English





Neural networks have excelled at regression and classification problems when the input space consists of scalar variables. As a result of this proficiency, several popular packages have been developed that allow users to easily fit these kinds of models. However, the methodology has excluded the use of functional covariates, and to date there exists no software that allows users to build deep learning models with this generalized input space. To the best of our knowledge, the functional neural network (FuncNN) library is the first such package in any programming language; the library has been developed for R and is built on top of the keras architecture. Throughout this paper, several functions are introduced that provide users with an avenue to easily build models, generate predictions, and run cross-validations. A summary of the underlying methodology is also presented. The ultimate contribution is a package that provides a set of general modelling and diagnostic tools for data problems in which there exist both functional and scalar covariates.
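
To illustrate the workflow described above (building a model, generating predictions, and running cross-validations), a minimal R sketch is given below. The function names fnn.fit, fnn.predict, and fnn.cv and their argument names follow the package description but are written here as assumptions to be checked against the FuncNN documentation; the data objects (y_train, curves_train, scalars_train, and their test counterparts) are hypothetical.

# Minimal sketch of the FuncNN workflow; argument names are assumptions
# based on the package description -- consult the package help pages.
library(FuncNN)

# y_train: numeric response vector
# curves_train: functional covariates evaluated on a common grid
# scalars_train: ordinary scalar covariates
fit <- fnn.fit(resp = y_train,
               func_cov = curves_train,
               scalar_cov = scalars_train,
               basis_choice = c("fourier"),
               num_basis = c(5),
               hidden_layers = 2,
               neurons_per_layer = c(32, 32),
               epochs = 100)

# Predictions for held-out observations
preds <- fnn.predict(fit,
                     func_cov = curves_test,
                     scalar_cov = scalars_test,
                     basis_choice = c("fourier"),
                     num_basis = c(5))

# K-fold cross-validation over the same inputs
cv_out <- fnn.cv(nfolds = 5,
                 resp = y_train,
                 func_cov = curves_train,
                 scalar_cov = scalars_train)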


Read also

During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency, there are, as of early 2020, no publicized deployments of Bayesian neural networks in industrial practice. In this work we cast doubt on the current understanding of Bayes posteriors in popular deep neural networks: we demonstrate through careful MCMC sampling that the posterior predictive induced by the Bayes posterior yields systematically worse predictions compared to simpler methods, including point estimates obtained from SGD. Furthermore, we demonstrate that predictive performance is improved significantly through the use of a cold posterior that overcounts evidence. Such cold posteriors sharply deviate from the Bayesian paradigm but are commonly used as a heuristic in Bayesian deep learning papers. We put forward several hypotheses that could explain cold posteriors and evaluate the hypotheses through experiments. Our work questions the goal of accurate posterior approximations in Bayesian deep learning: if the true Bayes posterior is poor, what is the use of more accurate approximations? Instead, we argue that it is timely to focus on understanding the origin of the improved performance of cold posteriors.
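
Concretely, the "cold posterior" referred to above tempers the Bayes posterior with a temperature $T < 1$. A sketch in generic notation, where $U(\theta)$ denotes the negative log joint density and is introduced here only for illustration:

$$p_T(\theta \mid D) \propto \exp\{-U(\theta)/T\}, \qquad U(\theta) = -\sum_{i=1}^{n} \log p(y_i \mid x_i, \theta) - \log p(\theta),$$

so that $T = 1$ recovers the usual Bayes posterior and $T < 1$ sharpens it, effectively overcounting the evidence.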
We consider the variation space corresponding to a dictionary of functions in $L^2(\Omega)$ and present the basic theory of approximation in these spaces. Specifically, we compare the definition based on integral representations with the definition in terms of convex hulls. We show that in many cases, including the dictionaries corresponding to shallow ReLU$^k$ networks and a dictionary of decaying Fourier modes, the two definitions coincide. We also give a partial characterization of the variation space for shallow ReLU$^k$ networks and show that the variation space with respect to the dictionary of decaying Fourier modes corresponds to the Barron spectral space.
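
For reference, the two definitions being compared can be written roughly as follows (the notation is illustrative, not taken from the paper): for a dictionary $\mathbb{D} \subset L^2(\Omega)$, the convex-hull definition sets

$$\|f\|_{\mathcal{K}_1(\mathbb{D})} = \inf\{\, c > 0 : f \in c\,\overline{\mathrm{conv}}(\mathbb{D} \cup -\mathbb{D}) \,\},$$

while the integral-representation definition takes the infimum of the total variation $\|\mu\|_{TV}$ over measures $\mu$ on the dictionary satisfying $f(x) = \int_{\mathbb{D}} d(x)\, d\mu(d)$.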
We propose a novel Bayesian neural network architecture that can learn invariances from data alone by inferring a posterior distribution over different weight-sharing schemes. We show that our model outperforms other non-invariant architectures when trained on datasets that contain specific invariances. The same holds true when no data augmentation is performed.
We propose a novel method, termed SuMo-net, that uses partially monotonic neural networks to learn a time-to-event distribution from a sample of covariates and right-censored times. SuMo-net models the survival function and the density jointly, and optimizes the likelihood for right-censored data instead of the often-used partial likelihood. The method does not make assumptions about the true survival distribution and avoids computationally expensive integration of the hazard function. We evaluate the performance of the method on a range of datasets and find competitive performance across different metrics and improved computational time of making new predictions.
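
The right-censored likelihood that SuMo-net optimizes (rather than a partial likelihood) has the standard form, sketched here in generic notation:

$$\mathcal{L}(\theta) = \prod_{i=1}^{n} f_\theta(t_i \mid x_i)^{\delta_i}\, S_\theta(t_i \mid x_i)^{1-\delta_i},$$

where $\delta_i = 1$ marks an observed event and $\delta_i = 0$ a right-censored time; modelling the survival function $S_\theta$ with a monotonic network yields the density as $f_\theta = -\partial S_\theta / \partial t$, avoiding integration of the hazard.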
It has long been known that a single-layer fully-connected neural network with an i.i.d. prior over its parameters is equivalent to a Gaussian process (GP), in the limit of infinite network width. This correspondence enables exact Bayesian inference for infinite width neural networks on regression tasks by means of evaluating the corresponding GP. Recently, kernel functions which mimic multi-layer random neural networks have been developed, but only outside of a Bayesian framework. As such, previous work has not identified that these kernels can be used as covariance functions for GPs and allow fully Bayesian prediction with a deep neural network. In this work, we derive the exact equivalence between infinitely wide deep networks and GPs. We further develop a computationally efficient pipeline to compute the covariance function for these GPs. We then use the resulting GPs to perform Bayesian inference for wide deep neural networks on MNIST and CIFAR-10. We observe that trained neural network accuracy approaches that of the corresponding GP with increasing layer width, and that the GP uncertainty is strongly correlated with trained network prediction error. We further find that test performance increases as finite-width trained networks are made wider and more similar to a GP, and thus that GP predictions typically outperform those of finite-width networks. Finally we connect the performance of these GPs to the recent theory of signal propagation in random neural networks.
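
The correspondence described above is usually expressed through a layer-wise covariance recursion; a sketch in generic notation (weight variance $\sigma_w^2$, bias variance $\sigma_b^2$, nonlinearity $\phi$) is

$$K^{(l)}(x, x') = \sigma_b^2 + \sigma_w^2\, \mathbb{E}_{z \sim \mathcal{GP}(0,\, K^{(l-1)})}\big[\phi(z(x))\, \phi(z(x'))\big],$$

so that the output of an infinitely wide depth-$L$ network is a Gaussian process with covariance $K^{(L)}$.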
