Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Adaptive Variational Bayesian Inference for Sparse Deep Neural Network

72 0 0.0 ( 0 )

Download Cite

Added by Guang Cheng

Publication date 2019

fields Mathematical Statistics

and research's language is English

Authors Jincheng Bai - Qifan Song - Guang Cheng

Statistics Theory Statistics Theory

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In this work, we focus on variational Bayesian inference on the sparse Deep Neural Network (DNN) modeled under a class of spike-and-slab priors. Given a pre-specified sparse DNN structure, the corresponding variational posterior contraction rate is characterized that reveals a trade-off between the variational error and the approximation error, which are both determined by the network structural complexity (i.e., depth, width and sparsity). However, the optimal network structure, which strikes the balance of the aforementioned trade-off and yields the best rate, is generally unknown in reality. Therefore, our work further develops an {em adaptive} variational inference procedure that can automatically select a reasonably good (data-dependent) network structure that achieves the best contraction rate, without knowing the optimal network structure. In particular, when the true function is H{o}lder smooth, the adaptive variational inference is capable to attain (near-)optimal rate without the knowledge of smoothness level. The above rate still suffers from the curse of dimensionality, and thus motivates the teacher-student setup, i.e., the true function is a sparse DNN model, under which the rate only logarithmically depends on the input dimension.

rate research

Bayesian inference for nonlinear inverse problems

181 - Vladimir Spokoiny 2019

Bayesian methods are actively used for parameter identification and uncertainty quantification when solving nonlinear inverse problems with random noise. However, there are only few theoretical results justifying the Bayesian approach. Recent papers, see e.g. cite{Nickl2017,lu2017bernsteinvon} and references therein, illustrate the main difficulties and challenges in studying the properties of the posterior distribution in the nonparametric setup. This paper offers a new approach for study the frequentist properties of the nonparametric Bayes procedures. The idea of the approach is to relax the nonlinear structural equation by introducing an auxiliary functional parameter and replacing the structural equation with a penalty and by imposing a prior on the auxiliary parameter. For the such extended model, we state sharp bounds on posterior concentration and on the accuracy of the penalized MLE and on Gaussian approximation of the posterior, and a number of further results. All the bounds are given in terms of effective dimension, and we show that the proposed calming device does not significantly affect this value.

Statistics Theory Statistics Theory

Estimating the Rate Constant from Biosensor Data via an Adaptive Variational Bayesian Approach

63 - Y. Zhang , Z. Yao , P. Forssen 2019

The means to obtain the rate constants of a chemical reaction is a fundamental open problem in both science and the industry. Traditional techniques for finding rate constants require either chemical modifications of the reactants or indirect measurements. The rate constant map method is a modern technique to study binding equilibrium and kinetics in chemical reactions. Finding a rate constant map from biosensor data is an ill-posed inverse problem that is usually solved by regularization. In this work, rather than finding a deterministic regularized rate constant map that does not provide uncertainty quantification of the solution, we develop an adaptive variational Bayesian approach to estimate the distribution of the rate constant map, from which some intrinsic properties of a chemical reaction can be explored, including information about rate constants. Our new approach is more realistic than the existing approaches used for biosensors and allows us to estimate the dynamics of the interactions, which are usually hidden in a deterministic approximate solution. We verify the performance of the new proposed method by numerical simulations, and compare it with the Markov chain Monte Carlo algorithm. The results illustrate that the variational method can reliably capture the posterior distribution in a computationally efficient way. Finally, the developed method is also tested on the real biosensor data (parathyroid hormone), where we provide two novel analysis tools~-- the thresholding contour map and the high order moment map -- to estimate the number of interactions as well as their rate constants.

Statistics Theory Statistics Theory

On Bayesian based adaptive confidence sets for linear functionals

323 - Botond Szabo 2014

We consider the problem of constructing Bayesian based confidence sets for linear functionals in the inverse Gaussian white noise model. We work with a scale of Gaussian priors indexed by a regularity hyper-parameter and apply the data-driven (slightly modified) marginal likelihood empirical Bayes method for the choice of this hyper-parameter. We show by theory and simulations that the credible sets constructed by this method have sub-optimal behaviour in general. However, by assuming self-similarity the credible sets have rate-adaptive size and optimal coverage. As an application of these results we construct $L_{infty}$-credible bands for the true functional parameter with adaptive size and optimal coverage under self-similarity constraint.

Statistics Theory Statistics Theory

Cytometry inference through adaptive atomic deconvolution

88 - Manon Costa 2017

In this paper we consider a statistical estimation problem known as atomic deconvolution. Introduced in reliability, this model has a direct application when considering biological data produced by flow cytometers. In these experiments, biologists measure the fluorescence emission of treated cells and compare them with their natural emission to study the presence of specific molecules on the cells surface. They observe a signal which is composed of a noise (the natural fluorescence) plus some additional signal related to the quantity of molecule present on the surface if any. From a statistical point of view, we aim at inferring the percentage of cells expressing the selected molecule and the probability distribution function associated with its fluorescence emission. We propose here an adap-tive estimation procedure based on a previous deconvolution procedure introduced by [vEGS08, GvES11]. For both estimating the mixing parameter and the mixing density automatically, we use the Lepskii method based on the optimal choice of a bandwidth using a bias-variance decomposition. We then derive some concentration inequalities for our estimators and obtain the convergence rates, that are shown to be minimax optimal (up to some log terms) in Sobolev classes. Finally, we apply our algorithm on simulated and real biological data.

Statistics Theory Statistics Theory

Statistical inference in sparse high-dimensional additive models

110 - Karl Gregory , Enno Mammen , Martin Wahl 2016

In this paper we discuss the estimation of a nonparametric component $f_1$ of a nonparametric additive model $Y=f_1(X_1) + ...+ f_q(X_q) + epsilon$. We allow the number $q$ of additive components to grow to infinity and we make sparsity assumptions about the number of nonzero additive components. We compare this estimation problem with that of estimating $f_1$ in the oracle model $Z= f_1(X_1) + epsilon$, for which the additive components $f_2,dots,f_q$ are known. We construct a two-step presmoothing-and-resmoothing estimator of $f_1$ and state finite-sample bounds for the difference between our estimator and some smoothing estimators $hat f_1^{text{(oracle)}}$ in the oracle model. In an asymptotic setting these bounds can be used to show asymptotic equivalence of our estimator and the oracle estimators; the paper thus shows that, asymptotically, under strong enough sparsity conditions, knowledge of $f_2,dots,f_q$ has no effect on estimation accuracy. Our first step is to estimate $f_1$ with an undersmoothed estimator based on near-orthogonal projections with a group Lasso bias correction. We then construct pseudo responses $hat Y$ by evaluating a debiased modification of our undersmoothed estimator of $f_1$ at the design points. In the second step the smoothing method of the oracle estimator $hat f_1^{text{(oracle)}}$ is applied to a nonparametric regression problem with responses $hat Y$ and covariates $X_1$. Our mathematical exposition centers primarily on establishing properties of the presmoothing estimator. We present simulation results demonstrating close-to-oracle performance of our estimator in practical applications.

Statistics Theory Statistics Theory

comments

Fetching comments

Higher Institute for Applied Sciences and Technology

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Adaptive Variational Bayesian Inference for Sparse Deep Neural Network

Ask ChatGPT about the research

No Arabic abstract

Read More