A Sieve Stochastic Gradient Descent Estimator for Online Nonparametric Regression in Sobolev ellipsoids

Added by Tianyu Zhang

Publication date 2021

Field Mathematical Statistics

Language English

Authors Tianyu Zhang - Noah Simon

The goal of regression is to recover, from noisy observations, an unknown underlying function that best links a set of predictors to an outcome. In nonparametric regression, one assumes that the regression function belongs to a pre-specified infinite-dimensional function space (the hypothesis space). In the online setting, when the observations come in a stream, it is computationally preferable to update an estimate iteratively rather than to refit an entire model repeatedly. Inspired by nonparametric sieve estimation and stochastic approximation methods, we propose a sieve stochastic gradient descent estimator (Sieve-SGD) for the case where the hypothesis space is a Sobolev ellipsoid. We show that Sieve-SGD has rate-optimal MSE under a set of simple and direct conditions. We also show that, under some conditions on the distribution of the predictors, the Sieve-SGD estimator can be constructed with low time expense and requires almost minimal memory usage among all statistically rate-optimal estimators.
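The idea in the abstract above, stochastic gradient steps taken over a slowly growing set of basis functions, can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the cosine basis on [0, 1], the sieve size growing like i^(1/(2s+1)), the step size `step_scale * i^(-1/(2s+1))`, the per-component weights `j^(-2*omega)`, and the running average of the iterates. The names `cosine_basis`, `sieve_sgd`, and `predict` are hypothetical.

```python
import numpy as np


def cosine_basis(x, J):
    """First J cosine basis functions on [0, 1]: 1, sqrt(2) * cos(pi * j * x).

    Accepts a scalar (returns shape (J,)) or a 1-D array (returns shape (m, J)).
    """
    j = np.arange(J)
    phi = np.sqrt(2.0) * np.cos(np.pi * j * np.asarray(x, dtype=float)[..., None])
    phi[..., 0] = 1.0  # constant function for j = 0
    return phi


def sieve_sgd(xs, ys, s=2.0, step_scale=0.5, omega=0.51):
    """One pass of SGD over a growing sieve; returns averaged coefficients.

    Assumed (illustrative) schedules: sieve size and step size both scale
    with i ** (1 / (2 s + 1)), where s plays the role of a smoothness level.
    """
    n = len(xs)
    rate = 1.0 / (2.0 * s + 1.0)
    J_max = int(n ** rate) + 1
    # Down-weight higher-frequency components (assumed weighting).
    weights = np.arange(1, J_max + 1, dtype=float) ** (-2.0 * omega)
    beta = np.zeros(J_max)      # current SGD iterate
    beta_avg = np.zeros(J_max)  # running average of the iterates
    for i, (x, y) in enumerate(zip(xs, ys), start=1):
        J = int(i ** rate) + 1  # number of active basis functions at step i
        phi = cosine_basis(x, J)
        residual = y - beta[:J] @ phi
        gamma = step_scale * i ** (-rate)
        beta[:J] += gamma * residual * weights[:J] * phi
        beta_avg += (beta - beta_avg) / i
    return beta_avg


def predict(beta, x):
    """Evaluate the fitted series expansion at x (scalar or 1-D array)."""
    return cosine_basis(x, len(beta)) @ beta
```

Each observation is used once and then discarded, so the memory footprint of this sketch is the coefficient vector alone, which is the kind of saving the abstract emphasizes.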

Online nonparametric regression with Sobolev kernels

Oleksandr Zadorozhnyi, Pierre Gaillard, Sebastien Gerschinovitz (2021)

In this work we investigate a variation of the online kernelized ridge regression algorithm in the setting of $d$-dimensional adversarial nonparametric regression. We derive regret upper bounds on the classes of Sobolev spaces $W_{p}^{\beta}(\mathcal{X})$, $p \geq 2$, $\beta > \frac{d}{p}$. The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $\beta > \frac{d}{2}$ or $p = \infty$ these rates are (essentially) optimal. Finally, we compare the performance of the kernelized ridge regression forecaster to the known nonparametric forecasters in terms of the regret rates and their computational complexity, as well as to the excess risk rates in the setting of statistical (i.i.d.) nonparametric regression.

Statistics Theory, Machine Learning

Nonparametric Bayesian volatility estimation for gamma-driven stochastic differential equations

Denis Belomestny, Shota Gugushvili, Moritz Schauer (2020)

We study a nonparametric Bayesian approach to estimation of the volatility function of a stochastic differential equation driven by a gamma process. The volatility function is modelled a priori as piecewise constant, and we specify a gamma prior on its values. This leads to a straightforward procedure for posterior inference via an MCMC procedure. We give theoretical performance guarantees (contraction rates for the posterior) for the Bayesian estimate in terms of the regularity of the unknown volatility function. We illustrate the method on synthetic and real data examples.

Statistics Theory, Methodology

A Complete Recipe for Stochastic Gradient MCMC

Yi-An Ma, Tianqi Chen, Emily B. Fox (2015)

Many recent Markov chain Monte Carlo (MCMC) samplers leverage continuous dynamics to define a transition kernel that efficiently explores a target distribution. In tandem, a focus has been on devising scalable variants that subsample the data and use stochastic gradients in place of full-data gradients in the dynamic simulations. However, such stochastic gradient MCMC samplers have lagged behind their full-data counterparts in terms of the complexity of dynamics considered since proving convergence in the presence of the stochastic gradient noise is non-trivial. Even with simple dynamics, significant physical intuition is often required to modify the dynamical system to account for the stochastic gradient noise. In this paper, we provide a general recipe for constructing MCMC samplers--including stochastic gradie

Statistics Theory, Methodology, Machine Learning

A provable two-stage algorithm for penalized hazards regression

Jianqing Fan, Wenyan Gong, Qiang Sun (2021)

From an optimizer's perspective, achieving the global optimum for a general nonconvex problem is often provably NP-hard under the classical worst-case analysis. In the case of Cox's proportional hazards model, by taking its statistical model structures into account, we identify local strong convexity near the global optimum. Motivated by this, we propose to use two convex programs to optimize the folded-concave penalized Cox's proportional hazards regression. Theoretically, we investigate the statistical and computational tradeoffs of the proposed algorithm and establish the strong oracle property of the resulting estimators. Numerical studies and real data analysis lend further support to our algorithm and theory.

Statistics Theory, Methodology

An Online Projection Estimator for Nonparametric Regression in Reproducing Kernel Hilbert Spaces

Tianyu Zhang, Noah Simon (2021)

The goal of nonparametric regression is to recover an underlying regression function from noisy observations, under the assumption that the regression function belongs to a pre-specified infinite-dimensional function space. In the online setting, when the observations come in a stream, it is generally computationally infeasible to refit the whole model repeatedly. As yet, there are no methods that are both computationally efficient and statistically rate-optimal. In this paper, we propose an estimator for online nonparametric regression. Notably, our estimator is an empirical risk minimizer (ERM) in a deterministic linear space, which is quite different from existing methods that use random features and functional stochastic gradient. Our theoretical analysis shows that this estimator obtains rate-optimal generalization error when the regression function is known to live in a reproducing kernel Hilbert space. We also show, theoretically and empirically, that the computational expense of our estimator is much lower than that of other rate-optimal estimators proposed for this online setting.
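To make "an ERM in a deterministic linear space" concrete, here is a naive Python sketch of least squares over the span of the first J basis functions, with J growing as data arrive. It is only an illustration under stated assumptions: the cosine basis, the dimension schedule `n ** (1 / (2 s + 1))`, the cap `J_cap`, and the small ridge term are all hypothetical choices, and the class `OnlineProjectionEstimator` is not the authors' algorithm or their efficient update rule, neither of which the abstract describes.

```python
import numpy as np


def cosine_basis(x, J):
    """First J cosine basis functions on [0, 1] (assumed basis)."""
    j = np.arange(J)
    phi = np.sqrt(2.0) * np.cos(np.pi * j * np.asarray(x, dtype=float)[..., None])
    phi[..., 0] = 1.0
    return phi


class OnlineProjectionEstimator:
    """Least squares in a growing linear span, kept via sufficient statistics."""

    def __init__(self, s=2.0, J_cap=50, ridge=1e-8):
        self.rate = 1.0 / (2.0 * s + 1.0)  # assumed dimension-growth exponent
        self.J_cap = J_cap
        self.ridge = ridge                 # tiny ridge for numerical stability
        self.G = np.zeros((J_cap, J_cap))  # running sum of phi phi^T
        self.b = np.zeros(J_cap)           # running sum of y * phi
        self.n = 0

    def update(self, x, y):
        """Fold one observation into the sufficient statistics."""
        phi = cosine_basis(x, self.J_cap)
        self.G += np.outer(phi, phi)
        self.b += y * phi
        self.n += 1

    def coefficients(self):
        """Solve the least-squares problem on the first J basis functions."""
        J = min(self.J_cap, int(self.n ** self.rate) + 1)
        A = self.G[:J, :J] + self.ridge * np.eye(J)
        return np.linalg.solve(A, self.b[:J])

    def predict(self, x):
        beta = self.coefficients()
        return cosine_basis(x, len(beta)) @ beta
```

Because the leading J-by-J block of `G` is exactly the Gram matrix of the first J basis functions over all data seen so far, enlarging J never requires revisiting old observations. This naive version pays for that with `J_cap ** 2` memory and a fresh linear solve per query.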

Methodology
