Online nonparametric regression with Sobolev kernels

78 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Oleksandr Zadorozhnyi

تاريخ النشر 2021

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Oleksandr Zadorozhnyi - Pierre Gaillard - Sebastien Gerschinovitz

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

In this work we investigate the variation of the online kernelized ridge regression algorithm in the setting of $d-$dimensional adversarial nonparametric regression. We derive the regret upper bounds on the classes of Sobolev spaces $W_{p}^{beta}(mathcal{X})$, $pgeq 2, beta>frac{d}{p}$. The upper bounds are supported by the minimax regret analysis, which reveals that in the cases $beta> frac{d}{2}$ or $p=infty$ these rates are (essentially) optimal. Finally, we compare the performance of the kernelized ridge regression forecaster to the known non-parametric forecasters in terms of the regret rates and their computational complexity as well as to the excess risk rates in the setting of statistical (i.i.d.) nonparametric regression.

قيم البحث

118 - Tianyu Zhang , Noah Simon 2021

The goal of regression is to recover an unknown underlying function that best links a set of predictors to an outcome from noisy observations. In non-parametric regression, one assumes that the regression function belongs to a pre-specified infinite dimensional function space (the hypothesis space). In the online setting, when the observations come in a stream, it is computationally-preferable to iteratively update an estimate rather than refitting an entire model repeatedly. Inspired by nonparametric sieve estimation and stochastic approximation methods, we propose a sieve stochastic gradient descent estimator (Sieve-SGD) when the hypothesis space is a Sobolev ellipsoid. We show that Sieve-SGD has rate-optimal MSE under a set of simple and direct conditions. We also show that the Sieve-SGD estimator can be constructed with low time expense, and requires almost minimal memory usage among all statistically rate-optimal estimators, under some conditions on the distribution of the predictors.

نظرية الإحصاء المنهجية نظرية الإحصاء

Strongly universally consistent nonparametric regression and classification with privatised data

120 - Thomas Berrett , Laszlo Gyorfi , Harro Walk 2020

In this paper we revisit the classical problem of nonparametric regression, but impose local differential privacy constraints. Under such constraints, the raw data $(X_1,Y_1),ldots,(X_n,Y_n)$, taking values in $mathbb{R}^d times mathbb{R}$, cannot be directly observed, and all estimators are functions of the randomised output from a suitable privacy mechanism. The statistician is free to choose the form of the privacy mechanism, and here we add Laplace distributed noise to a discretisation of the location of a feature vector $X_i$ and to the value of its response variable $Y_i$. Based on this randomised data, we design a novel estimator of the regression function, which can be viewed as a privatised version of the well-studied partitioning regression estimator. The main result is that the estimator is strongly universally consistent. Our methods and analysis also give rise to a strongly universally consistent binary classification rule for locally differentially private data.

نظرية الإحصاء المنهجية التعلم الالي

Robust Nonparametric Regression with Deep Neural Networks

133 - Guohao Shen , Yuling Jiao , Yuanyuan Lin 2021

In this paper, we study the properties of robust nonparametric estimation using deep neural networks for regression models with heavy tailed error distributions. We establish the non-asymptotic error bounds for a class of robust nonparametric regress ion estimators using deep neural networks with ReLU activation under suitable smoothness conditions on the regression function and mild conditions on the error term. In particular, we only assume that the error distribution has a finite p-th moment with p greater than one. We also show that the deep robust regression estimators are able to circumvent the curse of dimensionality when the distribution of the predictor is supported on an approximate lower-dimensional set. An important feature of our error bound is that, for ReLU neural networks with network width and network size (number of parameters) no more than the order of the square of the dimensionality d of the predictor, our excess risk bounds depend sub-linearly on d. Our assumption relaxes the exact manifold support assumption, which could be restrictive and unrealistic in practice. We also relax several crucial assumptions on the data distribution, the target regression function and the neural networks required in the recent literature. Our simulation studies demonstrate that the robust methods can significantly outperform the least squares method when the errors have heavy-tailed distributions and illustrate that the choice of loss function is important in the context of deep nonparametric regression.

نظرية الإحصاء نظرية الإحصاء

Online Asynchronous Distributed Regression

149 - Gerard Biau (LSTA , LPMA , DMA 2014

Distributed computing offers a high degree of flexibility to accommodate modern learning constraints and the ever increasing size of datasets involved in massive data issues. Drawing inspiration from the theory of distributed computation models devel oped in the context of gradient-type optimization algorithms, we present a consensus-based asynchronous distributed approach for nonparametric online regression and analyze some of its asymptotic properties. Substantial numerical evidence involving up to 28 parallel processors is provided on synthetic datasets to assess the excellent performance of our method, both in terms of computation time and prediction accuracy.

نظرية الإحصاء التعلم الالي نظرية الإحصاء

Nonparametric risk bounds for time-series forecasting

137 - Daniel J. McDonald , Cosma Rohilla Shalizi , Mark Schervish 2012

We derive generalization error bounds for traditional time-series forecasting models. Our results hold for many standard forecasting tools including autoregressive models, moving average models, and, more generally, linear state-space models. These n on-asymptotic bounds need only weak assumptions on the data-generating process, yet allow forecasters to select among competing models and to guarantee, with high probability, that their chosen model will perform well. We motivate our techniques with and apply them to standard economic and financial forecasting tools---a GARCH model for predicting equity volatility and a dynamic stochastic general equilibrium model (DSGE), the standard tool in macroeconomic forecasting. We demonstrate in particular how our techniques can aid forecasters and policy makers in choosing models which behave well under uncertainty and mis-specification.

نظرية الإحصاء التعلم الآلي التعلم الالي