Convergence rates of least squares regression estimators with heavy-tailed errors

Published by: Qiyang Han

Publication date: 2017

Research field: Mathematical statistics

Language: English

Authors: Qiyang Han, Jon A. Wellner

Statistics Theory

Abstract

We study the performance of the Least Squares Estimator (LSE) in a general nonparametric regression model, when the errors are independent of the covariates but may only have a $p$-th moment ($p \geq 1$). In such a heavy-tailed regression setting, we show that if the model satisfies a standard `entropy condition' with exponent $\alpha \in (0,2)$, then the $L_2$ loss of the LSE converges at a rate \begin{align} \mathcal{O}_{\mathbf{P}}\big(n^{-\frac{1}{2+\alpha}} \vee n^{-\frac{1}{2}+\frac{1}{2p}}\big). \end{align} Such a rate cannot be improved under the entropy condition alone. This rate quantifies both some positive and negative aspects of the LSE in a heavy-tailed regression setting. On the positive side, as long as the errors have $p \geq 1+2/\alpha$ moments, the $L_2$ loss of the LSE converges at the same rate as if the errors are Gaussian. On the negative side, if $p < 1+2/\alpha$, there are (many) hard models at any entropy level $\alpha$ for which the $L_2$ loss of the LSE converges at a strictly slower rate than other robust estimators. The validity of the above rate relies crucially on the independence of the covariates and the errors. In fact, the $L_2$ loss of the LSE can converge arbitrarily slowly when the independence fails. The key technical ingredient is a new multiplier inequality that gives sharp bounds for the `multiplier empirical process' associated with the LSE. We further give an application to the sparse linear regression model with heavy-tailed covariates and errors to demonstrate the scope of this new inequality.
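The interplay between the two terms in the rate above can be checked numerically. The following is a minimal sketch, not taken from the paper: the function names `lse_rate_exponent` and `moment_threshold` are hypothetical, and the code only evaluates the exponents stated in the abstract, ignoring constants and any further conditions of the theorem. It returns the exponent $r$ such that the stated rate is $n^{-r}$, and the moment threshold $1+2/\alpha$ at which the two terms coincide.

```python
def lse_rate_exponent(alpha: float, p: float) -> float:
    """Exponent r such that the rate stated in the abstract equals n**(-r).

    The rate is n^{-1/(2+alpha)} v n^{-1/2 + 1/(2p)}; the larger (slower)
    term dominates, which corresponds to the smaller exponent.
    Illustrative helper only; the name is not from the paper.
    """
    if not 0.0 < alpha < 2.0:
        raise ValueError("entropy exponent alpha must lie in (0, 2)")
    if p < 1.0:
        raise ValueError("moment index p must satisfy p >= 1")
    entropy_term = 1.0 / (2.0 + alpha)   # exponent of the entropy-driven term
    moment_term = 0.5 - 1.0 / (2.0 * p)  # exponent of the heavy-tail term
    return min(entropy_term, moment_term)


def moment_threshold(alpha: float) -> float:
    """Smallest p for which the entropy term dominates: p = 1 + 2/alpha."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("entropy exponent alpha must lie in (0, 2)")
    return 1.0 + 2.0 / alpha


if __name__ == "__main__":
    alpha = 1.0
    print("threshold p:", moment_threshold(alpha))
    for p in (1.0, 2.0, 3.0, 10.0):
        print(f"p = {p:>4}: rate exponent = {lse_rate_exponent(alpha, p):.4f}")
```

For $\alpha = 1$ the threshold is $p = 3$: with fewer moments the heavy-tail term sets the rate, and from $p = 3$ onward the exponent stays at $1/(2+\alpha)$, matching the abstract's statement that the Gaussian-error rate is attained once $p \geq 1+2/\alpha$.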

Fritjof Freise, Norbert Gaffke, Rainer Schwabe (2019)

The paper continues the authors' work on the adaptive Wynn algorithm in a nonlinear regression model. In the present paper it is shown that if the mean response function satisfies a condition of `saturated identifiability', which was introduced by Pronzato \cite{Pronzato}, then the adaptive least squares estimators are strongly consistent. The condition states that the regression parameter is identifiable under any saturated design, i.e., the values of the mean response function at any $p$ distinct design points determine the parameter point uniquely where, typically, $p$ is the dimension of the regression parameter vector. Further essential assumptions are compactness of the experimental region and of the parameter space, together with some natural continuity assumptions. If the true parameter point is an interior point of the parameter space, then under some smoothness assumptions and asymptotic homoscedasticity of the random errors, the asymptotic normality of the adaptive least squares estimators is obtained.

Statistics Theory

Asymptotic oracle properties of SCAD-penalized least squares estimators

Jian Huang, Huiliang Xie (2007)

We study the asymptotic properties of the SCAD-penalized least squares estimator in sparse, high-dimensional, linear regression models when the number of covariates may increase with the sample size. We are particularly interested in the use of this estimator for simultaneous variable selection and estimation. We show that under appropriate conditions, the SCAD-penalized least squares estimator is consistent for variable selection and that the estimators of nonzero coefficients have the same asymptotic distribution as they would have if the zero coefficients were known in advance. Simulation studies indicate that this estimator performs well in terms of variable selection and estimation.

Statistics Theory

On the Asymptotic Optimality of Cross-Validation based Hyper-parameter Estimators for Regularized Least Squares Regression Problems

Biqiang Mu, Tianshi Chen, Lennart Ljung (2021)

The asymptotic optimality (a.o.) of various hyper-parameter estimators with different optimality criteria has been studied in the literature for regularized least squares regression problems. The estimators include, e.g., the maximum (marginal) likelihood method, $C_p$ statistics, and the generalized cross validation method, and the optimality criteria are based on, e.g., the inefficiency, the expectation inefficiency and the risk. In this paper, we consider regularized least squares regression problems with a fixed number of regression parameters, choose the optimality criterion based on the risk, and study the a.o. of several cross validation (CV) based hyper-parameter estimators, including the leave-$k$-out CV method, the generalized CV method, the $r$-fold CV method and the hold-out CV method. We find that the former three methods can be a.o. under mild assumptions, but not the last one, and we use Monte Carlo simulations to illustrate the efficacy of our findings.

Statistics Theory

Asymptotic distribution of least squares estimators for linear models with dependent errors: regular designs

Emmanuel Caron, Sophie Dede (2017)

In this paper, we consider the usual linear regression model in the case where the error process is assumed strictly stationary. We use a result from Hannan, who proved a Central Limit Theorem for the usual least squares estimator under general conditions on the design and on the error process. We show that for a large class of designs, the asymptotic covariance matrix is as simple as in the independent and identically distributed case. We then estimate the covariance matrix using an estimator of the spectral density whose consistency is proved under very mild conditions.

Statistics Theory, Probability, Statistics Applications

Connecting model-based and model-free approaches to linear least squares regression

Lutz Duembgen, Laurie Davies (2018)

In a regression setting with response vector $\mathbf{y} \in \mathbb{R}^n$ and given regressor vectors $\mathbf{x}_1,\ldots,\mathbf{x}_p \in \mathbb{R}^n$, a typical question is to what extent $\mathbf{y}$ is related to these regressor vectors, specifically, how well $\mathbf{y}$ can be approximated by a linear combination of them. Classical methods for this question are based on statistical models for the conditional distribution of $\mathbf{y}$, given the regressor vectors $\mathbf{x}_j$. Davies and Duembgen (2020) proposed a model-free approach in which all observation vectors $\mathbf{y}$ and $\mathbf{x}_j$ are viewed as fixed, and the quality of the least squares fit of $\mathbf{y}$ is quantified by comparing it with the least squares fit resulting from $p$ independent white noise regressor vectors. The purpose of the present note is to explain in a general context why the model-based and model-free approaches yield the same p-values, although the interpretation of the latter is different under the two paradigms.

Statistics Theory
