Connecting model-based and model-free approaches to linear least squares regression

Added by: Lutz Duembgen

Publication date: 2018

Field: Mathematical Statistics

Language: English

Authors: Lutz Duembgen, Laurie Davies

Statistics Theory

In a regression setting with response vector $\mathbf{y} \in \mathbb{R}^n$ and given regressor vectors $\mathbf{x}_1,\ldots,\mathbf{x}_p \in \mathbb{R}^n$, a typical question is to what extent $\mathbf{y}$ is related to these regressor vectors, specifically, how well $\mathbf{y}$ can be approximated by a linear combination of them. Classical methods for this question are based on statistical models for the conditional distribution of $\mathbf{y}$, given the regressor vectors $\mathbf{x}_j$. Davies and Duembgen (2020) proposed a model-free approach in which all observation vectors $\mathbf{y}$ and $\mathbf{x}_j$ are viewed as fixed, and the quality of the least squares fit of $\mathbf{y}$ is quantified by comparing it with the least squares fit resulting from $p$ independent white noise regressor vectors. The purpose of the present note is to explain in a general context why the model-based and model-free approaches yield the same p-values, although the interpretation of the latter is different under the two paradigms.
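The agreement of the two p-values described in the abstract can be illustrated numerically. The following is a minimal sketch, not the authors' implementation, under several assumptions that are not in the abstract: there is no intercept, the null fit is the zero vector, and $p = 2$, so that the classical F-test tail probability has the closed form $(\mathrm{RSS}/\|\mathbf{y}\|^2)^{(n-2)/2}$. The model-free p-value is approximated by Monte Carlo: the two observed regressors are replaced by Gaussian white noise vectors and the resulting residual sums of squares are compared with the observed one. All function names and the simulated data are hypothetical.

```python
import math
import random


def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def rss(y, regressors):
    """Residual sum of squares of the least squares fit of y on the
    regressors (no intercept), via modified Gram-Schmidt."""
    basis = []
    for x in regressors:
        r = list(x)
        for q in basis:
            c = dot(r, q)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        norm = math.sqrt(dot(r, r))
        if norm > 1e-12:  # skip (numerically) linearly dependent regressors
            basis.append([ri / norm for ri in r])
    resid = list(y)
    for q in basis:
        c = dot(resid, q)
        resid = [ri - c * qi for ri, qi in zip(resid, q)]
    return dot(resid, resid)


def model_based_p_value(y, x1, x2):
    """F-test p-value for 'both coefficients are zero' with Gaussian errors.
    Assumption: p = 2 and no intercept, so the F(2, n-2) tail probability
    reduces to (RSS / ||y||^2) ** ((n - 2) / 2)."""
    m = len(y) - 2
    return (rss(y, [x1, x2]) / dot(y, y)) ** (m / 2)


def model_free_p_value(y, x1, x2, n_sim=4000, seed=1):
    """Monte Carlo estimate: y is held fixed, and the fraction of white noise
    regressor pairs that fit y at least as well as (x1, x2) is returned."""
    rng = random.Random(seed)
    n = len(y)
    observed = rss(y, [x1, x2])
    hits = 0
    for _ in range(n_sim):
        z1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        hits += rss(y, [z1, z2]) <= observed
    return hits / n_sim


# Hypothetical data: y depends weakly on x1 and not at all on x2.
rng = random.Random(0)
n = 20
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.3 * a + rng.gauss(0.0, 1.0) for a in x1]

p_model_based = model_based_p_value(y, x1, x2)
p_model_free = model_free_p_value(y, x1, x2)
print(p_model_based, p_model_free)
```

The two printed numbers should agree up to Monte Carlo error. In the first function the randomness is attributed to $\mathbf{y}$ with the regressors fixed, whereas in the second $\mathbf{y}$ is fixed and only the artificial regressors are random, which mirrors the difference in interpretation mentioned in the abstract.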

Linear regression under model uncertainty

Shuzhen Yang, Jianfeng Yao (2021)

We reexamine the classical linear regression model when the model is subject to two types of uncertainty: (i) some of the covariates are either missing or completely inaccessible, and (ii) the variance of the measurement error is undetermined and changes according to a mechanism unknown to the statistician. Following the recent theory of sublinear expectation, we propose to characterize such mean and variance uncertainty in the response variable by two specific nonlinear random variables, which encompass an infinite family of probability distributions for the response variable in the sense of (linear) classical probability theory. The approach enables a family of estimators under various loss functions for the regression parameter and the parameters related to model uncertainty. The consistency of the estimators is established under mild conditions on the data generation process. Three applications are introduced to assess the quality of the approach, including a forecasting model for the S&P Index.

Statistics Theory

Convergence rates of least squares regression estimators with heavy-tailed errors

Qiyang Han, Jon A. Wellner (2017)

We study the performance of the Least Squares Estimator (LSE) in a general nonparametric regression model, when the errors are independent of the covariates but may only have a $p$-th moment ($p \geq 1$). In such a heavy-tailed regression setting, we show that if the model satisfies a standard 'entropy condition' with exponent $\alpha \in (0,2)$, then the $L_2$ loss of the LSE converges at a rate \begin{align*} \mathcal{O}_{\mathbf{P}}\big(n^{-\frac{1}{2+\alpha}} \vee n^{-\frac{1}{2}+\frac{1}{2p}}\big). \end{align*} Such a rate cannot be improved under the entropy condition alone. This rate quantifies both some positive and negative aspects of the LSE in a heavy-tailed regression setting. On the positive side, as long as the errors have $p \geq 1+2/\alpha$ moments, the $L_2$ loss of the LSE converges at the same rate as if the errors are Gaussian. On the negative side, if $p < 1+2/\alpha$, there are (many) hard models at any entropy level $\alpha$ for which the $L_2$ loss of the LSE converges at a strictly slower rate than other robust estimators. The validity of the above rate relies crucially on the independence of the covariates and the errors. In fact, the $L_2$ loss of the LSE can converge arbitrarily slowly when the independence fails. The key technical ingredient is a new multiplier inequality that gives sharp bounds for the 'multiplier empirical process' associated with the LSE. We further give an application to the sparse linear regression model with heavy-tailed covariates and errors to demonstrate the scope of this new inequality.
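The moment threshold $1+2/\alpha$ quoted in this abstract follows from comparing the two exponents in the stated rate. The short derivation below is a reader's check, not taken from the paper; the numerical choice $\alpha = 1$ is purely illustrative.

```latex
\begin{align*}
n^{-\frac{1}{2+\alpha}} \ge n^{-\frac{1}{2}+\frac{1}{2p}}
  &\iff \frac{1}{2+\alpha} \le \frac{1}{2} - \frac{1}{2p} \\
  &\iff \frac{1}{2p} \le \frac{1}{2} - \frac{1}{2+\alpha} = \frac{\alpha}{2(2+\alpha)} \\
  &\iff p \ge \frac{2+\alpha}{\alpha} = 1 + \frac{2}{\alpha}.
\end{align*}
% Illustration with \alpha = 1: the threshold is p = 3.
% For p \ge 3 the rate is n^{-1/3}; for p = 2 it is n^{-1/2 + 1/4} = n^{-1/4}.
```

So the first term dominates the maximum exactly when $p \geq 1+2/\alpha$, and the heavy-tail term $n^{-\frac{1}{2}+\frac{1}{2p}}$ takes over below that threshold.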

Statistics Theory

On the Asymptotic Optimality of Cross-Validation based Hyper-parameter Estimators for Regularized Least Squares Regression Problems

Biqiang Mu, Tianshi Chen, Lennart Ljung (2021)

The asymptotic optimality (a.o.) of various hyper-parameter estimators with different optimality criteria has been studied in the literature for regularized least squares regression problems. The estimators include, e.g., the maximum (marginal) likelihood method, $C_p$ statistics, and the generalized cross validation method, and the optimality criteria are based on, e.g., the inefficiency, the expectation inefficiency and the risk. In this paper, we consider regularized least squares regression problems with a fixed number of regression parameters, choose the optimality criterion based on the risk, and study the a.o. of several cross validation (CV) based hyper-parameter estimators, including the leave-$k$-out CV method, the generalized CV method, the $r$-fold CV method and the hold-out CV method. We find that the former three methods can be a.o. under mild assumptions, but not the last one, and we use Monte Carlo simulations to illustrate the efficacy of our findings.

Statistics Theory

Least Squares Estimator for Vasicek Model Driven by Sub-fractional Brownian Processes from Discrete Observations

Cuiyun Zhang, Jingjun Guo, Aiqin Ma (2020)

We study the parameter estimation problem for the Vasicek model driven by sub-fractional Brownian processes from discrete observations, where $\{S_t^H, t \geq 0\}$ denotes a sub-fractional Brownian motion with Hurst parameter $1/2 < H < 1$. The study proceeds as follows. First, the two unknown parameters in the model are estimated by the least squares method. Second, the strong consistency and the asymptotic distribution of the estimators are studied. Finally, the estimators are validated by numerical simulation.

Statistics Theory

Refined Least Squares for Support Recovery

Ofir Lindenbaum, Stefan Steinerberger (2021)

We study the problem of exact support recovery based on noisy observations and present Refined Least Squares (RLS). Given a set of noisy measurements $$ \mathbf{y} = \mathbf{X}\boldsymbol{\theta}^* + \boldsymbol{\omega},$$ where $\mathbf{X} \in \mathbb{R}^{N \times D}$ is a (known) Gaussian matrix and $\boldsymbol{\omega} \in \mathbb{R}^N$ is an (unknown) Gaussian noise vector, our goal is to recover the support of the (unknown) sparse vector $\boldsymbol{\theta}^* \in \left\{-1,0,1\right\}^D$. To recover the support of $\boldsymbol{\theta}^*$ we use an average of multiple least squares solutions, each computed based on a subset of the full set of equations. The support is estimated by identifying the most significant coefficients of the average least squares solution. We demonstrate that in a wide variety of settings our method outperforms state-of-the-art support recovery algorithms.

Statistics Theory
