We consider the problem of choosing between several models in least-squares regression with heteroscedastic data. We prove that any penalization procedure is suboptimal when the penalty is a function of the dimension of the model, at least for some typical heteroscedastic model selection problems. In particular, Mallows' $C_p$ is suboptimal in this framework. By contrast, optimal model selection is possible with data-driven penalties such as resampling or $V$-fold penalties. It is therefore worth estimating the shape of the penalty from the data, even at the price of a higher computational cost. Simulation experiments illustrate the existence of a trade-off between statistical accuracy and computational complexity. We conclude by sketching some rules for choosing a penalty in least-squares regression, depending on what is known about possible variations of the noise level.
We study a dimensionality reduction technique for finite mixtures of high-dimensional multivariate response regression models. Both the dimension of the response and the number of predictors are allowed to exceed the sample size. We consider predicto
This paper discusses asymptotic distributions of various estimators of the underlying parameters in some regression models with long memory (LM) Gaussian design and nonparametric heteroscedastic LM moving average errors. In the simple linear regressi
The dual problem of testing the predictive significance of a particular covariate and identifying the set of relevant covariates is common in applied research and methodological investigations. To study this problem in the context of functiona
We study the problem of high-dimensional variable selection via some two-step procedures. First we show that given some good initial estimator which is $\ell_{\infty}$-consistent but not necessarily variable selection consistent, we can apply the nonne
We consider a $l_1$-penalization procedure in the non-parametric Gaussian regression model. In many concrete examples, the dimension $d$ of the input variable $X$ is very large (sometimes depending on the number of observations). Estimation of a $bet