We consider a linear regression model in which the parameter of interest is a specified linear combination of the components of the regression parameter vector. We suppose that, as a first step, a data-based model selection procedure (e.g. preliminary hypothesis tests or minimizing AIC) is used to select a model. It is common statistical practice to then construct a confidence interval for the parameter of interest based on the assumption that the selected model had been given to us a priori. This assumption is false and it can lead to a confidence interval with poor coverage properties. We provide an easily computed finite sample upper bound (calculated by repeated numerical evaluation of a double integral) on the minimum coverage probability of this confidence interval. This bound applies to model selection by any of the following methods: minimum AIC, minimum BIC, maximum adjusted R-squared, minimum Mallows' Cp and t-tests. The importance of this upper bound is that it delineates general categories of design matrices and model selection procedures for which this confidence interval has poor coverage properties. This upper bound is shown to be a finite sample analogue of an earlier large sample upper bound due to Kabaila and Leeb.
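The coverage problem described above can be illustrated with a small Monte Carlo sketch. This is not the upper bound or the double-integral computation from the paper; it is a simplified illustration under assumptions that are ours: two correlated regressors, a known error variance of 1 (so normal rather than t quantiles are used), and model selection by a single preliminary two-sided test on the second coefficient. The function name `naive_coverage` and all of its parameters are hypothetical.

```python
import numpy as np

def naive_coverage(tau, rho=0.8, n=50, reps=4000, seed=0):
    """Estimate the coverage of a nominal 95% interval for theta that is
    built as if the selected model had been fixed in advance.

    Assumed model (illustration only): y = theta*x1 + tau*x2 + eps,
    eps ~ N(0, 1) with known variance.
    """
    rng = np.random.default_rng(seed)
    z = 1.959963984540054  # 0.975 quantile of the standard normal
    theta = 1.0

    # Fixed design with correlation rho between the two regressors.
    cov = np.array([[1.0, rho], [rho, 1.0]])
    X = rng.multivariate_normal(np.zeros(2), cov, size=n)
    x1 = X[:, 0]
    XtX_inv = np.linalg.inv(X.T @ X)
    se_full = np.sqrt(np.diag(XtX_inv))  # standard errors in the full model
    se_red = 1.0 / np.sqrt(x1 @ x1)      # standard error in the reduced model
    beta = np.array([theta, tau])

    hits = 0
    for _ in range(reps):
        y = X @ beta + rng.standard_normal(n)
        b_full = XtX_inv @ (X.T @ y)
        # Preliminary test of tau = 0 decides which model is used.
        if abs(b_full[1] / se_full[1]) > z:
            est, se = b_full[0], se_full[0]      # keep x2: full model
        else:
            est, se = (x1 @ y) / (x1 @ x1), se_red  # drop x2: reduced model
        # Naive interval: est +/- z*se, ignoring the selection step.
        hits += int(abs(est - theta) <= z * se)
    return hits / reps
```

Calling `naive_coverage(0.0)` and then `naive_coverage(0.4)` compares a case where the reduced model is correct with a case where the preliminary test often fails to detect a nonzero coefficient. In the second case the estimated coverage falls well below the nominal 95%.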
We compare the following two sources of poor coverage of post-model-selection confidence intervals: the preliminary data-based model selection sometimes chooses the wrong model and the data used to choose the model is re-used for the construction of the confidence interval.
We derive a computationally convenient formula for the large sample coverage probability of a confidence interval for a scalar parameter of interest following a preliminary hypothesis test that a specified vector parameter takes a given value in a ge
The asymptotic behaviour of the commonly used bootstrap percentile confidence interval is investigated when the parameters are subject to linear inequality constraints. We concentrate on the important one- and two-sample problems with data generated
Recently, Kabaila and Wijethunga assessed the performance of a confidence interval centred on a bootstrap smoothed estimator, with width proportional to an estimator of Efron's delta method approximation to the standard deviation of this estimator. Th
We consider regression in which one predicts a response $Y$ with a set of predictors $X$ across different experiments or environments. This is a common setup in many data-driven scientific fields and we argue that statistical inference can benefit fr