Bootstrap smoothed (bagged) estimators have been proposed as an improvement on estimators found after preliminary data-based model selection. Efron, 2014, derived a widely applicable formula for a delta method approximation to the standard deviation of the bootstrap smoothed estimator. He also considered a confidence interval centered on the bootstrap smoothed estimator, with width proportional to the estimate of this standard deviation. Kabaila and Wijethunga, 2019, assessed the performance of this confidence interval in the scenario of two nested linear regression models, the full model and a simpler model, for the case of known error variance and preliminary model selection using a hypothesis test. They found that the performance of this confidence interval was not substantially better than the usual confidence interval based on the full model, with the same minimum coverage. We extend this assessment to the case of unknown error variance by deriving a computationally convenient exact formula for the ideal (i.e. in the limit as the number of bootstrap replications diverges to infinity) delta method approximation to the standard deviation of the bootstrap smoothed estimator. Our results show that, unlike the known error variance case, there are circumstances in which this confidence interval has attractive properties.
Bootstrap smoothed (bagged) parameter estimators have been proposed as an improvement on estimators found after preliminary data-based model selection. The key result of Efron (2014) is a very convenient and widely applicable formula for a delta method approximation to the standard deviation of the bootstrap smoothed estimator. This approximation provides an easily computed guide to the accuracy of this estimator. In addition, Efron (2014) proposed a confidence interval centered on the bootstrap smoothed estimator, with width proportional to the estimate of this approximation to the standard deviation. We evaluate this confidence interval in the scenario of two nested linear regression models, the full model and a simpler model, and a preliminary test of the null hypothesis that the simpler model is correct. We derive computationally convenient expressions for the ideal bootstrap smoothed estimator and the coverage probability and expected length of this confidence interval. In terms of coverage probability, this confidence interval outperforms the post-model-selection confidence interval with the same nominal coverage and based on the same preliminary test. We also compare the performance of the confidence interval centered on the bootstrap smoothed estimator, in terms of expected length, to the usual confidence interval, with the same minimum coverage probability, based on the full model.
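As a rough illustration of the quantities named in this abstract, the following Python sketch computes a bootstrap smoothed estimate and the delta method approximation to its standard deviation given by Efron (2014), for a toy case only: simple linear regression, with a preliminary t-test of zero slope choosing between the full model and the intercept-only model. It resamples (x, y) pairs nonparametrically and uses a finite number B of replications, so it is not the ideal estimator studied in the abstract. The function names, the target theta = E[y | x = x0], the critical value 1.96 and the interval multiplier z are all assumptions made for illustration.

```python
import math
import random


def fit_after_pretest(x, y, x0, t_crit=1.96):
    """Estimate theta = E[y | x = x0] after a preliminary t-test of slope = 0.

    t_crit is an illustrative (assumed) critical value.
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    if sxx == 0.0:
        # Degenerate resample: slope not identifiable, fall back to simpler model.
        return ybar
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0.0 or abs(slope / se) > t_crit:
        return intercept + slope * x0  # test rejects: use the full model
    return ybar  # test does not reject: use the simpler model


def bootstrap_smoothed(x, y, x0, B=500, seed=0):
    """Return (smoothed estimate, delta method sd) from B pairs-bootstrap samples."""
    rng = random.Random(seed)
    n = len(x)
    estimates, counts = [], []
    for _ in range(B):
        idx = [rng.randrange(n) for _ in range(n)]
        c = [0] * n
        for j in idx:
            c[j] += 1  # number of times observation j appears in this resample
        counts.append(c)
        estimates.append(
            fit_after_pretest([x[j] for j in idx], [y[j] for j in idx], x0)
        )
    smoothed = sum(estimates) / B
    # Efron (2014): sd is the root of the sum over j of cov_j squared, where
    # cov_j is the bootstrap covariance between the count for observation j
    # and the bootstrap estimate.
    sd_sq = 0.0
    for j in range(n):
        cbar = sum(c[j] for c in counts) / B
        cov_j = sum(
            (c[j] - cbar) * (t - smoothed) for c, t in zip(counts, estimates)
        ) / B
        sd_sq += cov_j ** 2
    return smoothed, math.sqrt(sd_sq)


def smoothed_interval(smoothed, sd, z=1.96):
    """Interval centered on the smoothed estimate, with width proportional to sd."""
    return smoothed - z * sd, smoothed + z * sd
```

A call such as `bootstrap_smoothed(x, y, x0=1.0, B=500)` followed by `smoothed_interval(...)` gives one realisation of the interval; assessing coverage and expected length as in the abstract would require repeating this over many simulated data sets or using the exact expressions the authors derive.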
Recently, Kabaila and Wijethunga assessed the performance of a confidence interval centred on a bootstrap smoothed estimator, with width proportional to an estimator of Efron's delta method approximation to the standard deviation of this estimator. They used a testbed situation consisting of two nested linear regression models, with error variance assumed known, and model selection using a preliminary hypothesis test. This assessment was in terms of coverage and scaled expected length, where the scaling is with respect to the expected length of the usual confidence interval with the same minimum coverage probability. They found that this confidence interval has scaled expected length that (a) has a maximum value that may be much greater than 1 and (b) is greater than a number slightly less than 1 when the simpler model is correct. We therefore ask the following question. For a confidence interval, centred on the bootstrap smoothed estimator, does there exist a formula for its data-based width such that, in this testbed situation, it has the desired minimum coverage and scaled expected length that (a) has a maximum value that is not too much larger than 1 and (b) is substantially less than 1 when the simpler model is correct? Using a recent decision-theoretic performance bound due to Kabaila and Kong, it is shown that the answer to this question is 'no' for a wide range of scenarios.
This study aims to evaluate, by bootstrap sampling, the power of the likelihood ratio test for changepoint detection, and proposes a hypothesis test based on bootstrapped confidence interval lengths. Assuming i.i.d. normally distributed errors, the sampling distribution of the changepoint is estimated using the bootstrap method. Furthermore, this study describes a method to estimate a data set with no changepoint, which is used to form the null sampling distribution. With the null sampling distribution and the distribution of the estimated changepoint, critical values and power can be calculated in terms of the lengths of confidence intervals.
Introductory texts on statistics typically cover only the classical two-sigma confidence interval for the mean value and do not describe methods to obtain confidence intervals for other estimators. The present technical report fills this gap by first defining different methods for the construction of confidence intervals, and then applying them to a binomial proportion, the mean value, and arbitrary estimators. Besides the frequentist approach, the likelihood ratio and the highest posterior density approach are explained. Two methods to estimate the variance of general maximum likelihood estimators are described (Hessian, Jackknife), and for arbitrary estimators the bootstrap is suggested. For three examples, the different methods are evaluated by means of Monte Carlo simulations with respect to their coverage probability and interval length. R code is given for all methods, and the practitioner obtains a guideline on which method to use in which case.
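To make the bootstrap suggestion for arbitrary estimators concrete, here is a minimal Python sketch of one common variant, the percentile bootstrap interval. The abstract does not say which bootstrap interval the report uses, and the report itself provides R code, so the choice of the percentile method, the function name `bootstrap_percentile_ci` and its parameters are assumptions made purely for illustration.

```python
import math
import random
import statistics


def bootstrap_percentile_ci(data, estimator, level=0.95, B=2000, seed=0):
    """Percentile bootstrap confidence interval for an arbitrary estimator.

    `estimator` is any function mapping a list of observations to a number.
    """
    rng = random.Random(seed)
    n = len(data)
    # Recompute the estimator on B resamples drawn with replacement.
    reps = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)]) for _ in range(B)
    )
    alpha = 1.0 - level
    # Take empirical alpha/2 and 1 - alpha/2 quantiles of the replicates.
    lower = reps[int(math.floor((alpha / 2) * (B - 1)))]
    upper = reps[int(math.ceil((1 - alpha / 2) * (B - 1)))]
    return lower, upper


if __name__ == "__main__":
    sample = [float(v) for v in range(1, 26)]
    print(bootstrap_percentile_ci(sample, statistics.median))
```

The coverage probability of such an interval can then be checked by Monte Carlo simulation, in the spirit of the evaluation the abstract describes: simulate many data sets from a known distribution and record how often the interval contains the true value.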
Consider a linear regression model and suppose that our aim is to find a confidence interval for a specified linear combination of the regression parameters. In practice, it is common to perform a Durbin-Watson pretest of the null hypothesis of zero first-order autocorrelation of the random errors against the alternative hypothesis of positive first-order autocorrelation. If this null hypothesis is accepted then the confidence interval centred on the Ordinary Least Squares estimator is used; otherwise the confidence interval centred on the Feasible Generalized Least Squares estimator is used. We provide new tools for the computation, for any given design matrix and parameter of interest, of graphs of the coverage probability functions of the confidence interval resulting from this two-stage procedure and the confidence interval that is always centred on the Feasible Generalized Least Squares estimator. These graphs are used to choose the better confidence interval, prior to any examination of the observed response vector.
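The selection step of the two-stage procedure described in this abstract can be sketched as follows. This Python fragment only computes the Durbin-Watson statistic from a vector of Ordinary Least Squares residuals and reports which estimator the procedure would then use. The single threshold `d_crit` is a hypothetical, user-supplied critical value (tabulated Durbin-Watson bounds depend on the sample size and design matrix and include an inconclusive region, which this sketch ignores), and the function names are illustrative assumptions rather than the authors' tools.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences
    of the residuals divided by the sum of squared residuals."""
    num = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    den = sum(e * e for e in residuals)
    return num / den


def two_stage_choice(residuals, d_crit):
    """Return which estimator the interval is centred on after the pretest.

    Small values of the statistic point to positive first-order
    autocorrelation, so the null hypothesis is rejected when d < d_crit
    (d_crit is an assumed, user-supplied critical value).
    """
    d = durbin_watson(residuals)
    return "FGLS" if d < d_crit else "OLS"
```

For example, residuals that keep the same sign over long runs give a small statistic and lead to the Feasible Generalized Least Squares interval, whereas residuals that alternate in sign give a large statistic and lead to the Ordinary Least Squares interval.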