Two sources of poor coverage of confidence intervals after model selection

Published by: Paul Kabaila

Publication date: 2017

Research field: Mathematical statistics

Language: English

Authors: Paul Kabaila, Rheanna Mainzer

Abstract

We compare two sources of poor coverage of post-model-selection confidence intervals: (i) the preliminary data-based model selection sometimes chooses the wrong model, and (ii) the data used to choose the model are re-used to construct the confidence interval.
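The two sources can be separated in a toy simulation. The sketch below is not the authors' analysis. It assumes a canonical two-parameter setting with known error variance, in which the estimators of the parameter of interest theta and of a nuisance parameter tau are bivariate normal with unit variances and correlation rho, and the simpler model (tau = 0) is selected when |tau_hat| <= 1.96. The names `coverage`, `reuse`, `gamma` and `rho`, and the values rho = 0.9 and gamma in {0, 2}, are all illustrative assumptions. With `reuse=False` the selection uses an independent copy of tau_hat, which removes the data re-use but not the risk of choosing the wrong model.

```python
import math
import random

Z = 1.959964  # approximate 97.5% quantile of the standard normal


def coverage(gamma, rho, reuse, n_rep=40000, seed=1):
    """Monte Carlo coverage of the nominal 95% post-selection interval.

    gamma: true tau in standard-error units (hypothetical parameterisation).
    reuse: if True, the same tau_hat selects the model and enters the interval;
           if False, selection uses an independent draw with the same law.
    """
    rng = random.Random(seed)
    s = math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(n_rep):
        e_tau = rng.gauss(0, 1)
        tau_hat = gamma + e_tau
        # theta_hat - theta, with correlation rho to tau_hat
        err_full = rho * e_tau + s * rng.gauss(0, 1)
        tau_sel = tau_hat if reuse else gamma + rng.gauss(0, 1)
        if abs(tau_sel) <= Z:
            # simpler model: restricted estimator theta_hat - rho * tau_hat
            err, half = err_full - rho * tau_hat, Z * s
        else:
            # full model: usual interval theta_hat +/- Z
            err, half = err_full, Z
        hits += abs(err) <= half
    return hits / n_rep


for gamma in (0.0, 2.0):
    print(gamma,
          coverage(gamma, 0.9, reuse=True),
          coverage(gamma, 0.9, reuse=False))
```

In this toy setting, at gamma = 0 both candidate models are correct, so selection on independent data gives nominal coverage and any shortfall under re-use is due to re-use alone. At gamma = 2 the simpler model is wrong but is still selected about half the time, and coverage is poor with or without re-use.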

Paul Kabaila, Khageswor Giri (2007)

We consider a linear regression model, with the parameter of interest a specified linear combination of the regression parameter vector. We suppose that, as a first step, a data-based model selection (e.g. by preliminary hypothesis tests or minimizing AIC) is used to select a model. It is common statistical practice to then construct a confidence interval for the parameter of interest based on the assumption that the selected model had been given to us a priori. This assumption is false and it can lead to a confidence interval with poor coverage properties. We provide an easily-computed finite sample upper bound (calculated by repeated numerical evaluation of a double integral) to the minimum coverage probability of this confidence interval. This bound applies for model selection by any of the following methods: minimum AIC, minimum BIC, maximum adjusted R-squared, minimum Mallows' $C_p$ and t-tests. The importance of this upper bound is that it delineates general categories of design matrices and model selection procedures for which this confidence interval has poor coverage properties. This upper bound is shown to be a finite sample analogue of an earlier large sample upper bound due to Kabaila and Leeb.
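To make "minimum coverage probability" concrete, here is a minimal sketch that computes the coverage of the naive interval exactly, by one-dimensional numerical integration, in a much simpler setting than the paper's. It is not the upper bound described above. The assumptions are: two nested models, known error variance, estimators (theta_hat, tau_hat) that are bivariate normal with unit variances and correlation rho (|rho| < 1), and selection of the simpler model when |tau_hat| <= z. The function names, the gamma grid and rho = 0.9 are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()               # standard normal distribution
ALPHA = 0.05
Z = STD.inv_cdf(1 - ALPHA / 2)   # two-sided critical value


def naive_coverage(gamma, rho, n_grid=400):
    """Coverage of the naive post-selection interval in the toy model.

    gamma: true tau in standard-error units; rho: correlation, |rho| < 1.
    """
    s = sqrt(1.0 - rho * rho)
    # If the simpler model is selected, the restricted estimator
    # theta_hat - rho * tau_hat has bias -rho * gamma and sd s, and is
    # independent of tau_hat, so its coverage does not depend on tau_hat.
    cover_small = STD.cdf(Z + rho * gamma / s) - STD.cdf(-Z + rho * gamma / s)
    # coverage = (1 - ALPHA) + integral over |t| <= Z of
    #            pdf(t - gamma) * (cover_small - cover_full(t)) dt,
    # evaluated with the midpoint rule.
    h = 2.0 * Z / n_grid
    total = 0.0
    for i in range(n_grid):
        t = -Z + (i + 0.5) * h
        shift = rho * (t - gamma)   # E[theta_hat - theta | tau_hat = t]
        cover_full = STD.cdf((Z - shift) / s) - STD.cdf((-Z - shift) / s)
        total += STD.pdf(t - gamma) * (cover_small - cover_full) * h
    return (1.0 - ALPHA) + total


def min_coverage(rho, gamma_max=6.0, step=0.1):
    """Minimum coverage over a grid of gamma >= 0 (coverage is even in gamma)."""
    gammas = [i * step for i in range(int(round(gamma_max / step)) + 1)]
    return min((naive_coverage(g, rho), g) for g in gammas)


cov, g = min_coverage(0.9)
print(f"rho=0.9: minimum coverage {cov:.3f} at gamma={g:.1f}")
```

When rho = 0 the selection step is independent of theta_hat and the coverage equals the nominal level for every gamma. For rho close to 1 the minimum over gamma falls far below it, which is the kind of behaviour the bound above is meant to detect.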

Subjects: Statistics Theory, Applications

Confidence intervals centred on bootstrap smoothed estimators: an impossibility result

Paul Kabaila, Christeen Wijethunga (2019)

Recently, Kabaila and Wijethunga assessed the performance of a confidence interval centred on a bootstrap smoothed estimator, with width proportional to an estimator of Efron's delta method approximation to the standard deviation of this estimator. They used a testbed situation consisting of two nested linear regression models, with error variance assumed known, and model selection using a preliminary hypothesis test. This assessment was in terms of coverage and scaled expected length, where the scaling is with respect to the expected length of the usual confidence interval with the same minimum coverage probability. They found that this confidence interval has scaled expected length that (a) has a maximum value that may be much greater than 1 and (b) is greater than a number slightly less than 1 when the simpler model is correct. We therefore ask the following question. For a confidence interval, centred on the bootstrap smoothed estimator, does there exist a formula for its data-based width such that, in this testbed situation, it has the desired minimum coverage and scaled expected length that (a) has a maximum value that is not too much larger than 1 and (b) is substantially less than 1 when the simpler model is correct? Using a recent decision-theoretic performance bound due to Kabaila and Kong, it is shown that the answer to this question is 'no' for a wide range of scenarios.

Subjects: Statistics Theory, Methodology

Asymptotic coverage probabilities of bootstrap percentile confidence intervals for constrained parameters

Chunlin Wang, Paul Marriott, Pengfei Li (2017)

The asymptotic behaviour of the commonly used bootstrap percentile confidence interval is investigated when the parameters are subject to linear inequality constraints. We concentrate on the important one- and two-sample problems with data generated from general parametric distributions in the natural exponential family. The focus of this paper is on quantifying the coverage probabilities of the parametric bootstrap percentile confidence intervals, in particular their limiting behaviour near boundaries. We propose a local asymptotic framework to study this subtle coverage behaviour. Under this framework, we discover that when the true parameters are on, or close to, the restriction boundary, the asymptotic coverage probabilities can always exceed the nominal level in the one-sample case; however, they can be, remarkably, both under and over the nominal level in the two-sample case. Using illustrative examples, we show that the results provide theoretical justification and guidance on applying the bootstrap percentile method to constrained inference problems.
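The over-coverage at the boundary in the one-sample case can be seen in a very small example. This sketch is not the paper's local asymptotic framework. It assumes n observations from N(mu, 1) with the constraint mu >= 0, so the constrained MLE is max(xbar, 0). In this toy case the parametric bootstrap distribution of the constrained estimator is known exactly, so no resampling is needed. The function names and the values n = 25, mu = 0 and mu = 1 are illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def percentile_interval(xbar, n, alpha=0.05):
    """Exact parametric-bootstrap percentile interval under mu >= 0."""
    mu_hat = max(xbar, 0.0)              # constrained MLE
    z = STD.inv_cdf(1 - alpha / 2)
    # A bootstrap replicate is max(N(mu_hat, 1/n), 0). Since max(., 0) is
    # nondecreasing, its quantiles are the normal quantiles clipped at 0.
    lo = max(mu_hat - z / sqrt(n), 0.0)
    hi = max(mu_hat + z / sqrt(n), 0.0)
    return lo, hi


def coverage(mu, n=25, n_rep=40000, seed=2):
    """Monte Carlo coverage of the percentile interval at the true mean mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_rep):
        xbar = rng.gauss(mu, 1 / sqrt(n))   # sample mean of n N(mu, 1) draws
        lo, hi = percentile_interval(xbar, n)
        hits += lo <= mu <= hi
    return hits / n_rep


print("on the boundary (mu = 0):", coverage(0.0))
print("away from it   (mu = 1):", coverage(1.0))
```

At mu = 0 the interval can miss only when its lower endpoint is positive, an event of probability alpha/2 here, so coverage is about 1 - alpha/2 rather than 1 - alpha. Far from the boundary the clipping is inactive and the usual level is recovered. The two-sample case, where the abstract reports both under- and over-coverage, is not reproduced by this sketch.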

Subjects: Statistics Theory, Computation, Methodology

Adaptive Confidence Sets for the Optimal Approximating Model

Angelika Rohde, Lutz Duembgen (2009)

In the setting of high-dimensional linear models with Gaussian noise, we investigate the possibility of confidence statements connected to model selection. Although there exist numerous procedures for adaptive point estimation, the construction of adaptive confidence regions is severely limited (cf. Li, 1989). The present paper sheds new light on this gap. We develop exact and adaptive confidence sets for the best approximating model in terms of risk. One of our constructions is based on a multiscale procedure and a particular coupling argument. Utilizing exponential inequalities for noncentral chi-squared distributions, we show that the risk and quadratic loss of all models within our confidence region are uniformly bounded by the minimal risk times a factor close to one.

Subjects: Statistics Theory, Methodology

Confidence bands for a log-concave density

Guenther Walther, Alnur Ali, Xinyue Shen (2020)

We present a new approach for inference about a log-concave distribution: instead of using the method of maximum likelihood, we propose to incorporate the log-concavity constraint in an appropriate nonparametric confidence set for the cdf $F$. This approach has the advantage that it automatically provides a measure of statistical uncertainty, and it thus overcomes a marked limitation of the maximum likelihood estimate. In particular, we show how to construct confidence bands for the density that have a finite sample guaranteed confidence level. The nonparametric confidence set for $F$ which we introduce here has attractive computational and statistical properties: it allows modern tools from optimization to be brought to bear on this problem via difference of convex programming, and it results in optimal statistical inference. We show that the width of the resulting confidence bands converges at nearly the parametric $n^{-\frac{1}{2}}$ rate when the log density is $k$-affine.

Subjects: Statistics Theory, Methodology
