
On the usage of randomized p-values in the Schweder-Spjøtvoll estimator

Posted by: Thorsten Dickhaus
Publication date: 2020
Research field: Mathematical Statistics
Language: English





We are concerned with multiple test problems with composite null hypotheses and the estimation of the proportion $\pi_0$ of true null hypotheses. The Schweder-Spjøtvoll estimator $\hat{\pi}_0$ utilizes marginal $p$-values and only works properly if the $p$-values that correspond to the true null hypotheses are uniformly distributed on $[0,1]$ ($\mathrm{Uni}[0,1]$-distributed). In the case of composite null hypotheses, marginal $p$-values are usually computed under least favorable parameter configurations (LFCs). Thus, they are stochastically larger than $\mathrm{Uni}[0,1]$ under non-LFCs in the null hypotheses. When using these LFC-based $p$-values, $\hat{\pi}_0$ tends to overestimate $\pi_0$. We introduce a new way of randomizing $p$-values that depends on a tuning parameter $c \in [0,1]$, such that $c=0$ and $c=1$ lead to $\mathrm{Uni}[0,1]$-distributed $p$-values, which are independent of the data, and to the original LFC-based $p$-values, respectively. For a certain value $c=c^{\star}$ the bias of $\hat{\pi}_0$ is minimized when using our randomized $p$-values. This often also entails a smaller mean squared error of the estimator as compared to the usage of the LFC-based $p$-values. We analyze these points theoretically, and we demonstrate them numerically in computer simulations under various standard statistical models.
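To make the mechanics concrete, the sketch below (in Python, assuming NumPy and SciPy) implements the classical Schweder-Spjøtvoll estimator $\hat{\pi}_0(\lambda) = \#\{i : p_i > \lambda\} / (m(1-\lambda))$ together with a simple randomization of LFC-based $p$-values that reproduces the boundary behaviour stated in the abstract: $c=0$ yields data-independent $\mathrm{Uni}[0,1]$ draws and $c=1$ returns the LFC-based $p$-values. The linear interpolation used here is only an illustrative stand-in and not necessarily the construction proposed in the paper; the threshold $\lambda = 0.5$, the one-sided Gaussian testing setup, and all function names are assumptions made for the demonstration.

```python
import numpy as np
from scipy.stats import norm

def schweder_spjotvoll(p_values, lam=0.5):
    """Schweder-Spjotvoll estimator of the proportion pi_0 of true nulls:
    the fraction of p-values exceeding lam, rescaled by 1 / (1 - lam)."""
    p = np.asarray(p_values)
    return np.sum(p > lam) / (p.size * (1.0 - lam))

def randomized_p_values(p_lfc, c, rng):
    """Illustrative randomization with tuning parameter c in [0, 1]:
    c = 0 gives Uni[0,1] draws independent of the data,
    c = 1 gives back the original LFC-based p-values.
    (A plain linear interpolation; the paper's construction may differ.)"""
    p_lfc = np.asarray(p_lfc)
    u = rng.uniform(size=p_lfc.shape)
    return c * p_lfc + (1.0 - c) * u

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    m, pi0 = 1000, 0.8                       # number of hypotheses, true pi_0
    m0 = int(pi0 * m)
    # One-sided tests of H0: mu <= 0 vs. H1: mu > 0. True nulls sit at
    # mu = -1 (a non-LFC configuration), alternatives at mu = 2; the LFC
    # is mu = 0, so LFC-based p-values are super-uniform under the nulls.
    mu = np.concatenate([np.full(m0, -1.0), np.full(m - m0, 2.0)])
    z = rng.normal(loc=mu, scale=1.0)
    p_lfc = norm.sf(z)                       # LFC-based p-values

    print("LFC-based p-values:", schweder_spjotvoll(p_lfc))
    for c in (0.0, 0.5, 1.0):
        p_rand = randomized_p_values(p_lfc, c, rng)
        print(f"randomized, c = {c}:", schweder_spjotvoll(p_rand))
```

Sweeping $c$ over a grid in such a toy setting mimics the bias analysis described in the abstract; the value $c^{\star}$ that minimizes the bias depends on the model and is derived theoretically in the paper.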




Read also

We are concerned with testing replicability hypotheses for many endpoints simultaneously. This constitutes a multiple test problem with composite null hypotheses. Traditional $p$-values, which are computed under least favourable parameter configurations, are over-conservative in the case of composite null hypotheses. As demonstrated in prior work, this poses severe challenges in the multiple testing context, especially when one goal of the statistical analysis is to estimate the proportion $\pi_0$ of true null hypotheses. Randomized $p$-values have been proposed to remedy this issue. In the present work, we discuss the application of randomized $p$-values in replicability analysis. In particular, we introduce a general class of statistical models for which valid, randomized $p$-values can be calculated easily. By means of computer simulations, we demonstrate that their usage typically leads to a much more accurate estimation of $\pi_0$. Finally, we apply our proposed methodology to a real data example from genomics.
Many statistical methods have been proposed for variable selection in the past century, but few balance inference and prediction tasks well. Here we report on a novel variable selection approach called Penalized regression with Second-Generation P-Values (ProSGPV). It captures the true model at the best rate achieved by current standards, is easy to implement in practice, and often yields the smallest parameter estimation error. The idea is to use an l0 penalization scheme with second-generation p-values (SGPV), instead of traditional ones, to determine which variables remain in a model. The approach yields tangible advantages for balancing support recovery, parameter estimation, and prediction tasks. The ProSGPV algorithm can maintain its good performance even when there is strong collinearity among features or when a high dimensional feature space with p > n is considered. We present extensive simulations and a real-world application comparing the ProSGPV approach with smoothly clipped absolute deviation (SCAD), adaptive lasso (AL), and mini-max concave penalty with penalized linear unbiased selection (MC+). While the last three algorithms are among the current standards for variable selection, ProSGPV has superior inference performance and comparable prediction performance in certain scenarios. Supplementary materials are available online.
Verifying that a statistically significant result is scientifically meaningful is not only good scientific practice, it is a natural way to control the Type I error rate. Here we introduce a novel extension of the p-value - a second-generation p-value - that formally accounts for scientific relevance and leverages this natural Type I Error control. The approach relies on a pre-specified interval null hypothesis that represents the collection of effect sizes that are scientifically uninteresting or are practically null. The second-generation p-value is the proportion of data-supported hypotheses that are also null hypotheses. As such, second-generation p-values indicate when the data are compatible with null hypotheses, or with alternative hypotheses, or when the data are inconclusive. Moreover, second-generation p-values provide a proper scientific adjustment for multiple comparisons and reduce false discovery rates. This is an advance for environments rich in data, where traditional p-value adjustments are needlessly punitive. Second-generation p-values promote transparency, rigor and reproducibility of scientific results by a priori specifying which candidate hypotheses are practically meaningful and by providing a more reliable statistical summary of when the data are compatible with alternative or null hypotheses.
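For readers who want the definition in computable form, here is a minimal sketch of a second-generation p-value for an interval estimate and an interval null: the fraction of the interval estimate that overlaps the null interval, with a correction capping the value at 1/2 when the interval estimate is very wide. The function name, the point-estimate branch, and the example numbers are illustrative choices, and the implementation reflects our reading of the SGPV definition rather than the authors' reference code.

```python
def second_generation_p_value(est_lo, est_hi, null_lo, null_hi):
    """Sketch of a second-generation p-value: the fraction of the interval
    estimate [est_lo, est_hi] that lies inside the interval null
    [null_lo, null_hi], with a correction applied when the interval estimate
    is more than twice as wide as the null interval (capping the value at
    1/2 for very wide, uninformative intervals)."""
    est_len = est_hi - est_lo
    null_len = null_hi - null_lo
    overlap = max(0.0, min(est_hi, null_hi) - max(est_lo, null_lo))
    if est_len <= 0:
        # Degenerate (point) estimate: 1 if it falls in the null, else 0.
        return 1.0 if null_lo <= est_lo <= null_hi else 0.0
    if est_len > 2 * null_len > 0:
        return overlap / (2 * null_len)   # correction for very wide intervals
    return overlap / est_len

# Example: interval estimate (0.1, 0.6) against an interval null (-0.2, 0.2)
print(second_generation_p_value(0.1, 0.6, -0.2, 0.2))  # -> 0.2
```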
In the multivariate setting, defining extremal risk measures is important in many contexts, such as finance, environmental planning and structural engineering. In this paper, we review the literature on extremal bivariate return curves, a risk measure that is the natural bivariate extension to a return level, and propose new estimation methods based on multivariate extreme value models that can account for both asymptotic dependence and asymptotic independence. We identify gaps in the existing literature and propose novel tools for testing and validating return curves and comparing estimates from a range of multivariate models. These tools are then used to compare a selection of models through simulation and case studies. We conclude with a discussion and list some of the challenges.
We approach the problem of combining top-ranking association statistics or P-values from a new perspective which leads to a remarkably simple and powerful method. Statistical methods, such as the Rank Truncated Product (RTP), have been developed for combining top-ranking associations and this general strategy proved to be useful in applications for detecting combined effects of multiple disease components. To increase power, these methods aggregate signals across top ranking SNPs, while adjusting for their total number assessed in a study. Analytic expressions for combined top statistics or P-values tend to be unwieldy, which complicates interpretation, practical implementation, and hinders further developments. Here, we propose the Augmented Rank Truncation (ART) method that retains main characteristics of the RTP but is substantially simpler to implement. ART leads to an efficient form of the adaptive algorithm, an approach where the number of top ranking SNPs is varied to optimize power. We illustrate our methods by strengthening previously reported associations of $\mu$-opioid receptor variants with sensitivity to pain.
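As a point of reference for the combination idea, the sketch below computes an RTP-style statistic (the product of the k smallest p-values, taken on the log scale) and calibrates it by plain Monte Carlo simulation under the global null. This generic calibration is only a stand-in for the analytic (and, as the abstract notes, unwieldy) RTP distribution and is not the ART method itself; the function names, the choice of k, and the simulation size are assumptions for the illustration.

```python
import numpy as np

def rank_truncated_product(p_values, k):
    """RTP-style statistic: log of the product of the k smallest p-values."""
    p = np.sort(np.asarray(p_values))[:k]
    return np.sum(np.log(p))

def rtp_p_value_mc(p_values, k, n_sim=20_000, rng=None):
    """Monte Carlo p-value for the RTP-style statistic under the global null
    (all p-values i.i.d. Uni[0,1]); simulating with the full number of tests
    m adjusts for how many p-values were assessed before truncation."""
    rng = np.random.default_rng() if rng is None else rng
    m = len(p_values)
    observed = rank_truncated_product(p_values, k)
    null_stats = np.array(
        [rank_truncated_product(rng.uniform(size=m), k) for _ in range(n_sim)]
    )
    return np.mean(null_stats <= observed)   # smaller products are more extreme

# Example: 100 tests, combine the 5 smallest p-values.
rng = np.random.default_rng(0)
p = rng.uniform(size=100)
p[:3] = [1e-4, 5e-4, 2e-3]                   # inject a few strong signals
print(rtp_p_value_mc(p, k=5, rng=rng))
```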