
Comparing distributions by multiple testing across quantiles or CDF values

Published by David Kaplan
Publication date: 2017
Research field: Economics
Language: English





When comparing two distributions, it is often helpful to learn at which quantiles or values there is a statistically significant difference. This provides more information than the binary reject or do not reject decision of a global goodness-of-fit test. Framing our question as multiple testing across the continuum of quantiles $\tau\in(0,1)$ or values $r\in\mathbb{R}$, we show that the Kolmogorov--Smirnov test (interpreted as a multiple testing procedure) achieves strong control of the familywise error rate. However, its well-known flaw of low sensitivity in the tails remains. We provide an alternative method that retains such strong control of familywise error rate while also having even sensitivity, i.e., equal pointwise type I error rates at each of $n\to\infty$ order statistics across the distribution. Our one-sample method computes instantly, using our new formula that also instantly computes goodness-of-fit $p$-values and uniform confidence bands. To improve power, we also propose stepdown and pre-test procedures that maintain control of the asymptotic familywise error rate. One-sample and two-sample cases are considered, as well as extensions to regression discontinuity designs and conditional distributions. Simulations, empirical examples, and code are provided.
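To make the "Kolmogorov--Smirnov test as a multiple testing procedure" idea concrete, here is a minimal one-sample sketch. It is not the paper's code or its new even-sensitivity method. It assumes a conservative finite-sample critical value from the Dvoretzky--Kiefer--Wolfowitz inequality (with Massart's constant) in place of the exact KS critical value, and the function names `dkw_critical_value` and `reject_at_values` are hypothetical. The pointwise null $H_{0r}\colon F(r)=F_0(r)$ is rejected at every value $r$ where the empirical CDF differs from $F_0(r)$ by more than a single uniform threshold.

```python
import bisect
import math


def dkw_critical_value(n, alpha=0.05):
    """Uniform threshold eps with P(sup_r |F_n(r) - F(r)| > eps) <= alpha.

    Assumption: uses the DKW inequality (Massart's constant), which is
    conservative relative to the exact Kolmogorov-Smirnov critical value.
    """
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))


def reject_at_values(sample, cdf0, grid, alpha=0.05):
    """Return the grid values r at which H0r: F(r) = cdf0(r) is rejected.

    All pointwise tests share one threshold, so the probability of any
    false rejection is bounded by alpha no matter which subset of the
    pointwise nulls is true.
    """
    xs = sorted(sample)
    n = len(xs)
    eps = dkw_critical_value(n, alpha)
    rejected = []
    for r in grid:
        # Empirical CDF at r: share of observations less than or equal to r.
        ecdf_r = bisect.bisect_right(xs, r) / n
        if abs(ecdf_r - cdf0(r)) > eps:
            rejected.append(r)
    return rejected
```

Because the threshold does not vary with $r$ while the variance of the empirical CDF shrinks toward zero in the tails, rejections there are rare. This is the low tail sensitivity that the abstract describes.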




Read also

Xueying Tang, Ke Li, Malay Ghosh (2015)
This paper considers Bayesian multiple testing under sparsity for polynomial-tailed distributions satisfying a monotone likelihood ratio property. Included in this class of distributions are the Student's t, the Pareto, and many other distributions. We prove some general asymptotic optimality results under fixed and random thresholding. As examples of these general results, we establish the Bayesian asymptotic optimality of several multiple testing procedures in the literature for appropriately chosen false discovery rate levels. We also show by simulation that the Benjamini-Hochberg procedure with a false discovery rate level different from the asymptotically optimal one can lead to high Bayes risk.
We study the approximation of arbitrary distributions $P$ on $d$-dimensional space by distributions with log-concave density. Approximation means minimizing a Kullback--Leibler-type functional. We show that such an approximation exists if and only if $P$ has finite first moments and is not supported by some hyperplane. Furthermore we show that this approximation depends continuously on $P$ with respect to Mallows distance $D_1(\cdot,\cdot)$. This result implies consistency of the maximum likelihood estimator of a log-concave density under fairly general conditions. It also allows us to prove existence and consistency of estimators in regression models with a response $Y=\mu(X)+\epsilon$, where $X$ and $\epsilon$ are independent, $\mu(\cdot)$ belongs to a certain class of regression functions while $\epsilon$ is a random error with log-concave density and mean zero.
Many popular methods for building confidence intervals on causal effects under high-dimensional confounding require strong ultra-sparsity assumptions that may be difficult to validate in practice. To alleviate this difficulty, we here study a new method for average treatment effect estimation that yields asymptotically exact confidence intervals assuming that either the conditional response surface or the conditional probability of treatment allows for an ultra-sparse representation (but not necessarily both). This guarantee allows us to provide valid inference for average treatment effect in high dimensions under considerably more generality than available baselines. In addition, we showcase that our results are semi-parametrically efficient.
We analyze the combination of multiple predictive distributions for time series data when all forecasts are misspecified. We show that a specific dynamic form of Bayesian predictive synthesis -- a general and coherent Bayesian framework for ensemble methods -- produces exact minimax predictive densities with regard to Kullback-Leibler loss, providing theoretical support for finite sample predictive performance over existing ensemble methods. A simulation study that highlights this theoretical result is presented, showing that dynamic Bayesian predictive synthesis is superior to other ensemble methods using multiple metrics.
We develop new higher-order asymptotic techniques for the Gaussian maximum likelihood estimator in a spatial panel data model, with fixed effects, time-varying covariates, and spatially correlated errors. Our saddlepoint density and tail area approximation feature relative error of order $O(1/(n(T-1)))$ with $n$ being the cross-sectional dimension and $T$ the time-series dimension. The main theoretical tool is the tilted-Edgeworth technique in a non-identically distributed setting. The density approximation is always non-negative, does not need resampling, and is accurate in the tails. Monte Carlo experiments on density approximation and testing in the presence of nuisance parameters illustrate the good performance of our approximation over first-order asymptotics and Edgeworth expansions. An empirical application to the investment-saving relationship in OECD (Organisation for Economic Co-operation and Development) countries shows disagreement between testing results based on first-order asymptotics and saddlepoint techniques.