No Arabic abstract
It is possible to obtain a large Bayes Factor (BF) favoring the null hypothesis when both the null and alternative hypotheses have low likelihoods, and there are other hypotheses being ignored that are much more strongly supported by the data. As sample sizes become large it becomes increasingly probable that a strong BF favouring a point null against a conventional Bayesian vague alternative co-occurs with a BF favouring various specific alternatives against the null. For any BF threshold q and sample mean, there is a value n such that sample sizes larger than n guarantee that although the BF comparing H0 against a conventional (vague) alternative exceeds q, nevertheless for some range of hypothetical {mu}, a BF comparing H0 against {mu} in that range falls below 1/q. This paper discusses the conditions under which this conundrum occurs and investigates methods for resolving it.
There has been strong recent interest in testing interval null hypothesis for improved scientific inference. For example, Lakens et al (2018) and Lakens and Harms (2017) use this approach to study if there is a pre-specified meaningful treatment effect in gerontology and clinical trials, which is different from the more traditional point null hypothesis that tests for any treatment effect. Two popular Bayesian approaches are available for interval null hypothesis testing. One is the standard Bayes factor and the other is the Region of Practical Equivalence (ROPE) procedure championed by Kruschke and others over many years. This paper establishes a formal connection between these two approaches with two benefits. First, it helps to better understand and improve the ROPE procedure. Second, it leads to a simple and effective algorithm for computing Bayes factor in a wide range of problems using draws from posterior distributions generated by standard Bayesian programs such as BUGS, JAGS and Stan. The tedious and error-prone task of coding custom-made software specific for Bayes factor is then avoided.
Timing channels are a significant and growing security threat in computer systems, with no established solution. We have recently argued that the OS must provide time protection, in analogy to the established memory protection, to protect applications from information leakage through timing channels. Based on a recently-proposed implementation of time protection in the seL4 microkernel, we investigate how such an implementation could be formally proved to prevent timing channels. We postulate that this should be possible by reasoning about a highly abstracted representation of the shared hardware resources that cause timing channels.
Randomization (a.k.a. permutation) inference is typically interpreted as testing Fishers sharp null hypothesis that all effects are exactly zero. This hypothesis is often criticized as uninteresting and implausible. We show, however, that many randomization tests are also valid for a bounded null hypothesis under which effects are all negative (or positive) for all units but otherwise heterogeneous. The bounded null is closely related to important concepts such as monotonicity and Pareto efficiency. Inverting tests of this hypothesis yields confidence intervals for the maximum (or minimum) individual treatment effect. We then extend randomization tests to infer other quantiles of individual effects, which can be used to infer the proportion of units with effects larger (or smaller) than any threshold. The proposed confidence intervals for all quantiles of individual effects are simultaneously valid, in the sense that no correction due to multiple analyses is needed. In sum, we provide a broader justification for Fisher randomization tests, and develop exact nonparametric inference for quantiles of heterogeneous individual effects. We illustrate our methods with simulations and applications, where we find that Stephenson rank statistics often provide the most informative results.
Global null testing is a classical problem going back about a century to Fishers and Stouffers combination tests. In this work, we present simple martingale analogs of these classical tests, which are applicable in two distinct settings: (a) the online setting in which there is a possibly infinite sequence of $p$-values, and (b) the batch setting, where one uses prior knowledge to preorder the hypotheses. Through theory and simulations, we demonstrate that our martingale variants have higher power than their classical counterparts even when the preordering is only weakly informative. Finally, using a recent idea of masking $p$-values, we develop a novel interactive test for the global null that can take advantage of covariates and repeated user guidance to create a data-adaptive ordering that achieves higher detection power against structured alternatives.
A simple Bayesian approach to nonparametric regression is described using fuzzy sets and membership functions. Membership functions are interpreted as likelihood functions for the unknown regression function, so that with the help of a reference prior they can be transformed to prior density functions. The unknown regression function is decomposed into wavelets and a hierarchical Bayesian approach is employed for making inferences on the resulting wavelet coefficients.