Connecting Bayes factor and the Region of Practical Equivalence (ROPE) Procedure for testing interval null hypothesis

58 0 0.0 ( 0 )

Download Cite

Added by Arthur Berg

Publication date 2019

fields Mathematical Statistics

and research's language is English

Authors J. G. Liao - Vishal Midya -

Methodology

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

There has been strong recent interest in testing interval null hypothesis for improved scientific inference. For example, Lakens et al (2018) and Lakens and Harms (2017) use this approach to study if there is a pre-specified meaningful treatment effect in gerontology and clinical trials, which is different from the more traditional point null hypothesis that tests for any treatment effect. Two popular Bayesian approaches are available for interval null hypothesis testing. One is the standard Bayes factor and the other is the Region of Practical Equivalence (ROPE) procedure championed by Kruschke and others over many years. This paper establishes a formal connection between these two approaches with two benefits. First, it helps to better understand and improve the ROPE procedure. Second, it leads to a simple and effective algorithm for computing Bayes factor in a wide range of problems using draws from posterior distributions generated by standard Bayesian programs such as BUGS, JAGS and Stan. The tedious and error-prone task of coding custom-made software specific for Bayes factor is then avoided.

rate research

Can Bayes Factors Prove the Null Hypothesis?

69 - Michael Smithson 2019

It is possible to obtain a large Bayes Factor (BF) favoring the null hypothesis when both the null and alternative hypotheses have low likelihoods, and there are other hypotheses being ignored that are much more strongly supported by the data. As sample sizes become large it becomes increasingly probable that a strong BF favouring a point null against a conventional Bayesian vague alternative co-occurs with a BF favouring various specific alternatives against the null. For any BF threshold q and sample mean, there is a value n such that sample sizes larger than n guarantee that although the BF comparing H0 against a conventional (vague) alternative exceeds q, nevertheless for some range of hypothetical {mu}, a BF comparing H0 against {mu} in that range falls below 1/q. This paper discusses the conditions under which this conundrum occurs and investigates methods for resolving it.

Methodology

The Exact Equivalence of Distance and Kernel Methods for Hypothesis Testing

229 - Cencheng Shen , Joshua T. Vogelstein 2018

Distance-based tests, also called energy statistics, are leading methods for two-sample and independence tests from the statistics community. Kernel-based tests, developed from kernel mean embeddings, are leading methods for two-sample and independence tests from the machine learning community. A fixed-point transformation was previously proposed to connect the distance methods and kernel methods for the population statistics. In this paper, we propose a new bijective transformation between metrics and kernels. It simplifies the fixed-point transformation, inherits similar theoretical properties, allows distance methods to be exactly the same as kernel methods for sample statistics and p-value, and better preserves the data structure upon transformation. Our results further advance the understanding in distance and kernel-based tests, streamline the code base for implementing these tests, and enable a rich literature of distance-based and kernel-based methodologies to directly communicate with each other.

Machine Learning Machine Learning

A statistical Testing Procedure for Validating Class Labels

62 - Melissa C. Key , Ben Boukai 2020

Motivated by an open problem of validating protein identities in label-free shotgun proteomics work-flows, we present a testing procedure to validate class/protein labels using available measurements across instances/peptides. More generally, we present a solution to the problem of identifying instances that are deemed, based on some distance (or quasi-distance) measure, as outliers relative to the subset of instances assigned to the same class. The proposed procedure is non-parametric and requires no specific distributional assumption on the measured distances. The only assumption underlying the testing procedure is that measured distances between instances within the same class are stochastically smaller than measured distances between instances from different classes. The test is shown to simultaneously control the Type I and Type II error probabilities whilst also controlling the overall error probability of the repeated testing invoked in the validation procedure of initial class labeling. The theoretical results are supplemented with results from an extensive numerical study, simulating a typical setup for labeling validation in proteomics work-flow applications. These results illustrate the applicability and viability of our method. Even with up to 25% of instances mislabeled, our testing procedure maintains a high specificity and greatly reduces the proportion of mislabeled instances.

Methodology

Regression modelling of interval censored data based on the adaptive ridge procedure

95 - Olivier Bouaziz (MAP5 - UMR 8145 , UPD5 , IUT - Paris Descartes 2018

A new method for the analysis of time to ankylosis complication on a dataset of replanted teeth is proposed. In this context of left-censored, interval-censored and right-censored data, a Cox model with piecewise constant baseline hazard is introduced. Estimation is carried out with the EM algorithm by treating the true event times as unobserved variables. This estimation procedure is shown to produce a block diagonal Hessian matrix of the baseline parameters. Taking advantage of this interesting feature of the estimation method a L0 penalised likelihood method is implemented in order to automatically determine the number and locations of the cuts of the baseline hazard. This procedure allows to detect specific areas of time where patients are at greater risks for ankylosis. The method can be directly extended to the inclusion of exact observations and to a cure fraction. Theoretical results are obtained which allow to derive statistical inference of the model parameters from asymptotic likelihood theory. Through simulation studies, the penalisation technique is shown to provide a good fit of the baseline hazard and precise estimations of the resulting regression parameters.

Methodology

Hypothesis Testing for Hierarchical Structures in Cognitive Diagnosis Models

90 - Chenchen Ma , Gongjun Xu 2021

Cognitive Diagnosis Models (CDMs) are a special family of discrete latent variable models widely used in educational, psychological and social sciences. In many applications of CDMs, certain hierarchical structures among the latent attributes are assumed by researchers to characterize their dependence structure. Specifically, a directed acyclic graph is used to specify hierarchical constraints on the allowable configurations of the discrete latent attributes. In this paper, we consider the important yet unaddressed problem of testing the existence of latent hierarchical structures in CDMs. We first introduce the concept of testability of hierarchical structures in CDMs and present sufficient conditions. Then we study the asymptotic behaviors of the likelihood ratio test (LRT) statistic, which is widely used for testing nested models. Due to the irregularity of the problem, the asymptotic distribution of LRT becomes nonstandard and tends to provide unsatisfactory finite sample performance under practical conditions. We provide statistical insights on such failures, and propose to use parametric bootstrap to perform the testing. We also demonstrate the effectiveness and superiority of parametric bootstrap for testing the latent hierarchies over non-parametric bootstrap and the naive Chi-squared test through comprehensive simulations and an educational assessment dataset.

Methodology