
Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting

Published by Niccolò Dalmasso
Publication date: 2020
Language: English





Parameter estimation, statistical tests and confidence sets are the cornerstones of classical statistics that allow scientists to make inferences about the underlying process that generated the observed data. A key question is whether one can still construct hypothesis tests and confidence sets with proper coverage and high power in a so-called likelihood-free inference (LFI) setting; that is, a setting where the likelihood is not explicitly known but one can forward-simulate observable data according to a stochastic model. In this paper, we present $\texttt{ACORE}$ (Approximate Computation via Odds Ratio Estimation), a frequentist approach to LFI that first formulates the classical likelihood ratio test (LRT) as a parametrized classification problem, and then uses the equivalence of tests and confidence sets to build confidence regions for parameters of interest. We also present a goodness-of-fit procedure for checking whether the constructed tests and confidence regions are valid. $\texttt{ACORE}$ is based on the key observation that the LRT statistic, the rejection probability of the test, and the coverage of the confidence set are conditional distribution functions which often vary smoothly as a function of the parameters of interest. Hence, instead of relying solely on samples simulated at fixed parameter settings (as is the convention in standard Monte Carlo solutions), one can leverage machine learning tools and data simulated in the neighborhood of a parameter to improve estimates of quantities of interest. We demonstrate the efficacy of $\texttt{ACORE}$ with both theoretical and empirical results. Our implementation is available on GitHub.
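The idea of recasting a likelihood ratio as a parametrized classification problem can be illustrated with a toy sketch. The code below is not the authors' implementation: the Gaussian simulator, the N(0, 1) reference distribution, the uniform proposal over the parameter, and the names `simulate`, `fit_log_odds`, `log_odds_ratio` and `exact_log_lr` are all assumptions made for illustration. A logistic regression fitted by Newton's method stands in for a generic probabilistic classifier. With balanced classes, the classifier's log-odds approximate log f(x | theta) - log g(x), so the difference of log-odds at two parameter values approximates the log likelihood ratio.

```python
import math
import random


def simulate(n, rng):
    """Draw (theta, x, y) triples from a hypothetical toy simulator.

    y = 1: x is simulated from the model N(theta, 1).
    y = 0: x is drawn from a fixed reference distribution N(0, 1).
    """
    rows = []
    for _ in range(n):
        theta = rng.uniform(-2.0, 2.0)       # assumed proposal over the parameter
        y = 1 if rng.random() < 0.5 else 0   # balanced classes
        x = rng.gauss(theta if y == 1 else 0.0, 1.0)
        rows.append((theta, x, y))
    return rows


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_log_odds(rows, n_iter=25):
    """Fit logit P(Y=1 | theta, x) = a*theta*x + b*theta**2 by Newton's method.

    For this toy model the true log-odds are theta*x - theta**2/2,
    so the fit should approach a = 1 and b = -0.5.
    """
    a = b = 0.0
    for _ in range(n_iter):
        g1 = g2 = h11 = h12 = h22 = 0.0
        for theta, x, y in rows:
            u, v = theta * x, theta * theta
            p = sigmoid(a * u + b * v)
            w = p * (1.0 - p)
            g1 += (y - p) * u                # gradient of the log-likelihood
            g2 += (y - p) * v
            h11 += w * u * u                 # negative Hessian entries
            h12 += w * u * v
            h22 += w * v * v
        det = h11 * h22 - h12 * h12
        a += (h22 * g1 - h12 * g2) / det     # Newton step via 2x2 solve
        b += (h11 * g2 - h12 * g1) / det
    return a, b


def log_odds_ratio(x, theta0, theta1, a, b):
    """Estimated log odds ratio: log-odds at theta0 minus log-odds at theta1."""
    return (a * theta0 * x + b * theta0 ** 2) - (a * theta1 * x + b * theta1 ** 2)


def exact_log_lr(x, theta0, theta1):
    """Exact log likelihood ratio for N(theta, 1), used only as a check."""
    return (theta0 - theta1) * x - (theta0 ** 2 - theta1 ** 2) / 2.0


rng = random.Random(0)
rows = simulate(20000, rng)
a_hat, b_hat = fit_log_odds(rows)
print("fitted coefficients:", a_hat, b_hat)
print("estimated vs exact log ratio at x=1.0:",
      log_odds_ratio(1.0, 0.5, 0.0, a_hat, b_hat),
      exact_log_lr(1.0, 0.5, 0.0))
```

Because the classifier is trained on pairs simulated across the whole parameter range rather than at a single fixed setting, one fit yields odds estimates for any pair of parameter values. In this toy case the exact likelihood ratio is available for comparison; in a genuine likelihood-free setting it would not be.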


Read also

Bayesian likelihood-free methods implement Bayesian inference using simulation of data from the model to substitute for intractable likelihood evaluations. Most likelihood-free inference methods replace the full data set with a summary statistic before performing Bayesian inference, and the choice of this statistic is often difficult. The summary statistic should be low-dimensional for computational reasons, while retaining as much information as possible about the parameter. Using a recent idea from the interpretable machine learning literature, we develop some regression-based diagnostic methods which are useful for detecting when different parts of a summary statistic vector contain conflicting information about the model parameters. Conflicts of this kind complicate summary statistic choice, and detecting them can be insightful about model deficiencies and guide model improvement. The diagnostic methods developed are based on regression approaches to likelihood-free inference, in which the regression model estimates the posterior density using summary statistics as features. Deletion and imputation of part of the summary statistic vector within the regression model can remove conflicts and approximate posterior distributions for summary statistic subsets. A larger than expected change in the estimated posterior density following deletion and imputation can indicate a conflict in which inferences of interest are affected. The usefulness of the new methods is demonstrated in a number of real examples.
Bayesian inference without access to the likelihood, or likelihood-free inference, has been a key research topic in simulation, where it is used to yield more realistic generated results. Recent likelihood-free inference methods update an approximate posterior sequentially, using the cumulative dataset of simulation input-output pairs gathered over inference rounds. The dataset is therefore gathered through iterative simulations whose inputs are sampled from a proposal distribution by MCMC, which makes the proposal key to inference quality in this sequential framework. This paper introduces a new proposal model, named Implicit Surrogate Proposal (ISP), to generate a cumulative dataset with better sample efficiency. ISP constructs the cumulative dataset in the most diverse way by drawing i.i.d. samples in a feed-forward fashion, so the posterior inference does not suffer from the disadvantages of MCMC caused by its non-i.i.d. nature, such as auto-correlation and slow mixing. We analyze the convergence property of ISP in both theoretical and empirical aspects to guarantee that ISP provides an asymptotically exact sampler. We demonstrate that ISP outperforms the baseline inference algorithms on simulations with multi-modal posteriors.
Umberto Picchini, 2016
A maximum likelihood methodology for the parameters of models with an intractable likelihood is introduced. We produce a likelihood-free version of the stochastic approximation expectation-maximization (SAEM) algorithm to maximize the likelihood function of model parameters. While SAEM is best suited for models having a tractable complete likelihood function, its application to moderately complex models is a difficult or even impossible task. We show how to construct a likelihood-free version of SAEM by using the synthetic likelihood paradigm. Our method is completely plug-and-play, requires almost no tuning and can be applied to both static and dynamic models. Four simulation studies illustrate the method, including a stochastic differential equation model, a stochastic Lotka-Volterra model and data from $g$-and-$k$ distributions. MATLAB code is available as supplementary material.
This document is an invited chapter covering the specificities of ABC model choice, intended for the forthcoming Handbook of ABC by Sisson, Fan, and Beaumont (2017). Beyond exposing the potential pitfalls of ABC-based posterior probabilities, the review mostly emphasizes the solution proposed by Pudlo et al. (2016) on the use of random forests for aggregating summary statistics and for estimating the posterior probability of the most likely model via a secondary random forest.
Confidence intervals based on penalized maximum likelihood estimators such as the LASSO, adaptive LASSO, and hard-thresholding are analyzed. In the known-variance case, the finite-sample coverage properties of such intervals are determined and it is shown that symmetric intervals are the shortest. The length of the shortest intervals based on the hard-thresholding estimator is larger than the length of the shortest interval based on the adaptive LASSO, which is larger than the length of the shortest interval based on the LASSO, which in turn is larger than the standard interval based on the maximum likelihood estimator. In the case where the penalized estimators are tuned to possess the 'sparsity property', the intervals based on these estimators are larger than the standard interval by an order of magnitude. Furthermore, a simple asymptotic confidence interval construction in the 'sparse' case, that also applies to the smoothly clipped absolute deviation estimator, is discussed. The results for the known-variance case are shown to carry over to the unknown-variance case in an appropriate asymptotic sense.
