High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation

114 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Rafael Izbicki Rafael Izbicki

تاريخ النشر 2014

مجال البحث الاحصاء الرياضي

والبحث باللغة English

تأليف Rafael Izbicki - Ann B. Lee - Chad M. Schafer

المنهجية

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

The ratio between two probability density functions is an important component of various tasks, including selection bias correction, novelty detection and classification. Recently, several estimators of this ratio have been proposed. Most of these methods fail if the sample space is high-dimensional, and hence require a dimension reduction step, the result of which can be a significant loss of information. Here we propose a simple-to-implement, fully nonparametric density ratio estimator that expands the ratio in terms of the eigenfunctions of a kernel-based operator; these functions reflect the underlying geometry of the data (e.g., submanifold structure), often leading to better estimates without an explicit dimension reduction step. We show how our general framework can be extended to address another important problem, the estimation of a likelihood function in situations where that function cannot be well-approximated by an analytical form. One is often faced with this situation when performing statistical inference with data from the sciences, due the complexity of the data and of the processes that generated those data. We emphasize applications where using existing likelihood-free methods of inference would be challenging due to the high dimensionality of the sample space, but where our spectral series method yields a reasonable estimate of the likelihood function. We provide theoretical guarantees and illustrate the effectiveness of our proposed method with numerical experiments.

قيم البحث

207 - Umberto Picchini , Rachele Anderson 2015

A maximum likelihood methodology for a general class of models is presented, using an approximate Bayesian computation (ABC) approach. The typical target of ABC methods are models with intractable likelihoods, and we combine an ABC-MCMC sampler with so-called data cloning for maximum likelihood estimation. Accuracy of ABC methods relies on the use of a small threshold value for comparing simulations from the model and observed data. The proposed methodology shows how to use large threshold values, while the number of data-clones is increased to ease convergence towards an approximate maximum likelihood estimate. We show how to exploit the methodology to reduce the number of iterations of a standard ABC-MCMC algorithm and therefore reduce the computational effort, while obtaining reasonable point estimates. Simulation studies show the good performance of our approach on models with intractable likelihoods such as g-and-k distributions, stochastic differential equations and state-space models.

المنهجية حساب

ABC-CDE: Towards Approximate Bayesian Computation with Complex High-Dimensional Data and Limited Simulations

63 - Rafael Izbicki , Ann B. Lee , Taylor Pospisil 2018

Approximate Bayesian Computation (ABC) is typically used when the likelihood is either unavailable or intractable but where data can be simulated under different parameter settings using a forward model. Despite the recent interest in ABC, high-dimen sional data and costly simulations still remain a bottleneck in some applications. There is also no consensus as to how to best assess the performance of such methods without knowing the true posterior. We show how a nonparametric conditional density estimation (CDE) framework, which we refer to as ABC-CDE, help address three nontrivial challenges in ABC: (i) how to efficiently estimate the posterior distribution with limited simulations and different types of data, (ii) how to tune and compare the performance of ABC and related methods in estimating the posterior itself, rather than just certain properties of the density, and (iii) how to efficiently choose among a large set of summary statistics based on a CDE surrogate loss. We provide theoretical and empirical evidence that justify ABC-CDE procedures that {em directly} estimate and assess the posterior based on an initial ABC sample, and we describe settings where standard ABC and regression-based approaches are inadequate.

المنهجية التعلم الالي

Diagonal Likelihood Ratio Test for Equality of Mean Vectors in High-Dimensional Data

94 - Zongliang Hu , Tiejun Tong , Marc G. Genton 2017

We propose a likelihood ratio test framework for testing normal mean vectors in high-dimensional data under two common scenarios: the one-sample test and the two-sample test with equal covariance matrices. We derive the test statistics under the assu mption that the covariance matrices follow a diagonal matrix structure. In comparison with the diagonal Hotellings tests, our proposed test statistics display some interesting characteristics. In particular, they are a summation of the log-transformed squared $t$-statistics rather than a direct summation of those components. More importantly, to derive the asymptotic normality of our test statistics under the null and local alternative hypotheses, we do not need the requirement that the covariance matrices follow a diagonal matrix structure. As a consequence, our proposed test methods are very flexible and readily applicable in practice. Simulation studies and a real data analysis are also carried out to demonstrate the advantages of our likelihood ratio test methods.

المنهجية

Nonparametric empirical Bayes and maximum likelihood estimation for high-dimensional data analysis

134 - Lee H. Dicker , Sihai D. Zhao 2014

Nonparametric empirical Bayes methods provide a flexible and attractive approach to high-dimensional data analysis. One particularly elegant empirical Bayes methodology, involving the Kiefer-Wolfowitz nonparametric maximum likelihood estimator (NPMLE ) for mixture models, has been known for decades. However, implementation and theoretical analysis of the Kiefer-Wolfowitz NPMLE are notoriously difficult. A fast algorithm was recently proposed that makes NPMLE-based procedures feasible for use in large-scale problems, but the algorithm calculates only an approximation to the NPMLE. In this paper we make two contributions. First, we provide upper bounds on the convergence rate of the approximate NPMLEs statistical error, which have the same order as the best known bounds for the true NPMLE. This suggests that the approximate NPMLE is just as effective as the true NPMLE for statistical applications. Second, we illustrate the promise of NPMLE procedures in a high-dimensional binary classification problem. We propose a new procedure and show that it vastly outperforms existing methods in experiments with simulated data. In real data analyses involving cancer survival and gene expression data, we show that it is very competitive with several recently proposed methods for regularized linear discriminant analysis, another popular approach to high-dimensional classification.

المنهجية

High-dimensional empirical likelihood inference

222 - Jinyuan Chang , Song Xi Chen , Cheng Yong Tang 2018

High-dimensional statistical inference with general estimating equations are challenging and remain less explored. In this paper, we study two problems in the area: confidence set estimation for multiple components of the model parameters, and model specifications test. For the first one, we propose to construct a new set of estimating equations such that the impact from estimating the high-dimensional nuisance parameters becomes asymptotically negligible. The new construction enables us to estimate a valid confidence region by empirical likelihood ratio. For the second one, we propose a test statistic as the maximum of the marginal empirical likelihood ratios to quantify data evidence against the model specification. Our theory establishes the validity of the proposed empirical likelihood approaches, accommodating over-identification and exponentially growing data dimensionality. The numerical studies demonstrate promising performance and potential practical benefits of the new methods.

المنهجية

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة طرطوس

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

High-Dimensional Density Ratio Estimation with Extensions to Approximate Likelihood Computation

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً