ترغب بنشر مسار تعليمي؟ اضغط هنا

A Conditional Density Estimation Partition Model Using Logistic Gaussian Processes

298   0   0.0 ( 0 )
 نشر من قبل Bani Mallick
 تاريخ النشر 2017
  مجال البحث الاحصاء الرياضي
والبحث باللغة English




اسأل ChatGPT حول البحث

Conditional density estimation (density regression) estimates the distribution of a response variable y conditional on covariates x. Utilizing a partition model framework, a conditional density estimation method is proposed using logistic Gaussian processes. The partition is created using a Voronoi tessellation and is learned from the data using a reversible jump Markov chain Monte Carlo algorithm. The Markov chain Monte Carlo algorithm is made possible through a Laplace approximation on the latent variables of the logistic Gaussian process model. This approximation marginalizes the parameters in each partition element, allowing an efficient search of the posterior distribution of the tessellation. The method has desirable consistency properties. In simulation and applications, the model successfully estimates the partition structure and conditional distribution of y.



قيم البحث

اقرأ أيضاً

Compared to the conditional mean as a simple point estimator, the conditional density function is more informative to describe the distributions with multi-modality, asymmetry or heteroskedasticity. In this paper, we propose a novel parametric condit ional density estimation method by showing the connection between the general density and the likelihood function of inhomogeneous Poisson process models. The maximum likelihood estimates can be obtained via weighted logistic regressions, and the computation can be significantly relaxed by combining a block-wise alternating maximization scheme and local case-control sampling. We also provide simulation studies for illustration.
Conditional density estimation generalizes regression by modeling a full density f(yjx) rather than only the expected value E(yjx). This is important for many tasks, including handling multi-modality and generating prediction intervals. Though fundam ental and widely applicable, nonparametric conditional density estimators have received relatively little attention from statisticians and little or none from the machine learning community. None of that work has been applied to greater than bivariate data, presumably due to the computational difficulty of data-driven bandwidth selection. We describe the double kernel conditional density estimator and derive fast dual-tree-based algorithms for bandwidth selection using a maximum likelihood criterion. These techniques give speedups of up to 3.8 million in our experiments, and enable the first applications to previously intractable large multivariate datasets, including a redshift prediction problem from the Sloan Digital Sky Survey.
130 - Zijun Gao , Trevor Hastie 2021
Conditional density estimation is a fundamental problem in statistics, with scientific and practical applications in biology, economics, finance and environmental studies, to name a few. In this paper, we propose a conditional density estimator based on gradient boosting and Lindseys method (LinCDE). LinCDE admits flexible modeling of the density family and can capture distributional characteristics like modality and shape. In particular, when suitably parametrized, LinCDE will produce smooth and non-negative density estimates. Furthermore, like boosted regression trees, LinCDE does automatic feature selection. We demonstrate LinCDEs efficacy through extensive simulations and several real data examples.
Model uncertainty quantification is an essential component of effective data assimilation. Model errors associated with sub-grid scale processes are often represented through stochastic parameterizations of the unresolved process. Many existing Stoch astic Parameterization schemes are only applicable when knowledge of the true sub-grid scale process or full observations of the coarse scale process are available, which is typically not the case in real applications. We present a methodology for estimating the statistics of sub-grid scale processes for the more realistic case that only partial observations of the coarse scale process are available. Model error realizations are estimated over a training period by minimizing their conditional sum of squared deviations given some informative covariates (e.g. state of the system), constrained by available observations and assuming that the observation errors are smaller than the model errors. From these realizations a conditional probability distribution of additive model errors given these covariates is obtained, allowing for complex non-Gaussian error structures. Random draws from this density are then used in actual ensemble data assimilation experiments. We demonstrate the efficacy of the approach through numerical experiments with the multi-scale Lorenz 96 system using both small and large time scale separations between slow (coarse scale) and fast (fine scale) variables. The resulting error estimates and forecasts obtained with this new method are superior to those from two existing methods.
Research on Poisson regression analysis for dependent data has been developed rapidly in the last decade. One of difficult problems in a multivariate case is how to construct a cross-correlation structure and at the meantime make sure that the covari ance matrix is positive definite. To address the issue, we propose to use convolved Gaussian process (CGP) in this paper. The approach provides a semi-parametric model and offers a natural framework for modeling common mean structure and covariance structure simultaneously. The CGP enables the model to define different covariance structure for each component of the response variables. This flexibility ensures the model to cope with data coming from different resources or having different data structures, and thus to provide accurate estimation and prediction. In addition, the model is able to accommodate large-dimensional covariates. The definition of the model, the inference and the implementation, as well as its asymptotic properties, are discussed. Comprehensive numerical examples with both simulation studies and real data are presented.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا