Distributionally Robust Local Non-parametric Conditional Estimation

367 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Viet Anh Nguyen

تاريخ النشر 2020

مجال البحث الاحصاء الرياضي الهندسة المعلوماتية

والبحث باللغة English

تأليف Viet Anh Nguyen - Fan Zhang - Jose Blanchet

التعلم الالي التعلم الآلي نظرية الإحصاء

قم بزيارة صفحتنا على فيسبوك

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Conditional estimation given specific covariate values (i.e., local conditional estimation or functional estimation) is ubiquitously useful with applications in engineering, social and natural sciences. Existing data-driven non-parametric estimators mostly focus on structured homogeneous data (e.g., weakly independent and stationary data), thus they are sensitive to adversarial noise and may perform poorly under a low sample size. To alleviate these issues, we propose a new distributionally robust estimator that generates non-parametric local estimates by minimizing the worst-case conditional expected loss over all adversarial distributions in a Wasserstein ambiguity set. We show that despite being generally intractable, the local estimator can be efficiently found via convex optimization under broadly applicable settings, and it is robust to the corruption and heterogeneity of the data. Experiments with synthetic and MNIST datasets show the competitive performance of this new class of estimators.

قيم البحث

84 - Weibin Mo , Zhengling Qi , Yufeng Liu 2020

Recent development in the data-driven decision science has seen great advances in individualized decision making. Given data with individual covariates, treatment assignments and outcomes, policy makers best individualized treatment rule (ITR) that m aximizes the expected outcome, known as the value function. Many existing methods assume that the training and testing distributions are the same. However, the estimated optimal ITR may have poor generalizability when the training and testing distributions are not identical. In this paper, we consider the problem of finding an optimal ITR from a restricted ITR class where there is some unknown covariate changes between the training and testing distributions. We propose a novel distributionally robust ITR (DR-ITR) framework that maximizes the worst-case value function across the values under a set of underlying distributions that are close to the training distribution. The resulting DR-ITR can guarantee the performance among all such distributions reasonably well. We further propose a calibrating procedure that tunes the DR-ITR adaptively to a small amount of calibration data from a target population. In this way, the calibrated DR-ITR can be shown to enjoy better generalizability than the standard ITR based on our numerical studies.

التعلم الالي التعلم الآلي

Wasserstein Distributionally Robust Optimization: Theory and Applications in Machine Learning

129 - Daniel Kuhn , Peyman Mohajerin Esfahani , Viet Anh Nguyen 2019

Many decision problems in science, engineering and economics are affected by uncertain parameters whose distribution is only indirectly observable through samples. The goal of data-driven decision-making is to learn a decision from finitely many trai ning samples that will perform well on unseen test samples. This learning task is difficult even if all training and test samples are drawn from the same distribution---especially if the dimension of the uncertainty is large relative to the training sample size. Wasserstein distributionally robust optimization seeks data-driven decisions that perform well under the most adverse distribution within a certain Wasserstein distance from a nominal distribution constructed from the training samples. In this tutorial we will argue that this approach has many conceptual and computational benefits. Most prominently, the optimal decisions can often be computed by solving tractable convex optimization problems, and they enjoy rigorous out-of-sample and asymptotic consistency guarantees. We will also show that Wasserstein distributionally robust optimization has interesting ramifications for statistical learning and motivates new approaches for fundamental learning tasks such as classification, regression, maximum likelihood estimation or minimum mean square error estimation, among others.

التعلم الالي التعلم الآلي التحسين والتحكم

Distributionally Robust Martingale Optimal Transport

147 - Zhengqing Zhou , Jose Blanchet , Peter W. Glynn 2021

We study the problem of bounding path-dependent expectations (within any finite time horizon $d$) over the class of discrete-time martingales whose marginal distributions lie within a prescribed tolerance of a given collection of benchmark marginal d istributions. This problem is a relaxation of the martingale optimal transport (MOT) problem and is motivated by applications to super-hedging in financial markets. We show that the empirical version of our relaxed MOT problem can be approximated within $Oleft( n^{-1/2}right)$ error where $n$ is the number of samples of each of the individual marginal distributions (generated independently) and using a suitably constructed finite-dimensional linear programming problem.

الاحتمالات التحسين والتحكم نظرية الإحصاء

Noise Regularization for Conditional Density Estimation

187 - Jonas Rothfuss , Fabio Ferreira , Simon Boehm 2019

Modelling statistical relationships beyond the conditional mean is crucial in many settings. Conditional density estimation (CDE) aims to learn the full conditional probability density from data. Though highly expressive, neural network based CDE mod els can suffer from severe over-fitting when trained with the maximum likelihood objective. Due to the inherent structure of such models, classical regularization approaches in the parameter space are rendered ineffective. To address this issue, we develop a model-agnostic noise regularization method for CDE that adds random perturbations to the data during training. We demonstrate that the proposed approach corresponds to a smoothness regularization and prove its asymptotic consistency. In our experiments, noise regularization significantly and consistently outperforms other regularization methods across seven data sets and three CDE models. The effectiveness of noise regularization makes neural network based CDE the preferable method over previous non- and semi-parametric approaches, even when training data is scarce.

التعلم الالي التعلم الآلي

RFCDE: Random Forests for Conditional Density Estimation

92 - Taylor Pospisil , Ann B. Lee 2018

Random forests is a common non-parametric regression technique which performs well for mixed-type data and irrelevant covariates, while being robust to monotonic variable transformations. Existing random forest implementations target regression or cl assification. We introduce the RFCDE package for fitting random forest models optimized for nonparametric conditional density estimation, including joint densities for multiple responses. This enables analysis of conditional probability distributions which is useful for propagating uncertainty and of joint distributions that describe relationships between multiple responses and covariates. RFCDE is released under the MIT open-source license and can be accessed at https://github.com/tpospisi/rfcde . Both R and Pyth

التعلم الالي التعلم الآلي