Approximation Algorithms for Distributionally Robust Stochastic Optimization with Black-Box Distributions

88 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Chaitanya Swamy

تاريخ النشر 2019

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Andre Linhares - Chaitanya Swamy

بنى وهياكل البيانات والخوارزميات

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Two-stage stochastic optimization is a framework for modeling uncertainty, where we have a probability distribution over possible realizations of the data, called scenarios, and decisions are taken in two stages: we make first-stage decisions knowing only the underlying distribution and before a scenario is realized, and may take additional second-stage recourse actions after a scenario is realized. The goal is typically to minimize the total expected cost. A criticism of this model is that the underlying probability distribution is itself often imprecise! To address this, a versatile approach that has been proposed is the {em distributionally robust 2-stage model}: given a collection of probability distributions, our goal now is to minimize the maximum expected total cost with respect to a distribution in this collection. We provide a framework for designing approximation algorithms in such settings when the collection is a ball around a central distribution and the central distribution is accessed {em only via a sampling black box}. We first show that one can utilize the {em sample average approximation} (SAA) method to reduce the problem to the case where the central distribution has {em polynomial-size} support. We then show how to approximately solve a fractional relaxation of the SAA (i.e., polynomial-scenario central-distribution) problem. By complementing this via LP-rounding algorithms that provide {em local} (i.e., per-scenario) approximation guarantees, we obtain the {em first} approximation algorithms for the distributionally robus

قيم البحث

596 - Shipra Agrawal , Yichuan Ding , Amin Saberi 2009

We consider a robust model proposed by Scarf, 1958, for stochastic optimization when only the marginal probabilities of (binary) random variables are given, and the correlation between the random variables is unknown. In the robust model, the objecti ve is to minimize expected cost against worst possible joint distribution with those marginals. We introduce the concept of correlation gap to compare this model to the stochastic optimization model that ignores correlations and minimizes expected cost under independent Bernoulli distribution. We identify a class of functions, using concepts of summable cost sharing schemes from game theory, for which the correlation gap is well-bounded and the robust model can be approximated closely by the independent distribution model. As a result, we derive efficient approximation factors for many popular cost functions, like submodular functions, facility location, and Steiner tree. As a byproduct, our analysis also yields some new results in the areas of social welfare maximization and existence of Walrasian equilibria, which may be of independent interest.

بنى وهياكل البيانات والخوارزميات

Approximation Algorithms for Minimum Norm and Ordered Optimization Problems

86 - Deeparnab Chakrabarty , Chaitanya Swamy 2018

$ $In many optimization problems, a feasible solution induces a multi-dimensional cost vector. For example, in load-balancing a schedule induces a load vector across the machines. In $k$-clustering, opening $k$ facilities induces an assignment cost v ector across the clients. In this paper we consider the following minimum norm optimization problem : Given an arbitrary monotone, symmetric norm, find a solution which minimizes the norm of the induced cost-vector. This generalizes many fundamental NP-hard problems. We give a general framework to tackle the minimum norm problem, and illustrate its efficacy in the unrelated machine load balancing and $k$-clustering setting. Our concrete results are the following. $bullet$ We give constant factor approximation algorithms for the minimum norm load balancing problem in unrelated machines, and the minimum norm $k$-clustering problem. To our knowledge, our results constitute the first constant-factor approximations for such a general suite of objectives. $bullet$ In load balancing with unrelated machines, we give a $2$-approximation for the problem of finding an assignment minimizing the sum of the largest $ell$ loads, for any $ell$. We give a $(2+varepsilon)$-approximation for the so-called ordered load-balancing problem. $bullet$ For $k$-clustering, we give a $(5+varepsilon)$-approximation for the ordered $k$-median problem significantly improving the constant factor approximations from Byrka, Sornat, and Spoerhase (STOC 2018) and Chakrabarty and Swamy (ICALP 2018). $bullet$ Our techniques also imply $O(1)$ approximations to the best simultaneous optimization factor for any instance of the unrelated machine load-balancing and the $k$-clustering setting. To our knowledge, these are the first positive simultaneous optimization results in these settings.

بنى وهياكل البيانات والخوارزميات

Kernel Distributionally Robust Optimization

256 - Jia-Jie Zhu , Wittawat Jitkrittum , Moritz Diehl 2020

We propose kernel distributionally robust optimization (Kernel DRO) using insights from the robust optimization theory and functional analysis. Our method uses reproducing kernel Hilbert spaces (RKHS) to construct a wide range of convex ambiguity set s, which can be generalized to sets based on integral probability metrics and finite-order moment bounds. This perspective unifies multiple existing robust and stochastic optimization methods. We prove a theorem that generalizes the classical duality in the mathematical problem of moments. Enabled by this theorem, we reformulate the maximization with respect to measures in DRO into the dual program that searches for RKHS functions. Using universal RKHSs, the theorem applies to a broad class of loss functions, lifting common limitations such as polynomial losses and knowledge of the Lipschitz constant. We then establish a connection between DRO and stochastic optimization with expectation constraints. Finally, we propose practical algorithms based on both batch convex solvers and stochastic functional gradient, which apply to general optimization and machine learning tasks.

التحسين والتحكم التعلم الآلي التعلم الالي

Distributionally Robust Optimization with Moment Ambiguity Sets

205 - Jiawang Nie , Liu Yang , Suhan Zhong 2021

This paper studies distributionally robust optimization (DRO) when the ambiguity set is given by moments for the distributions. The objective and constraints are given by polynomials in decision variables. We reformulate the DRO with equivalent momen t conic constraints. Under some general assumptions, we prove the DRO is equivalent to a linear optimization problem with moment and psd polynomial cones. A moment-SOS relaxation method is proposed to solve it. Its asymptotic and finite convergence are shown under certain assumptions. Numerical examples are presented to show how to solve DRO problems.

التحسين والتحكم

Large-Scale Methods for Distributionally Robust Optimization

177 - Daniel Levy , Yair Carmon , John C. Duchi 2020

We propose and analyze algorithms for distributionally robust optimization of convex losses with conditional value at risk (CVaR) and $chi^2$ divergence uncertainty sets. We prove that our algorithms require a number of gradient evaluations independe nt of training set size and number of parameters, making them suitable for large-scale applications. For $chi^2$ uncertainty sets these are the first such guarantees in the literature, and for CVaR our guarantees scale linearly in the uncertainty level rather than quadratically as in previous work. We also provide lower bounds proving the worst-case optimality of our algorithms for CVaR and a penalized version of the $chi^2$ problem. Our primary technical contributions are novel bounds on the bias of batch robust risk estimation and the variance of a multilevel Monte Carlo gradient estimator due to [Blanchet & Glynn, 2015]. Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.

التحسين والتحكم التعلم الآلي التعلم الالي

سجل دخول لتتمكن من نشر تعليقات