Generalization Bounds for Stochastic Saddle Point Problems


Added by Junyu Zhang

Publication date 2020

Research language English

Authors Junyu Zhang - Mingyi Hong - Mengdi Wang

Optimization and Control


This paper studies generalization bounds for the empirical saddle point (ESP) solution to stochastic saddle point (SSP) problems. For SSP with Lipschitz continuous and strongly convex-strongly concave objective functions, we establish an $\mathcal{O}(1/n)$ generalization bound by using a uniform stability argument. We also provide generalization bounds under a variety of assumptions, including the cases without strong convexity and without bounded domains. We illustrate our results in two examples: batch policy learning in Markov decision processes, and mixed strategy Nash equilibrium estimation for stochastic games. In each of these examples, we show that a regularized ESP solution enjoys a near-optimal sample complexity. To the best of our knowledge, this is the first set of results on the generalization theory of ESP.
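As an informal illustration of what an ESP solution is, the sketch below builds a toy strongly convex-strongly concave problem, solves the empirical saddle point from n samples, and measures its duality gap on the population objective. Everything in it is an assumption made for illustration and is not taken from the paper: the quadratic objective f(x, y; a, b) = (MU/2)x^2 + xy - (MU/2)y^2 + ax - by, the Gaussian noise on a and b, the choice of the duality gap as the error measure, and all function names. The toy problem also has an unbounded domain and is not globally Lipschitz, so it is not a direct instance of the main $\mathcal{O}(1/n)$ setting.

```python
import random

MU = 1.0  # assumed strong convexity / strong concavity modulus


def population_value(x, y, a_mean, b_mean):
    """F(x, y) = E[f] for f = MU/2*x^2 + x*y - MU/2*y^2 + a*x - b*y."""
    return 0.5 * MU * x * x + x * y - 0.5 * MU * y * y + a_mean * x - b_mean * y


def saddle_point(a, b):
    """Solve the first-order conditions MU*x + y + a = 0 and x - MU*y - b = 0."""
    y = -(a + MU * b) / (MU * MU + 1.0)
    x = MU * y + b
    return x, y


def duality_gap(x, y, a_mean, b_mean):
    """max over y' of F(x, y') minus min over x' of F(x', y), in closed form."""
    y_best = (x - b_mean) / MU    # maximizer of F(x, .)
    x_best = -(y + a_mean) / MU   # minimizer of F(., y)
    return (population_value(x, y_best, a_mean, b_mean)
            - population_value(x_best, y, a_mean, b_mean))


def average_gap(n, trials=300, a_mean=1.0, b_mean=-0.5, seed=0):
    """Average population duality gap of the ESP solution built from n samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # The empirical objective has the same form with sample means of a and b.
        a_hat = sum(rng.gauss(a_mean, 1.0) for _ in range(n)) / n
        b_hat = sum(rng.gauss(b_mean, 1.0) for _ in range(n)) / n
        x_hat, y_hat = saddle_point(a_hat, b_hat)
        total += duality_gap(x_hat, y_hat, a_mean, b_mean)
    return total / trials


if __name__ == "__main__":
    for n in (10, 40, 160):
        print(n, average_gap(n))
```

For this particular toy problem the gap is quadratic in the sampling error of the means, so the printed averages should shrink roughly in proportion to 1/n as n grows.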


Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems

Yuyuan Ouyang, Yangyang Xu, 2018

Many works have studied the complexity of first-order methods for solving convex-concave bilinear saddle-point problems (SPPs). These results are all upper complexity bounds, which determine at most how much effort guarantees a solution of the desired accuracy. In this paper, we pursue the opposite direction by deriving lower complexity bounds of first-order methods on large-scale SPPs. Our results apply to methods whose iterates are in the linear span of past first-order information, as well as to more general methods that produce their iterates in an arbitrary manner based on first-order information. We first work on affinely constrained smooth convex optimization, which is a special case of SPP. Unlike gradient methods on unconstrained problems, we show that first-order methods on affinely constrained problems generally cannot be accelerated from the known convergence rate $O(1/t)$ to $O(1/t^2)$, and in addition, $O(1/t)$ is optimal for convex problems. Moreover, we prove that for strongly convex problems, $O(1/t^2)$ is the best possible convergence rate, while it is known that gradient methods can have linear convergence on unconstrained problems. Then we extend these results to general SPPs. It turns out that our lower complexity bounds match several established upper complexity bounds in the literature, and thus they are tight and indicate the optimality of several existing first-order methods.

Optimization and Control

A Decentralized Proximal Point-type Method for Saddle Point Problems

Weijie Liu, Aryan Mokhtari, Asuman Ozdaglar, 2019

In this paper, we focus on solving a class of constrained non-convex non-concave saddle point problems in a decentralized manner by a group of nodes in a network. Specifically, we assume that each node has access to a summand of a global objective function and nodes are allowed to exchange information only with their neighboring nodes. We propose a decentralized variant of the proximal point method for solving this problem. We show that when the objective function is $\rho$-weakly convex-weakly concave, the iterates converge to approximate stationarity with a rate of $\mathcal{O}(1/\sqrt{T})$, where the approximation error depends linearly on $\sqrt{\rho}$. We further show that when the objective function satisfies the Minty VI condition (which generalizes the convex-concave case), we obtain convergence to stationarity with a rate of $\mathcal{O}(1/\sqrt{T})$. To the best of our knowledge, our proposed method is the first decentralized algorithm with theoretical guarantees for solving a non-convex non-concave decentralized saddle point problem. Our numerical results for training a generative adversarial network (GAN) in a decentralized manner match our theoretical guarantees.

Optimization and Control; Machine Learning

Convex Synthesis of Accelerated Gradient Algorithms for Optimization and Saddle Point Problems using Lyapunov functions

Dennis Gramlich, Christian Ebenbauer, Carsten W. Scherer, 2020

This paper considers the problem of designing accelerated gradient-based algorithms for optimization and saddle-point problems. The class of objective functions is defined by a generalized sector condition. This class contains strongly convex functions with Lipschitz gradients as well as non-convex functions, which makes it possible to address not only optimization problems but also saddle-point problems. The proposed design procedure relies on a suitable class of Lyapunov functions and on convex semidefinite programming. The proposed synthesis allows the design of algorithms that reach the performance of state-of-the-art accelerated gradient methods and beyond.

Optimization and Control; Systems and Control

The Saddle Point Problem of Polynomials

Jiawang Nie, Zi Yang, Guangming Zhou, 2018

This paper studies the saddle point problem of polynomials. We give an algorithm for computing saddle points. It is based on solving Lasserre's hierarchy of semidefinite relaxations. Under some genericity assumptions on the defining polynomials, we show that: i) if there exists a saddle point, our algorithm can get one by solving a finite number of Lasserre-type semidefinite relaxations; ii) if there is no saddle point, our algorithm can detect its nonexistence.

Optimization and Control

A GenEO Domain Decomposition method for Saddle Point problems

Frederic Nataf (Laboratory J.L. Lions, Sorbonne Universite, CNRS UMR 7598), 2019

We introduce an adaptive element-based domain decomposition (DD) method for solving saddle point problems defined as a two-by-two block matrix. The algorithm does not require any knowledge of the constrained space. We assume that all submatrices are sparse and that the diagonal blocks are spectrally equivalent to a sum of positive semidefinite matrices. The latter assumption enables the design of adaptive coarse spaces for DD methods that extend the GenEO theory to saddle point problems. Numerical results on three-dimensional elasticity problems for steel-rubber structures discretized by a finite element with continuous pressure are shown for up to one billion degrees of freedom.

Distributed, Parallel, and Cluster Computing; Numerical Analysis