ترغب بنشر مسار تعليمي؟ اضغط هنا

The Saddle Point Problem of Polynomials

122   0   0.0 ( 0 )
 نشر من قبل Jiawang Nie
 تاريخ النشر 2018
  مجال البحث
والبحث باللغة English




اسأل ChatGPT حول البحث

This paper studies the saddle point problem of polynomials. We give an algorithm for computing saddle points. It is based on solving Lasserres hierarchy of semidefinite relaxations. Under some genericity assumptions on defining polynomials, we show that: i) if there exists a saddle point, our algorithm can get one by solving a finite number of Lasserre type semidefinite relaxations; ii) if there is no saddle point, our algorithm can detect its nonexistence.



قيم البحث

اقرأ أيضاً

This paper studies the generalization bounds for the empirical saddle point (ESP) solution to stochastic saddle point (SSP) problems. For SSP with Lipschitz continuous and strongly convex-strongly concave objective functions, we establish an $mathcal {O}(1/n)$ generalization bound by using a uniform stability argument. We also provide generalization bounds under a variety of assumptions, including the cases without strong convexity and without bounded domains. We illustrate our results in two examples: batch policy learning in Markov decision process, and mixed strategy Nash equilibrium estimation for stochastic games. In each of these examples, we show that a regularized ESP solution enjoys a near-optimal sample complexity. To the best of our knowledge, this is the first set of results on the generalization theory of ESP.
In this paper, we focus on solving a class of constrained non-convex non-concave saddle point problems in a decentralized manner by a group of nodes in a network. Specifically, we assume that each node has access to a summand of a global objective fu nction and nodes are allowed to exchange information only with their neighboring nodes. We propose a decentralized variant of the proximal point method for solving this problem. We show that when the objective function is $rho$-weakly convex-weakly concave the iterates converge to approximate stationarity with a rate of $mathcal{O}(1/sqrt{T})$ where the approximation error depends linearly on $sqrt{rho}$. We further show that when the objective function satisfies the Minty VI condition (which generalizes the convex-concave case) we obtain convergence to stationarity with a rate of $mathcal{O}(1/sqrt{T})$. To the best of our knowledge, our proposed method is the first decentralized algorithm with theoretical guarantees for solving a non-convex non-concave decentralized saddle point problem. Our numerical results for training a general adversarial network (GAN) in a decentralized manner match our theoretical guarantees.
On solving a convex-concave bilinear saddle-point problem (SPP), there have been many works studying the complexity results of first-order methods. These results are all about upper complexity bounds, which can determine at most how many efforts woul d guarantee a solution of desired accuracy. In this paper, we pursue the opposite direction by deriving lower complexity bounds of first-order methods on large-scale SPPs. Our results apply to the methods whose iterates are in the linear span of past first-order information, as well as more general methods that produce their iterates in an arbitrary manner based on first-order information. We first work on the affinely constrained smooth convex optimization that is a special case of SPP. Different from gradient method on unconstrained problems, we show that first-order methods on affinely constrained problems generally cannot be accelerated from the known convergence rate $O(1/t)$ to $O(1/t^2)$, and in addition, $O(1/t)$ is optimal for convex problems. Moreover, we prove that for strongly convex problems, $O(1/t^2)$ is the best possible convergence rate, while it is known that gradient methods can have linear convergence on unconstrained problems. Then we extend these results to general SPPs. It turns out that our lower complexity bounds match with several established upper complexity bounds in the literature, and thus they are tight and indicate the optimality of several existing first-order methods.
Nonsmooth optimization problems arising in practice tend to exhibit beneficial smooth substructure: their domains stratify into active manifolds of smooth variation, which common proximal algorithms identify in finite time. Identification then entail s a transition to smooth dynamics, and accommodates second-order acceleration techniques. While identification is clearly useful algorithmically, empirical evidence suggests that even those algorithms that do not identify the active manifold in finite time -- notably the subgradient method -- are nonetheless affected by it. This work seeks to explain this phenomenon, asking: how do active manifolds impact the subgradient method in nonsmooth optimization? In this work, we answer this question by introducing two algorithmically useful properties -- aiming and subgradient approximation -- that fully expose the smooth substructure of the problem. We show that these properties imply that the shadow of the (stochastic) subgradient method along the active manifold is precisely an inexact Riemannian gradient method with an implicit retraction. We prove that these properties hold for a wide class of problems, including cone reducible/decomposable functions and generic semialgebraic problems. Moreover, we develop a thorough calculus, proving such properties are preserved under smooth deformations and spectral lifts. This viewpoint then leads to several algorithmic consequences that parallel results in smooth optimization, despite the nonsmoothness of the problem: local rates of convergence, asymptotic normality, and saddle point avoidance. The asymptotic normality results appear to be new even in the most classical setting of stochastic nonlinear programming. The results culminate in the following observation: the perturbed subgradient method on generic, Clarke regular semialgebraic problems, converges only to local minimizers.
This paper considers the problem of designing accelerated gradient-based algorithms for optimization and saddle-point problems. The class of objective functions is defined by a generalized sector condition. This class of functions contains strongly c onvex functions with Lipschitz gradients but also non-convex functions, which allows not only to address optimization problems but also saddle-point problems. The proposed design procedure relies on a suitable class of Lyapunov functions and on convex semi-definite programming. The proposed synthesis allows the design of algorithms that reach the performance of state-of-the-art accelerated gradient methods and beyond.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا