A Fast Gradient and Function Sampling Method for Finite Max-Functions


Added by Elias Salomão Helou Neto

Publication date 2017

Research language: English

Authors: Elias S. Helou, Sandra A. Santos, Lucas E. A. Simões

Optimization and Control


Abstract

This paper tackles the unconstrained minimization of a class of nonsmooth and nonconvex functions that can be written as finite max-functions. A gradient and function-based sampling method is proposed which, under special circumstances, either moves superlinearly to a minimizer of the problem of interest or superlinearly improves the optimality certificate. Global and local convergence analyses are presented, as well as illustrative examples that corroborate and elucidate the theoretical results obtained.
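To make the term "finite max-function" concrete, here is a minimal Python sketch. It is not the sampling method proposed in the paper; the two quadratic pieces, the helper names `max_function` and `active_set`, and the tolerance are assumptions chosen purely for illustration. It evaluates f(x) = max_i f_i(x) and reports which pieces attain the maximum. Where more than one piece is active, f is typically nonsmooth, which is the situation sampling-based methods are designed to handle.

```python
def max_function(pieces, x):
    """Return f(x) = max_i f_i(x) together with the value of every piece."""
    values = [f(x) for f in pieces]
    return max(values), values


def active_set(values, tol=1e-8):
    """Indices of the pieces whose value is within tol of the maximum."""
    top = max(values)
    return [i for i, v in enumerate(values) if v >= top - tol]


# Hypothetical example: two smooth quadratic pieces in the plane.
pieces = [
    lambda x: x[0] ** 2 + x[1] ** 2,
    lambda x: (x[0] - 2.0) ** 2 + x[1] ** 2,
]

# At (1, 0) both pieces coincide, so the max-function has a kink there.
f_kink, vals_kink = max_function(pieces, (1.0, 0.0))

# At (0, 0) only the second piece is active, so f is locally smooth.
f_smooth, vals_smooth = max_function(pieces, (0.0, 0.0))
```

In this toy instance the kink at (1, 0) is also the minimizer of the max-function, so the minimizer sits exactly where plain gradient information from a single piece is least informative.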


A fast randomized incremental gradient method for decentralized non-convex optimization

Ran Xin, Usman A. Khan, Soummya Kar (2020)

We study decentralized non-convex finite-sum minimization problems described over a network of nodes, where each node possesses a local batch of data samples. In this context, we analyze a single-timescale randomized incremental gradient method, called GT-SAGA. GT-SAGA is computationally efficient as it evaluates one component gradient per node per iteration and achieves provably fast and robust performance by leveraging node-level variance reduction and network-level gradient tracking. For general smooth non-convex problems, we show the almost sure and mean-squared convergence of GT-SAGA to a first-order stationary point and further describe regimes of practical significance where it outperforms the existing approaches and achieves a network topology-independent iteration complexity respectively. When the global function satisfies the Polyak-Łojasiewicz condition, we show that GT-SAGA exhibits linear convergence to an optimal solution in expectation and describe regimes of practical interest where the performance is network topology-independent and improves upon the existing methods. Numerical experiments are included to highlight the main convergence aspects of GT-SAGA in non-convex settings.
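For readers unfamiliar with the Polyak-Łojasiewicz condition, it is commonly stated as follows. The symbols $F$ (the global objective), $F^*$ (its minimum value) and $\mu$ (a positive constant) are notation assumed here, not taken from the abstract:

```latex
\frac{1}{2}\,\|\nabla F(x)\|^2 \;\ge\; \mu \,\bigl(F(x) - F^*\bigr)
\quad \text{for all } x, \qquad \mu > 0 .
```

The condition does not require convexity, but it forces every stationary point to be a global minimizer, which is why it is a natural setting for linear convergence results.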

Optimization and Control Machine Learning Systems and Control

Solving Non-Convex Non-Differentiable Min-Max Games using Proximal Gradient Method

Babak Barazandeh, Meisam Razaviyayn (2020)

Min-max saddle point games appear in a wide range of applications in machine learning and signal processing. Despite their wide applicability, theoretical studies are mostly limited to the special convex-concave structure. While some recent works generalized these results to special smooth non-convex cases, our understanding of non-smooth scenarios is still limited. In this work, we study a special form of non-smooth min-max games in which the objective function is (strongly) convex with respect to one of the players' decision variables. We show that a simple multi-step proximal gradient descent-ascent algorithm converges to an $\epsilon$-first-order Nash equilibrium of the min-max game with the number of gradient evaluations being polynomial in $1/\epsilon$. We will also show that our notion of stationarity is stronger than existing ones in the literature. Finally, we evaluate the performance of the proposed algorithm through an adversarial attack on a LASSO estimator.

Optimization and Control Computer Science and Game Theory Machine Learning

A Stochastic Gradient Method with an Exponential Convergence Rate for Finite Training Sets

Nicolas Le Roux, Mark Schmidt (INRIA Paris - Rocquencourt) (2012)

We propose a new stochastic gradient method for optimizing the sum of a finite set of smooth functions, where the sum is strongly convex. While standard stochastic gradient methods converge at sublinear rates for this problem, the proposed method incorporates a memory of previous gradient values in order to achieve a linear convergence rate. In a machine learning context, numerical experiments indicate that the new algorithm can dramatically outperform standard algorithms, both in terms of optimizing the training error and reducing the test error quickly.
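The "memory of previous gradient values" can be illustrated with a small Python sketch in the spirit of stochastic average gradient methods. This is a simplified illustration, not the authors' implementation: the one-dimensional least-squares problem, the function name `sag_least_squares`, the fixed seed, and the conservative step size 1/(16 L) are all assumptions made for this example.

```python
import random


def sag_least_squares(a, b, steps=2000, seed=0):
    """Minimize (1/n) * sum_i 0.5 * (a[i] * w - b[i])**2 over a scalar w.

    Each iteration recomputes the gradient of ONE randomly chosen term,
    stores it, and steps along the average of all stored gradients.
    """
    n = len(a)
    lipschitz = max(ai * ai for ai in a)  # largest curvature among the terms
    step = 1.0 / (16.0 * lipschitz)       # assumed conservative step size
    rng = random.Random(seed)

    w = 0.0
    memory = [0.0] * n   # last gradient seen for each term
    grad_sum = 0.0       # running sum of the stored gradients

    for _ in range(steps):
        i = rng.randrange(n)
        new_grad = a[i] * (a[i] * w - b[i])
        grad_sum += new_grad - memory[i]  # swap the stale gradient for the fresh one
        memory[i] = new_grad
        w -= step * grad_sum / n
    return w


# Hypothetical data generated by b = 2 * a, so the minimizer is w = 2.
w_hat = sag_least_squares([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Keeping a running sum means each iteration costs one component gradient plus constant extra work, while the search direction still uses information from every term.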

Optimization and Control Machine Learning

A Gradient Method for Multilevel Optimization

Ryo Sato, Mirai Tanaka, Akiko Takeda (2021)

Although application examples of multilevel optimization have already been discussed since the 90s, the development of solution methods was almost limited to bilevel cases due to the difficulty of the problem. In recent years, in machine learning, Franceschi et al. have proposed a method for solving bilevel optimization problems by replacing their lower-level problems with the $T$ steepest descent update equations with some prechosen iteration number $T$. In this paper, we have developed a gradient-based algorithm for multilevel optimization with $n$ levels based on their idea and proved that our reformulation with $n T$ variables asymptotically converges to the original multilevel problem. As far as we know, this is one of the first algorithms with some theoretical guarantee for multilevel optimization. Numerical experiments show that a trilevel hyperparameter learning model considering data poisoning produces more stable prediction results than an existing bilevel hyperparameter learning model in noisy data settings.

Optimization and Control Machine Learning

The ridge method for tame min-max problems

Edouard Pauwels (2021)

We study the ridge method for min-max problems, and investigate its convergence without any convexity, differentiability or qualification assumption. The central issue is to determine whether the parametric optimality formula provides a conservative field, a notion of generalized derivative well suited for optimization. The answer to this question is positive in a semi-algebraic, and more generally definable, context. The proof involves a new characterization of definable conservative fields which is of independent interest. As a consequence, the ridge method applied to definable objectives is proved to have a minimizing behavior and to converge to a set of equilibria which satisfy an optimality condition. Definability is key to our proof: we show that for a more general class of nonsmooth functions, conservativity of the parametric optimality formula may fail, resulting in an absurd behavior of the ridge method.

Optimization and Control