Nonsmooth optimization using Taylor-like models: error bounds, convergence, and termination criteria


Added by Dmitriy Drusvyatskiy

Publication date 2016

Language: English

Authors Dmitriy Drusvyatskiy - Alexander D. Ioffe - Adrian S. Lewis

Optimization and Control


Abstract

We consider optimization algorithms that successively minimize simple Taylor-like models of the objective function. Methods of Gauss-Newton type for minimizing the composition of a convex function and a smooth map are common examples. Our main result is an explicit relationship between the step-size of any such algorithm and the slope of the function at a nearby point. Consequently, we (1) show that the step-sizes can be reliably used to terminate the algorithm, (2) prove that as long as the step-sizes tend to zero, every limit point of the iterates is stationary, and (3) show that conditions, akin to classical quadratic growth, imply that the step-sizes linearly bound the distance of the iterates to the solution set. The latter so-called error bound property is typically used to establish linear (or faster) convergence guarantees. Analogous results hold when the step-size is replaced by the square root of the decrease in the model's value. We complete the paper with extensions to the setting where the models are minimized only inexactly.
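As a rough illustration of using the step-size as a termination criterion, the sketch below runs a proximal gradient iteration, a standard method that minimizes a simple model at each step, on a one-dimensional toy problem. The objective f(x) = 0.5(x - 3)^2 + |x|, the parameter t, the tolerance, and all function names are assumptions chosen for this example; they are not taken from the paper.

```python
def soft_threshold(v, tau):
    """Return the minimizer over y of tau*|y| + 0.5*(y - v)**2."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def model_based_descent(x0, t=0.5, tol=1e-8, max_iter=1000):
    """Minimize the toy objective f(x) = 0.5*(x - 3)**2 + |x|.

    Each iteration exactly minimizes the model
        g(x) + g'(x)*(y - x) + |y| + (y - x)**2 / (2*t),
    where g(x) = 0.5*(x - 3)**2 is the smooth part, and the loop stops
    once the step-size |x_next - x| drops to tol or below.
    """
    x = x0
    steps = []
    for _ in range(max_iter):
        grad = x - 3.0                             # derivative of the smooth part
        x_next = soft_threshold(x - t * grad, t)   # exact minimizer of the model
        step = abs(x_next - x)
        steps.append(step)
        x = x_next
        if step <= tol:                            # step-size termination test
            break
    return x, steps


if __name__ == "__main__":
    x_final, steps = model_based_descent(10.0)
    print(x_final, len(steps), steps[-1])
```

In this toy problem the minimizer is x = 2, and from the starting point 10 each step-size is exactly half the distance from the current iterate to the minimizer, so a small step certifies a nearby solution. This is only a hand-picked instance of the kind of linear bound the abstract describes, not a demonstration of the general result.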


On the Convergence Rate of Stochastic Mirror Descent for Nonsmooth Nonconvex Optimization

Siqi Zhang, Niao He, 2018

In this paper, we investigate the non-asymptotic stationary convergence behavior of Stochastic Mirror Descent (SMD) for nonconvex optimization. We focus on a general class of nonconvex nonsmooth stochastic optimization problems, in which the objective can be decomposed into a relatively weakly convex function (possibly non-Lipschitz) and a simple nonsmooth convex regularizer. We prove that SMD, without the use of mini-batches, is guaranteed to converge to a stationary point at a rate of $\mathcal{O}(1/\sqrt{t})$. The efficiency estimate matches existing results for the stochastic subgradient method, but is evaluated under a stronger stationarity measure. Our convergence analysis applies to both the original SMD and its proximal version, as well as the deterministic variants, for solving relatively weakly convex problems.

Optimization and Control

Error bounds for monomial convexification in polynomial optimization

Warren Adams, Akshay Gupte, 2017

Convex hulls of monomials have been widely studied in the literature, and monomial convexifications are implemented in global optimization software for relaxing polynomials. However, there has been no study of the error in the global optimum from such approaches. We give bounds on the worst-case error for convexifying a monomial over subsets of $[0,1]^n$. This implies additive error bounds for relaxing a polynomial optimization problem by convexifying each monomial separately. Our main error bounds depend primarily on the degree of the monomial, making them easy to compute. Since monomial convexification studies depend on the bounds on the associated variables, in the second part, we conduct an error analysis for a multilinear monomial over two different types of box constraints. As part of this analysis, we also derive the convex hull of a multilinear monomial over $[-1,1]^n$.
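To make the notion of convexification error concrete, here is a minimal sketch for the simplest multilinear monomial, x*y over the unit square. It assumes the standard McCormick envelopes for this case, max(0, x + y - 1) from below and min(x, y) from above, and estimates the worst-case gaps on a grid. The function name and grid size are hypothetical, and the sketch does not reproduce the paper's bounds.

```python
def envelope_gaps(n_grid=101):
    """Grid estimate of the largest gaps between x*y and its convex and
    concave envelopes over the unit square [0, 1]^2."""
    worst_below = 0.0   # largest value of x*y - convex envelope
    worst_above = 0.0   # largest value of concave envelope - x*y
    for i in range(n_grid):
        x = i / (n_grid - 1)
        for j in range(n_grid):
            y = j / (n_grid - 1)
            product = x * y
            convex_env = max(0.0, x + y - 1.0)   # McCormick underestimator
            concave_env = min(x, y)              # McCormick overestimator
            worst_below = max(worst_below, product - convex_env)
            worst_above = max(worst_above, concave_env - product)
    return worst_below, worst_above


if __name__ == "__main__":
    print(envelope_gaps())
```

On this grid both gaps are attained at x = y = 0.5 and are approximately 0.25, which gives a feel for the size of the additive error incurred when a single bilinear term is relaxed in this way.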

Optimization and Control

BFGS convergence to nonsmooth minimizers of convex functions

Jiayi Guo, Adrian Lewis, 2017

The popular BFGS quasi-Newton minimization algorithm under reasonable conditions converges globally on smooth convex functions. This result was proved by Powell in 1976: we consider its implications for functions that are not smooth. In particular, an analogous convergence result holds for functions, like the Euclidean norm, that are nonsmooth at the minimizer.

Optimization and Control

Graph Oracle Models, Lower Bounds, and Gaps for Parallel Stochastic Optimization

Blake Woodworth, Jialei Wang, Adam Smith, 2018

We suggest a general oracle-based framework that captures different parallel stochastic optimization settings described by a dependency graph, and derive generic lower bounds in terms of this graph. We then use the framework and derive lower bounds for several specific parallel optimization settings, including delayed updates and parallel processing with intermittent communication. We highlight gaps between lower and upper bounds on the oracle complexity, and cases where the natural algorithms are not known to be optimal.

Optimization and Control, Machine Learning

Stochastic Block Mirror Descent Methods for Nonsmooth and Stochastic Optimization

Cong D. Dang, Guanghui Lan, 2013

In this paper, we present a new stochastic algorithm, namely the stochastic block mirror descent (SBMD) method for solving large-scale nonsmooth and stochastic optimization problems. The basic idea of this algorithm is to incorporate the block-coordinate decomposition and an incremental block averaging scheme into the classic (stochastic) mirror-descent method, in order to significantly reduce the cost per iteration of the latter algorithm. We establish the rate of convergence of the SBMD method along with its associated large-deviation results for solving general nonsmooth and stochastic optimization problems. We also introduce different variants of this method and establish their rate of convergence for solving strongly convex, smooth, and composite optimization problems, as well as certain nonconvex optimization problems. To the best of our knowledge, all these developments related to the SBMD methods are new in the stochastic optimization literature. Moreover, some of our results also seem to be new for block coordinate descent methods for deterministic optimization.

Optimization and Control