Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Accelerating Stochastic Composition Optimization

92 0 0.0 ( 0 )

Download Cite

Added by Ji Liu

Publication date 2016

fields Mathematical Statistics

and research's language is English

Authors Mengdi Wang - Ji Liu - Ethan X. Fang

Optimization and Control Machine Learning

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Consider the stochastic composition optimization problem where the objective is a composition of two expected-value functions. We propose a new stochastic first-order method, namely the accelerated stochastic compositional proximal gradient (ASC-PG) method, which updates based on queries to the sampling oracle using two different timescales. The ASC-PG is the first proximal gradient method for the stochastic composition problem that can deal with nonsmooth regularization penalty. We show that the ASC-PG exhibits faster convergence than the best known algorithms, and that it achieves the optimal sample-error complexity in several important special cases. We further demonstrate the application of ASC-PG to reinforcement learning and conduct numerical experiments.

rate research

Multi-Level Stochastic Gradient Methods for Nested Composition Optimization

100 - Shuoguang Yang , Mengdi Wang , Ethan X. Fang 2018

Stochastic gradient methods are scalable for solving large-scale optimization problems that involve empirical expectations of loss functions. Existing results mainly apply to optimization problems where the objectives are one- or two-level expectations. In this paper, we consider the multi-level compositional optimization problem that involves compositions of multi-level component functions and nested expectations over a random path. It finds applications in risk-averse optimization and sequential planning. We propose a class of multi-level stochastic gradient methods that are motivated from the method of multi-timescale stochastic approximation. First we propose a basic $T$-level stochastic compositional gradient algorithm, establish its almost sure convergence and obtain an $n$-iteration error bound $O (n^{-1/2^T})$. Then we develop accelerated multi-level stochastic gradient methods by using an extrapolation-interpolation scheme to take advantage of the smoothness of individual component functions. When all component functions are smooth, we show that the convergence rate improves to $O(n^{-4/(7+T)})$ for general objectives and $O (n^{-4/(3+T)})$ for strongly convex objectives. We also provide almost sure convergence and rate of convergence results for nonconvex problems. The proposed methods and theoretical results are validated using numerical experiments.

Optimization and Control

Stochastic Variance Reduced Primal Dual Algorithms for Empirical Composition Optimization

94 - Adithya M. Devraj , Jianshu Chen 2019

We consider a generic empirical composition optimization problem, where there are empirical averages present both outside and inside nonlinear loss functions. Such a problem is of interest in various machine learning applications, and cannot be directly solved by standard methods such as stochastic gradient descent. We take a novel approach to solving this problem by reformulating the original minimization objective into an equivalent min-max objective, which brings out all the empirical averages that are originally inside the nonlinear loss functions. We exploit the rich structures of the reformulated problem and develop a stochastic primal-dual algorithm, SVRPDA-I, to solve the problem efficiently. We carry out extensive theoretical analysis of the proposed algorithm, obtaining the convergence rate, the computation complexity and the storage complexity. In particular, the algorithm is shown to converge at a linear rate when the problem is strongly convex. Moreover, we also develop an approximate version of the algorithm, named SVRPDA-II, which further reduces the memory requirement. Finally, we evaluate our proposed algorithms on several real-world benchmarks, and experimental results show that the proposed algorithms significantly outperform existing techniques.

Optimization and Control Machine Learning

Retrospective Approximation for Smooth Stochastic Optimization

297 - David Newton , Raghu Bollapragada , Raghu Pasupathy 2021

We consider stochastic optimization problems where a smooth (and potentially nonconvex) objective is to be minimized using a stochastic first-order oracle. These type of problems arise in many settings from simulation optimization to deep learning. We present Retrospective Approximation (RA) as a universal sequential sample-average approximation (SAA) paradigm where during each iteration $k$, a sample-path approximation problem is implicitly generated using an adapted sample size $M_k$, and solved (with prior solutions as warm start) to an adapted error tolerance $epsilon_k$, using a deterministic method such as the line search quasi-Newton method. The principal advantage of RA is that decouples optimization from stochastic approximation, allowing the direct adoption of existing deterministic algorithms without modification, thus mitigating the need to redesign algorithms for the stochastic context. A second advantage is the obvious manner in which RA lends itself to parallelization. We identify conditions on ${M_k, k geq 1}$ and ${epsilon_k, kgeq 1}$ that ensure almost sure convergence and convergence in $L_1$-norm, along with optimal iteration and work complexity rates. We illustrate the performance of RA with line-search quasi-Newton on an ill-conditioned least squares problem, as well as an image classification problem using a deep convolutional neural net.

Optimization and Control Machine Learning

A stochastic linearized proximal method of multipliers for convex stochastic optimization with expectation constraints

235 - Liwei Zhang , Yule Zhang , Jia Wu 2021

This paper considers the problem of minimizing a convex expectation function with a set of inequality convex expectation constraints. We present a computable stochastic approximation type algorithm, namely the stochastic linearized proximal method of multipliers, to solve this convex stochastic optimization problem. This algorithm can be roughly viewed as a hybrid of stochastic approximation and the traditional proximal method of multipliers. Under mild conditions, we show that this algorithm exhibits $O(K^{-1/2})$ expected convergence rates for both objective reduction and constraint violation if parameters in the algorithm are properly chosen, where $K$ denotes the number of iterations. Moreover, we show that, with high probability, the algorithm has $O(log(K)K^{-1/2})$ constraint violation bound and $O(log^{3/2}(K)K^{-1/2})$ objective bound. Some preliminary numerical results demonstrate the performance of the proposed algorithm.

Optimization and Control Machine Learning

Algorithms for stochastic optimization with functional or expectation constraints

85 - Guanghui Lan , Zhiqiang Zhou 2016

This paper considers the problem of minimizing an expectation function over a closed convex set, coupled with a {color{black} functional or expectation} constraint on either decision variables or problem parameters. We first present a new stochastic approximation (SA) type algorithm, namely the cooperative SA (CSA), to handle problems with the constraint on devision variables. We show that this algorithm exhibits the optimal ${cal O}(1/epsilon^2)$ rate of convergence, in terms of both optimality gap and constraint violation, when the objective and constraint functions are generally convex, where $epsilon$ denotes the optimality gap and infeasibility. Moreover, we show that this rate of convergence can be improved to ${cal O}(1/epsilon)$ if the objective and constraint functions are strongly convex. We then present a variant of CSA, namely the cooperative stochastic parameter approximation (CSPA) algorithm, to deal with the situation when the constraint is defined over problem parameters and show that it exhibits similar optimal rate of convergence to CSA. It is worth noting that CSA and CSPA are primal methods which do not require the iterations on the dual space and/or the estimation on the size of the dual variables. To the best of our knowledge, this is the first time that such optimal SA methods for solving functional or expectation constrained stochastic optimization are presented in the literature.

Optimization and Control Machine Learning

comments

Fetching comments

Higher Institute for Demographic Studies and Researches

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Accelerating Stochastic Composition Optimization

Ask ChatGPT about the research

No Arabic abstract

Read More