
Deep neural networks algorithms for stochastic control problems on finite horizon: convergence analysis


Added by Nicolas Langrené

Publication date 2018

Research language: English

Authors: Côme Huré

Fields: Probability; Optimization and Control; Machine Learning


This paper develops algorithms for high-dimensional stochastic control problems based on deep learning and dynamic programming. Unlike classical approximate dynamic programming approaches, we first approximate the optimal policy by means of neural networks, in the spirit of deep reinforcement learning, and then approximate the value function by Monte Carlo regression. This is achieved in the dynamic programming recursion by performance iteration or hybrid iteration, together with regress-now methods from numerical probability. We provide a theoretical justification of these algorithms. Consistency and rates of convergence for the control and value function estimates are analyzed and expressed in terms of the universal approximation error of the neural networks and of the statistical error incurred when estimating the network function, leaving aside the optimization error. Numerical results on various applications are presented in a companion paper (arxiv.org/abs/1812.05916) and illustrate the performance of the proposed algorithms.
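
The two-stage backward recursion described in the abstract (fit a policy first, then regress the value function) can be illustrated on a toy problem. The sketch below is not the authors' implementation and rests on several assumptions that do not come from the abstract: a one-dimensional linear-quadratic problem with dynamics X_{n+1} = X_n + a_n + sigma * eps, running cost x^2 + lam * a^2 and terminal cost x^2; a linear feedback policy a = theta * x standing in for the neural network; a quadratic basis [1, x^2] for the regression; and standard normal training states. The function names `hybrid_now_lq` and `riccati_gains` are hypothetical.

```python
import numpy as np

def hybrid_now_lq(n_steps=3, n_samples=20000, lam=1.0, sigma=0.5, seed=0):
    """Backward pass: at each time step fit a policy, then regress the value."""
    rng = np.random.default_rng(seed)
    c0, c2 = 0.0, 1.0  # terminal value g(x) = x^2, stored as c0 + c2 * x^2
    gains, values = [], []
    for _ in range(n_steps):  # iterations correspond to n = N-1, ..., 0
        x = rng.standard_normal(n_samples)    # training states (assumed N(0, 1))
        eps = rng.standard_normal(n_samples)  # noise driving the next state
        # Step 1: policy a = theta * x (stand-in for a neural network),
        # fitted by gradient descent on the empirical one-step cost
        # mean[x^2 + lam * a^2 + V_next(x_next)].
        theta = 0.0
        for _ in range(200):
            x_next = x + theta * x + sigma * eps
            grad = np.mean(2 * lam * theta * x**2 + 2 * c2 * x_next * x)
            theta -= 0.1 * grad
        # Step 2: "regress now": least squares of the realized cost-to-go
        # on functions of the current state only.
        a = theta * x
        x_next = x + a + sigma * eps
        target = x**2 + lam * a**2 + c0 + c2 * x_next**2
        features = np.column_stack([np.ones_like(x), x**2])
        (c0, c2), *_ = np.linalg.lstsq(features, target, rcond=None)
        gains.append(theta)
        values.append((c0, c2))
    return gains, values

def riccati_gains(n_steps=3, lam=1.0):
    """Exact feedback gains of the same toy problem, for comparison."""
    p, gains = 1.0, []
    for _ in range(n_steps):
        gains.append(-p / (lam + p))
        p = 1.0 + lam * p / (lam + p)
    return gains

if __name__ == "__main__":
    estimated, _ = hybrid_now_lq()
    for est, ref in zip(estimated, riccati_gains()):
        print(f"estimated gain {est:+.4f}   exact gain {ref:+.4f}")
```

For lam = 1 the exact gains of this toy problem are -1/2, -3/5 and -8/13, so the printed estimates should sit close to those values, with a gap that shrinks as `n_samples` grows. In this linear-quadratic setting the policy and value classes contain the true solution, so the approximation error is zero and only the statistical error remains. This mirrors, in a very reduced form, the error decomposition mentioned in the abstract.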


Stochastic Primal-Dual Algorithms with Faster Convergence than $O(1/\sqrt{T})$ for Problems without Bilinear Structure

Yan Yan, Yi Xu, Qihang Lin, 2019

Previous studies on stochastic primal-dual algorithms for solving min-max problems with faster convergence rely heavily on the bilinear structure of the problem, which restricts their applicability to a narrow range of problems. The main contribution of this paper is the design and analysis of new stochastic primal-dual algorithms that use a mixture of stochastic gradient updates and a logarithmic number of deterministic dual updates for solving a family of convex-concave problems with no bilinear structure assumed. Convergence rates faster than $O(1/\sqrt{T})$, with $T$ being the number of stochastic gradient updates, are established under mild conditions on the involved functions of the primal and dual variables. For example, for a family of problems that enjoy weak strong convexity in the primal variable and have a strongly concave function of the dual variable, the convergence rate of the proposed algorithm is $O(1/T)$. We also investigate the effectiveness of the proposed algorithms for learning robust models and empirical AUC maximization.

Machine Learning; Optimization and Control

Stability for Receding-horizon Stochastic Model Predictive Control

Joel A. Paulson, Stefan Streif, 2014

A stochastic model predictive control (SMPC) approach is presented for discrete-time linear systems with arbitrary time-invariant probabilistic uncertainties and additive Gaussian process noise. Closed-loop stability of the SMPC approach is established by appropriate selection of the cost function. Polynomial chaos is used for uncertainty propagation through system dynamics. The performance of the SMPC approach is demonstrated using the Van de Vusse reactions.

Systems and Control; Optimization and Control

Optimal stopping time on semi-Markov processes with finite horizon

Fang Chen, Xianping Guo, Zhong-Wei Liao, 2021

In this paper, we consider the optimal stopping problem on semi-Markov processes (SMPs) with finite horizon, and aim to establish the existence and computation of optimal stopping times. To achieve this goal, we first extend the main results on finite-horizon semi-Markov decision processes (SMDPs) to the case with additional terminal costs, introduce an explicit construction of SMDPs, and prove the equivalence between the optimal stopping problems on SMPs and SMDPs. Then, using this equivalence and the results on SMDPs developed here, we not only show the existence of an optimal stopping time for SMPs, but also provide an algorithm for computing it. Moreover, we show that the optimal and $\varepsilon$-optimal stopping times can each be characterized as the hitting time of a special set.

Probability; Optimization and Control

Asymptotic convergence rate of Dropout on shallow linear neural networks

Albert Senen-Cerda, Jaron Sanders, 2020

We analyze the convergence rate of gradient flows on objective functions induced by Dropout and Dropconnect when they are applied to shallow linear Neural Networks (NNs), which can also be viewed as doing matrix factorization using a particular regularizer. Dropout algorithms such as these are thus regularization techniques that use $\{0,1\}$-valued random variables to filter weights during training in order to avoid coadaptation of features. By leveraging a recent result on nonconvex optimization and conducting a careful analysis of the set of minimizers as well as the Hessian of the loss function, we are able to obtain (i) a local convergence proof of the gradient flow and (ii) a bound on the convergence rate that depends on the data, the dropout probability, and the width of the NN. Finally, we compare this theoretical bound to numerical simulations, which are in qualitative agreement with the convergence bound and match it when starting sufficiently close to a minimizer.
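
The remark that Dropout on a shallow linear network amounts to matrix factorization with a particular regularizer can be checked numerically. The sketch below is only an illustration under assumptions that are not taken from the abstract: Bernoulli(p) masks applied to the hidden units (node dropout, not Dropconnect), the common "inverted" 1/p rescaling, a squared loss, and a single data point. Under those assumptions the expected dropout loss equals the plain squared loss plus a penalty weighted by (1 - p) / p. The function names are hypothetical.

```python
import itertools
import numpy as np

def expected_dropout_loss(U, V, x, y, p):
    """Exact expectation of the squared loss over all 2^h hidden-unit masks."""
    h = V.shape[0]
    total = 0.0
    for mask in itertools.product([0, 1], repeat=h):
        b = np.array(mask, dtype=float)
        prob = np.prod(np.where(b == 1, p, 1 - p))  # probability of this mask
        y_hat = U @ (b / p * (V @ x))               # masked, rescaled forward pass
        total += prob * np.sum((y - y_hat) ** 2)
    return total

def regularized_loss(U, V, x, y, p):
    """Plain squared loss plus the data-dependent penalty induced by dropout."""
    fit = np.sum((y - U @ V @ x) ** 2)
    # Column norms of U times squared hidden activations, weighted by (1 - p) / p.
    penalty = (1 - p) / p * np.sum(np.sum(U**2, axis=0) * (V @ x) ** 2)
    return fit + penalty

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    U = rng.standard_normal((2, 3))  # output weights, shape (outputs, hidden)
    V = rng.standard_normal((3, 4))  # input weights, shape (hidden, inputs)
    x = rng.standard_normal(4)
    y = rng.standard_normal(2)
    print(expected_dropout_loss(U, V, x, y, 0.7))
    print(regularized_loss(U, V, x, y, 0.7))
```

The two printed numbers should agree up to floating-point rounding, because the mask enumeration is exact and involves no sampling. The penalty couples the columns of U with the rows of V, which is why the objective is not a plain least-squares factorization problem.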

Machine Learning; Optimization and Control

On the Global Convergence of Majorization Minimization Algorithms for Nonconvex Optimization Problems

Yangyang Kang, Zhihua Zhang, Wu-Jun Li, 2015

In this paper, we study the global convergence of majorization minimization (MM) algorithms for solving nonconvex regularized optimization problems. MM algorithms have received great attention in machine learning. However, when they are applied to nonconvex optimization problems, their convergence is a challenging issue. We introduce the theory of the Kurdyka-Łojasiewicz inequality to address this issue. In particular, we show that many nonconvex problems enjoy the Kurdyka-Łojasiewicz property and establish the global convergence of the corresponding MM procedure. We also extend our result to a well-known method called CCCP (the concave-convex procedure).

Numerical Analysis; Optimization and Control
