We analyze the DQN reinforcement learning algorithm as a stochastic approximation scheme using the ordinary differential equation (o.d.e.) approach and point out certain theoretical issues. We then propose a modified scheme, Full Gradient DQN (FG-DQN for short), that has a sound theoretical basis, and compare it with the original scheme on sample problems. We observe better performance for FG-DQN.
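The abstract does not spell out the modification, so the following is only a minimal sketch of one common reading of "full gradient": differentiating the squared Bellman error through the bootstrapped target as well as through the current estimate, whereas a DQN-style update treats the target as a constant. The linear parameterization, the function names, and the value of `gamma` are all assumptions made for illustration and are not taken from the paper.

```python
# Illustrative sketch only. Assumes a linear Q(s, a) = theta[a] . phi(s);
# every name here is hypothetical and not the authors' implementation.

def q_value(theta, phi, a):
    """Linear action value for feature vector phi and action a."""
    return sum(t * p for t, p in zip(theta[a], phi))

def greedy_action(theta, phi):
    """Action with the largest estimated value at features phi."""
    return max(range(len(theta)), key=lambda a: q_value(theta, phi, a))

def bellman_error_grads(theta, phi, a, reward, phi_next, gamma=0.9):
    """Return (semi_gradient, full_gradient) of 0.5 * delta**2 w.r.t. theta,
    where delta = reward + gamma * max_b Q(s', b) - Q(s, a)."""
    a_next = greedy_action(theta, phi_next)
    target = reward + gamma * q_value(theta, phi_next, a_next)
    delta = target - q_value(theta, phi, a)
    n_actions, dim = len(theta), len(phi)
    semi = [[0.0] * dim for _ in range(n_actions)]
    full = [[0.0] * dim for _ in range(n_actions)]
    for i in range(dim):
        # Semi-gradient (DQN-style): the target is treated as a constant.
        semi[a][i] -= delta * phi[i]
        full[a][i] -= delta * phi[i]
        # Full gradient: the target also depends on theta via the greedy action.
        full[a_next][i] += delta * gamma * phi_next[i]
    return semi, full
```

The two gradients differ only in the extra term that flows through the target. A stochastic gradient step would subtract a small multiple of either one from `theta`.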
We propose a novel hybrid stochastic policy gradient estimator by combining an unbiased policy gradient estimator, the REINFORCE estimator, with another biased one, an adapted SARAH estimator for policy optimization. The hybrid policy gradient estima
We study reinforcement learning (RL) with linear function approximation under the adaptivity constraint. We consider two popular limited adaptivity models: batch learning model and rare policy switch model, and propose two efficient online RL algorit
One of the mysteries in the success of neural networks is that randomly initialized first-order methods like gradient descent can achieve zero training loss even though the objective function is non-convex and non-smooth. This paper demystifies this surpr
We present theoretical results on the convergence of non-convex accelerated gradient descent in matrix factorization models with $\ell_2$-norm loss. The purpose of this work is to study the effects of acceleration in non-convex settings, where p
Artificial neural networks (ANNs) are typically highly nonlinear systems which are finely tuned via the optimization of their associated, non-convex loss functions. Typically, the gradient of any such loss function fails to be dissipative making the