We study the convergence to equilibrium of an underdamped Langevin equation that is controlled by a linear feedback force. Specifically, we are interested in sampling the possibly multimodal invariant probability distribution of a Langevin system at small noise (or low temperature), for which the dynamics can easily get trapped inside metastable subsets of the phase space. We follow [Chen et al., J. Math. Phys. 56, 113302, 2015] and consider a Langevin equation that is simulated at a high temperature, with the control playing the role of a friction that balances the additional noise so as to restore the original invariant measure at a lower temperature. We discuss different limits as the temperature ratio goes to infinity and prove convergence to a limit dynamics. It turns out that, depending on whether the lower (target) or the higher (simulation) temperature is fixed, the controlled dynamics converges either to the overdamped Langevin equation or to a deterministic gradient flow. This implies that (a) the ergodic limit and the large temperature separation limit do not commute in general, and that (b) it is not possible to accelerate the speed of convergence to the ergodic limit by making the temperature separation larger and larger. We discuss the implications of these observations from the perspective of stochastic optimisation algorithms and enhanced sampling schemes in molecular dynamics.
Despite the strong theoretical guarantees that variance-reduced finite-sum optimization algorithms enjoy, their applicability remains limited to cases where the memory overhead they introduce (SAG/SAGA) or the periodic full gradient computation they require (SVRG/SARAH) is manageable. A promising approach to achieving variance reduction while avoiding these drawbacks is the use of importance sampling instead of control variates. While many such methods have been proposed in the literature, directly proving that they improve the convergence of the resulting optimization algorithm has remained elusive. In this work, we propose an importance-sampling-based algorithm we call SRG (stochastic reweighted gradient). We analyze the convergence of SRG in the strongly convex case and show that, while it does not recover the linear rate of control variate methods, it provably outperforms SGD. We pay particular attention to the time and memory overhead of our proposed method, and design a specialized red-black tree allowing its efficient implementation. Finally, we present empirical results to support our findings.
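The importance-sampling idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the SRG algorithm itself: the abstract does not specify SRG's weights or its red-black tree, so neither is reproduced here. The least-squares objective, the function names, and the choice of sampling probabilities proportional to row norms are all assumptions made for illustration. The sketch only shows the generic reweighting step: if example i is drawn with probability p_i, dividing its gradient by n * p_i keeps the estimator unbiased for the full gradient.

```python
import numpy as np

def full_gradient(x, A, b):
    """Gradient of F(x) = (1/n) * sum_i 0.5 * (a_i @ x - b_i)**2."""
    return A.T @ (A @ x - b) / A.shape[0]

def reweighted_gradient(x, A, b, p, i):
    """Gradient of example i, reweighted by 1 / (n * p_i)."""
    n = A.shape[0]
    grad_i = (A[i] @ x - b[i]) * A[i]
    return grad_i / (n * p[i])

def sample_gradient(x, A, b, p, rng):
    """Draw one index from the (non-uniform) distribution p and reweight."""
    i = rng.choice(A.shape[0], p=p)
    return reweighted_gradient(x, A, b, p, i)

# Hypothetical toy data; any finite-sum problem would do.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 3))
b = rng.normal(size=50)
x = rng.normal(size=3)

# Assumed sampling distribution: proportional to the row norms of A.
p = np.linalg.norm(A, axis=1)
p = p / p.sum()

# Exact expectation of the estimator under p, computed by enumeration.
expected = sum(p[i] * reweighted_gradient(x, A, b, p, i) for i in range(50))
```

Enumerating the expectation, rather than averaging random draws, makes the unbiasedness check exact up to floating-point error. How the probabilities are chosen and updated is where methods of this kind differ, and that part is left open here.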
Conditional Stochastic Optimization (CSO) covers a variety of applications ranging from meta-learning and causal inference to invariant learning. However, constructing unbiased gradient estimates in CSO is challenging due to the composition structure. As an alternative, we propose a biased stochastic gradient descent (BSGD) algorithm and study the bias-variance tradeoff under different structural assumptions. We establish the sample complexities of BSGD for strongly convex, convex, and weakly convex objectives, under smooth and non-smooth conditions. We also provide matching lower bounds for BSGD on convex CSO objectives. Extensive numerical experiments are conducted to illustrate the performance of BSGD on robust logistic regression, model-agnostic meta-learning (MAML), and instrumental variable regression (IV).
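A small sketch may help make the biased estimator concrete. It assumes the objective has the nested form F(x) = E_xi[ f( E_{eta|xi}[ g_eta(x) ] ) ], which the abstract only refers to as a "composition structure". The toy problem is hypothetical: f = log cosh, g_eta(x) = x - eta, xi ~ N(1, 1) and eta | xi ~ N(xi, 1). The inner expectation is replaced by an average over m conditional samples, and the chain rule is applied to that plug-in value. Because f' = tanh is nonlinear, the resulting gradient estimate is biased in general, with a bias that shrinks as m grows.

```python
import numpy as np

def bsgd_gradient(x, xi, m, rng):
    """Biased gradient estimate from one outer sample xi and m inner samples."""
    eta = rng.normal(loc=xi, scale=1.0, size=m)  # inner samples eta | xi
    u_hat = np.mean(x - eta)                     # plug-in estimate of E[g_eta(x) | xi]
    # Chain rule: f'(u_hat) * d(u_hat)/dx, with f = log cosh and d(u_hat)/dx = 1.
    return np.tanh(u_hat)

def bsgd(x0, steps, step_size, m, seed=0):
    """Run SGD with the biased estimator; return the average of the last half of the iterates."""
    rng = np.random.default_rng(seed)
    x = x0
    tail = []
    for t in range(steps):
        xi = rng.normal(loc=1.0, scale=1.0)      # outer sample
        x = x - step_size * bsgd_gradient(x, xi, m, rng)
        if t >= steps // 2:
            tail.append(x)
    return float(np.mean(tail))

# In this symmetric toy problem the minimiser is x = 1.
x_avg = bsgd(x0=5.0, steps=4000, step_size=0.05, m=8)
```

The inner batch size m is the knob behind the bias-variance tradeoff: a larger m reduces the bias of each step but costs more samples per iteration. The step size and iteration count above are arbitrary illustrative values, not recommendations from the paper.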
We establish a convergence theorem for a certain type of stochastic gradient descent, which leads to a convergent variant of the back-propagation algorithm.
In this paper, we propose several adaptive gradient methods for stochastic optimization. Unlike AdaGrad-type methods, our algorithms are based on Armijo-type line search, and they simultaneously adapt to the unknown Lipschitz constant of the gradient and to the variance of the stochastic approximation of the gradient. We consider accelerated and non-accelerated gradient descent for convex problems and gradient descent for non-convex problems. In the experiments, we demonstrate the superiority of our methods over existing adaptive methods, e.g., AdaGrad and Adam.
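The Armijo-type mechanism for adapting to an unknown Lipschitz constant can be sketched in the deterministic setting. This is a simplified illustration, not the paper's algorithms: the adaptation to the gradient variance and the accelerated variant are not reproduced, and the function name, the halving and doubling schedule, and the quadratic test problem are assumptions. Each iteration optimistically halves the current estimate L, then doubles it until the step x - g / L satisfies the sufficient-decrease condition f(x_new) <= f(x) - ||g||^2 / (2 L).

```python
import numpy as np

def adaptive_gradient_descent(f, grad, x0, L0=1.0, iters=200):
    """Gradient descent with a backtracking estimate L of the Lipschitz constant."""
    x = np.asarray(x0, dtype=float)
    L = L0
    for _ in range(iters):
        g = grad(x)
        g_sq = float(g @ g)
        if g_sq == 0.0:
            break                      # stationary point reached
        L = max(L / 2.0, 1e-12)        # optimistic guess: try a larger step first
        while True:
            x_new = x - g / L
            # Armijo-type sufficient-decrease test for step size 1 / L.
            if f(x_new) <= f(x) - g_sq / (2.0 * L):
                break
            L *= 2.0                   # step too long: increase the estimate
        x = x_new
    return x, L

# Hypothetical test problem: a quadratic whose gradient is 10-Lipschitz.
Q = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x
x_final, L_final = adaptive_gradient_descent(f, grad, x0=[3.0, -2.0])
```

The inner loop always stops, because the test holds whenever L is at least the true Lipschitz constant. The accepted estimate therefore never exceeds twice that constant, without the constant being supplied in advance.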
In this paper, we study a Markovian two-dimensional bounded-variation stochastic control problem whose state process consists of a diffusive mean-reverting component and a purely controlled one. The problem's main characteristic lies in the interaction of the two components of the state process: the mean-reversion level of the diffusive component is an affine function of the current value of the purely controlled one. By relying on a combination of techniques from viscosity theory and free-boundary analysis, we provide the structure of the value function and show that it satisfies a second-order smooth-fit principle. This regularity is then exploited to determine a system of functional equations solved by the two monotone continuous curves (free boundaries) that split the control problem's state space into three connected regions. Further properties of the free boundaries are also obtained.