No Arabic abstract
We propose a randomized algorithm with quadratic convergence rate for convex optimization problems with a self-concordant, composite, strongly convex objective function. Our method is based on performing an approximate Newton step using a random projection of the Hessian. Our first contribution is to show that, at each iteration, the embedding dimension (or sketch size) can be as small as the effective dimension of the Hessian matrix. Leveraging this novel fundamental result, we design an algorithm with a sketch size proportional to the effective dimension and which exhibits a quadratic rate of convergence. This result dramatically improves on the classical linear-quadratic convergence rates of state-of-the-art sub-sampled Newton methods. However, in most practical cases, the effective dimension is not known beforehand, and this raises the question of how to pick a sketch size as small as the effective dimension while preserving a quadratic convergence rate. Our second and main contribution is thus to propose an adaptive sketch size algorithm with quadratic convergence rate and which does not require prior knowledge or estimation of the effective dimension: at each iteration, it starts with a small sketch size, and increases it until quadratic progress is achieved. Importantly, we show that the embedding dimension remains proportional to the effective dimension throughout the entire path and that our method achieves state-of-the-art computational complexity for solving convex optimization programs with a strongly convex component.
This paper studies an infinite horizon optimal control problem for discrete-time linear systems and quadratic criteria, both with random parameters which are independent and identically distributed with respect to time. A classical approach is to solve an algebraic Riccati equation that involves mathematical expectations and requires certain statistical information of the parameters. In this paper, we propose an online iterative algorithm in the spirit of Q-learning for the situation where only one random sample of parameters emerges at each time step. The first theorem proves the equivalence of three properties: the convergence of the learning sequence, the well-posedness of the control problem, and the solvability of the algebraic Riccati equation. The second theorem shows that the adaptive feedback control in terms of the learning sequence stabilizes the system as long as the control problem is well-posed. Numerical examples are presented to illustrate our results.
Communication compression techniques are of growing interests for solving the decentralized optimization problem under limited communication, where the global objective is to minimize the average of local cost functions over a multi-agent network using only local computation and peer-to-peer communication. In this paper, we first propose a novel compressed gradient tracking algorithm (C-GT) that combines gradient tracking technique with communication compression. In particular, C-GT is compatible with a general class of compression operators that unifies both unbiased and biased compressors. We show that C-GT inherits the advantages of gradient tracking-based algorithms and achieves linear convergence rate for strongly convex and smooth objective functions. In the second part of this paper, we propose an error feedback based compressed gradient tracking algorithm (EF-C-GT) to further improve the algorithm efficiency for biased compression operators. Numerical examples complement the theoretical findings and demonstrate the efficiency and flexibility of the proposed algorithms.
Information compression is essential to reduce communication cost in distributed optimization over peer-to-peer networks. This paper proposes a communication-efficient linearly convergent distributed (COLD) algorithm to solve strongly convex optimization problems. By compressing innovation vectors, which are the differences between decision vectors and their estimates, COLD is able to achieve linear convergence for a class of $delta$-contracted compressors. We explicitly quantify how the compression affects the convergence rate and show that COLD matches the same rate of its uncompressed version. To accommodate a wider class of compressors that includes the binary quantizer, we further design a novel dynamical scaling mechanism and obtain the linearly convergent Dyna-COLD. Importantly, our results strictly improve existing results for the quantized consensus problem. Numerical experiments demonstrate the advantages of both algorithms under different compressors.
It has been widely recognized that the 0/1 loss function is one of the most natural choices for modelling classification errors, and it has a wide range of applications including support vector machines and 1-bit compressed sensing. Due to the combinatorial nature of the 0/1 loss function, methods based on convex relaxations or smoothing approximations have dominated the existing research and are often able to provide approximate solutions of good quality. However, those methods are not optimizing the 0/1 loss function directly and hence no optimality has been established for the original problem. This paper aims to study the optimality conditions of the 0/1 function minimization, and for the first time to develop Newtons method that directly optimizes the 0/1 function with a local quadratic convergence under reasonable conditions. Extensive numerical experiments demonstrate its superior performance as one would expect from Newton-type methods.ions. Extensive numerical experiments demonstrate its superior performance as one would expect from Newton-type methods.
This paper studies asymptotic solvability of a linear quadratic (LQ) mean field social optimization problem with controlled diffusions and indefinite state and control weights. Starting with an $N$-agent model, we employ a rescaling approach to derive a low-dimensional Riccati ordinary differential equation (ODE) system, which characterizes a necessary and sufficient condition for asymptotic solvability. The decentralized control obtained from the mean field limit ensures a bounded optimality loss in minimizing the social cost having magnitude $O(N)$, which implies an optimality loss of $O(1/N)$ per agent. We further quantify the efficiency gain of the social optimum with respect to the solution of the mean field game.