No Arabic abstract
We study novel robust zero-order algorithms with acceleration for the solution of real-time optimization problems. In particular, we propose a family of extremum seeking dynamics that can be universally modeled as singularly perturbed hybrid dynamical systems with restarting mechanisms. From this family of dynamics, we synthesize four fast algorithms for the solution of convex, strongly convex, constrained, and unconstrained optimization problems. In each case, we establish robust semi-global practical asymptotic or exponential stability results, and we show how to obtain well-posed discretized algorithms that retain the main properties of the original dynamics. Given that existing averaging theorems for singularly perturbed hybrid systems are not directly applicable to our setting, we derive a new averaging theorem that relaxes some of the assumptions made in the literature, allowing us to make a clear link between the KL bounds that characterize the rates of convergence of the hybrid dynamics and their average dynamics. We also show that our results are applicable to non-hybrid algorithms, thus providing a general framework for accelerated dynamics based on averaging theory. We present different numerical examples to illustrate our results.
A collection of optimization problems central to power system operation requires distributed solution architectures to avoid the need for aggregation of all information at a central location. In this paper, we study distributed dual subgradient methods to solve three such optimization problems. Namely, these are tie-line scheduling in multi-area power systems, coordination of distributed energy resources in radial distribution networks, and joint dispatch of transmission and distribution assets. With suitable relaxations or approximations of the power flow equations, all three problems can be reduced to a multi-agent constrained convex optimization problem. We utilize a constant step-size dual subgradient method with averaging on these problems. For this algorithm, we provide a convergence guarantee that is shown to be order-optimal. We illustrate its application on the grid optimization problems.
In this paper, we consider a stochastic distributed nonconvex optimization problem with the cost function being distributed over $n$ agents having access only to zeroth-order (ZO) information of the cost. This problem has various machine learning applications. As a solution, we propose two distributed ZO algorithms, in which at each iteration each agent samples the local stochastic ZO oracle at two points with an adaptive smoothing parameter. We show that the proposed algorithms achieve the linear speedup convergence rate $mathcal{O}(sqrt{p/(nT)})$ for smooth cost functions and $mathcal{O}(p/(nT))$ convergence rate when the global cost function additionally satisfies the Polyak--Lojasiewicz (P--L) condition, where $p$ and $T$ are the dimension of the decision variable and the total number of iterations, respectively. To the best of our knowledge, this is the first linear speedup result for distributed ZO algorithms, which enables systematic processing performance improvements by adding more agents. We also show that the proposed algorithms converge linearly when considering deterministic centralized optimization problems under the P--L condition. We demonstrate through numerical experiments the efficiency of our algorithms on generating adversarial examples from deep neural networks in comparison with baseline and recently proposed centralized and distributed ZO algorithms.
We propose a novel second-order ODE as the continuous-time limit of a Riemannian accelerated gradient-based method on a manifold with curvature bounded from below. This ODE can be seen as a generalization of the ODE derived for Euclidean spaces, and can also serve as an analysis tool. We study the convergence behavior of this ODE for different classes of functions, such as geodesically convex, strongly-convex and weakly-quasi-convex. We demonstrate how such an ODE can be discretized using a semi-implicit and Nesterov-inspired numerical integrator, that empirically yields stable algorithms which are faithful to the continuous-time analysis and exhibit accelerated convergence.
We study gradient-based optimization methods obtained by direct Runge-Kutta discretization of the ordinary differential equation (ODE) describing the movement of a heavy-ball under constant friction coefficient. When the function is high order smooth and strongly convex, we show that directly simulating the ODE with known numerical integrators achieve acceleration in a nontrivial neighborhood of the optimal solution. In particular, the neighborhood can grow larger as the condition number of the function increases. Furthermore, our results also hold for nonconvex but quasi-strongly convex objectives. We provide numerical experiments that verify the theoretical rates predicted by our results.
We introduce a new framework for unifying and systematizing the performance analysis of first-order black-box optimization algorithms for unconstrained convex minimization. The low-cost iteration complexity enjoyed by first-order algorithms renders them particularly relevant for applications in machine learning and large-scale data analysis. Relying on sum-of-squares (SOS) optimization, we introduce a hierarchy of semidefinite programs that give increasingly better convergence bounds for higher levels of the hierarchy. Alluding to the power of the SOS hierarchy, we show that the (dual of the) first level corresponds to the Performance Estimation Problem (PEP) introduced by Drori and Teboulle [Math. Program., 145(1):451--482, 2014], a powerful framework for determining convergence rates of first-order optimization algorithms. Consequently, many results obtained within the PEP framework can be reinterpreted as degree-1 SOS proofs, and thus, the SOS framework provides a promising new approach for certifying improved rates of convergence by means of higher-order SOS certificates. To determine analytical rate bounds, in this work we use the first level of the SOS hierarchy and derive new result{s} for noisy gradient descent with inexact line search methods (Armijo, Wolfe, and Goldstein).