Differentiating Through a Cone Program

Added by Akshay Agrawal

Publication date 2019

Language English

Authors Akshay Agrawal - Shane Barratt - Stephen Boyd

Optimization and Control

We consider the problem of efficiently computing the derivative of the solution map of a convex cone program, when it exists. We do this by implicitly differentiating the residual map for its homogeneous self-dual embedding, and solving the linear systems of equations required using an iterative method. This allows us to efficiently compute the derivative operator, and its adjoint, evaluated at a vector. These correspond to computing an approximate new solution, given a perturbation to the cone program coefficients (i.e., perturbation analysis), and to computing the gradient of a function of the solution with respect to the coefficients. Our method scales to large problems, with numbers of coefficients in the millions. We present an open-source Python implementation of our method that solves a cone program and returns the derivative and its adjoint as abstract linear maps; our implementation can be easily integrated into software systems for automatic differentiation.
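The idea of implicitly differentiating a residual map and exposing the derivative and its adjoint as abstract linear maps can be illustrated on a much simpler problem. The sketch below is not the paper's method or its implementation: it assumes an unconstrained quadratic program, minimize (1/2) xᵀPx + qᵀx with P symmetric positive definite, whose residual is R(x, P, q) = Px + q, and it uses conjugate gradient as a stand-in for the iterative linear solver. The function names (`solve_qp_and_derivative`, `conjugate_gradient`) are hypothetical and chosen for this illustration only.

```python
def matvec(M, v):
    """Dense matrix-vector product on plain lists."""
    return [sum(m * vj for m, vj in zip(row, v)) for row in M]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(P, b, tol=1e-12, max_iter=100):
    """Solve P z = b for symmetric positive definite P using only matvecs."""
    z = [0.0] * len(b)
    r = list(b)          # residual b - P z, with z = 0 initially
    d = list(r)          # search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs <= tol * tol:
            break
        Pd = matvec(P, d)
        alpha = rs / dot(d, Pd)
        z = [zi + alpha * di for zi, di in zip(z, d)]
        r = [ri - alpha * pi for ri, pi in zip(r, Pd)]
        rs_new = dot(r, r)
        d = [ri + (rs_new / rs) * di for ri, di in zip(r, d)]
        rs = rs_new
    return z


def solve_qp_and_derivative(P, q):
    """Return the solution x of P x + q = 0, plus D and DT as callables.

    Differentiating the residual gives P dx + dP x + dq = 0, so
    dx = -P^{-1} (dP x + dq). Neither map forms a matrix explicitly.
    """
    x = conjugate_gradient(P, [-qi for qi in q])

    def D(dP, dq):
        # Derivative: perturbation of (P, q) -> first-order change in x.
        rhs = [-(a + b) for a, b in zip(matvec(dP, x), dq)]
        return conjugate_gradient(P, rhs)

    def DT(g):
        # Adjoint: gradient w.r.t. x -> gradients w.r.t. (P, q).
        w = conjugate_gradient(P, [-gi for gi in g])   # w = -P^{-1} g
        grad_P = [[wi * xj for xj in x] for wi in w]   # outer product w x^T
        return grad_P, w

    return x, D, DT


# Illustrative data (assumed, not from the paper).
P = [[4.0, 1.0], [1.0, 3.0]]
q = [1.0, 2.0]
dP = [[0.1, 0.02], [0.02, -0.05]]
dq = [0.3, -0.2]
g = [1.0, -2.0]

x, D, DT = solve_qp_and_derivative(P, q)
dx = D(dP, dq)            # perturbation analysis
grad_P, grad_q = DT(g)    # gradient of g^T x w.r.t. the coefficients
```

Here `D` plays the role of perturbation analysis and `DT` the role of backpropagating a gradient through the solution. The two are linked by the adjoint identity: the inner product of `g` with `D(dP, dq)` equals the inner product of `DT(g)` with `(dP, dq)`.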

Differentiating through Log-Log Convex Programs

Akshay Agrawal, Stephen Boyd 2020

We show how to efficiently compute the derivative (when it exists) of the solution map of log-log convex programs (LLCPs). These are nonconvex, nonsmooth optimization problems with positive variables that become convex when the variables, objective functions, and constraint functions are replaced with their logs. We focus specifically on LLCPs generated by disciplined geometric programming, a grammar consisting of a set of atomic functions with known log-log curvature and a composition rule for combining them. We represent a parametrized LLCP as the composition of a smooth transformation of parameters, a convex optimization problem, and an exponential transformation of the convex optimization problem's solution. The derivative of this composition can be computed efficiently, using recently developed methods for differentiating through convex optimization problems. We implement our method in CVXPY, a Python-embedded modeling language and rewriting system for convex optimization. In just a few lines of code, a user can specify a parametrized LLCP, solve it, and evaluate the derivative or its adjoint at a vector. This makes it possible to conduct sensitivity analyses of solutions, given perturbations to the parameters, and to compute the gradient of a function of the solution with respect to the parameters. We use the adjoint of the derivative to implement differentiable log-log convex optimization layers in PyTorch and TensorFlow. Finally, we present applications to designing queuing systems and fitting structured prediction models.
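The three-stage composition described above (log transform of the parameters, a convex problem, exponential of its solution) can be traced by hand on a one-variable example. The sketch below is a toy under stated assumptions, not the CVXPY implementation: it takes the problem minimize x + a/x over x > 0 with parameter a > 0, substitutes u = log x and α = log a, solves the resulting convex problem with Newton's method, and applies the chain rule across the three stages. The function names are hypothetical.

```python
import math


def solve_log_problem(alpha, iters=50):
    """Minimize h(u) = exp(u) + exp(alpha - u), which is convex in u.

    h is the objective x + a/x after substituting x = exp(u), a = exp(alpha).
    """
    u = 0.0
    for _ in range(iters):
        grad = math.exp(u) - math.exp(alpha - u)
        hess = math.exp(u) + math.exp(alpha - u)
        u -= grad / hess  # Newton step on the stationarity condition
    return u


def solve_llcp_and_derivative(a):
    """Return the solution x*(a) and dx*/da via the three-stage chain rule."""
    # Stage 1: smooth transformation of the parameter.
    alpha = math.log(a)
    dalpha_da = 1.0 / a

    # Stage 2: convex problem in log space, differentiated implicitly
    # through its optimality condition r(u, alpha) = exp(u) - exp(alpha - u) = 0.
    u = solve_log_problem(alpha)
    r_u = math.exp(u) + math.exp(alpha - u)
    r_alpha = -math.exp(alpha - u)
    du_dalpha = -r_alpha / r_u

    # Stage 3: exponential transformation of the convex problem's solution.
    x = math.exp(u)
    dx_du = x

    return x, dx_du * du_dalpha * dalpha_da


x_star, dx_da = solve_llcp_and_derivative(4.0)
```

For this problem the solution is known in closed form, x* = √a, so the computed derivative can be checked against 1/(2√a).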

Optimization and Control

Differentiating through the Fréchet Mean

Aaron Lou, Isay Katsman, Qingxuan Jiang 2020

Recent advances in deep representation learning on Riemannian manifolds extend classical deep learning operations to better capture the geometry of the manifold. One possible extension is the Fréchet mean, the generalization of the Euclidean mean; however, it has been difficult to apply because it lacks a closed form with an easily computable derivative. In this paper, we show how to differentiate through the Fréchet mean for arbitrary Riemannian manifolds. Then, focusing on hyperbolic space, we derive explicit gradient expressions and a fast, accurate, and hyperparameter-free Fréchet mean solver. This fully integrates the Fréchet mean into the hyperbolic neural network pipeline. To demonstrate this integration, we present two case studies. First, we apply our Fréchet mean to the existing Hyperbolic Graph Convolutional Network, replacing its projected aggregation to obtain state-of-the-art results on datasets with high hyperbolicity. Second, to demonstrate the Fréchet mean's capacity to generalize Euclidean neural network operations, we develop a hyperbolic batch normalization method that gives an improvement parallel to the one observed in the Euclidean setting.

Machine Learning

Higher-Order Cone Programming

Lijun Ding, Lek-Heng Lim 2018

We introduce a conic embedding condition that gives a hierarchy of cones and cone programs. This condition is satisfied by a large number of convex cones including the cone of copositive matrices, the cone of completely positive matrices, and all symmetric cones. We discuss properties of the intermediate cones and conic programs in the hierarchy. In particular, we demonstrate how this embedding condition gives rise to a family of cone programs that interpolates between LP, SOCP, and SDP. This family of $k$th order cones may be realized either as cones of $n$-by-$n$ symmetric matrices or as cones of $n$-variate even degree polynomials. The cases $k = 1, 2, n$ then correspond to LP, SOCP, SDP; or, in the language of polynomial optimization, to DSOS, SDSOS, SOS.

Optimization and Control

Differentiating a Tensor Language

Gilbert Bernstein, Michael Mara, Tzu-Mao Li 2020

How does one compile derivatives of tensor programs, such that the resulting code is purely functional (hence easier to optimize and parallelize) and provably efficient relative to the original program? We show that naively differentiating tensor code, as done in popular systems like TensorFlow and PyTorch, can cause asymptotic slowdowns in pathological cases, violating the Cheap Gradients Principle. However, all existing automatic differentiation methods that guarantee this principle (for variable size data) do so by relying on += mutation through aliases/pointers, which complicates downstream optimization. We provide the first purely functional, provably efficient, adjoint/reverse-mode derivatives of array/tensor code by explicitly accounting for sparsity. We do this by focusing on the indicator function from Iverson's APL. We also introduce a new Tensor SSA normal form and a new derivation of reverse-mode automatic differentiation based on the universal property of inner products.

Programming Languages, Graphics

Approximate Dynamic Programming via a Smoothed Linear Program

V. V. Desai, V. F. Farias, C. C. Moallemi 2009

We present a novel linear program for the approximation of the dynamic programming cost-to-go function in high-dimensional stochastic control problems. LP approaches to approximate DP have typically relied on a natural "projection" of a well-studied linear program for exact dynamic programming. Such programs restrict attention to approximations that are lower bounds to the optimal cost-to-go function. Our program, the "smoothed approximate linear program", is distinct from such approaches and relaxes the restriction to lower bounding approximations in an appropriate fashion while remaining computationally tractable. Doing so appears to have several advantages: First, we demonstrate substantially superior bounds on the quality of approximation to the optimal cost-to-go function afforded by our approach. Second, experiments with our approach on a challenging problem (the game of Tetris) show that the approach outperforms the existing LP approach (which has previously been shown to be competitive with several ADP algorithms) by an order of magnitude.

Optimization and Control