Relatively-Smooth Convex Optimization by First-Order Methods, and Applications


Abstract in English

The usual approach to developing and analyzing first-order methods for smooth convex optimization assumes that the gradient of the objective function is uniformly smooth with some Lipschitz constant $L$. However, in many settings the differentiable convex function $f(cdot)$ is not uniformly smooth -- for example in $D$-optimal design where $f(x):=-ln det(HXH^T)$, or even the univariate setting with $f(x) := -ln(x) + x^2$. Herein we develop a notion of relative smoothness and relative strong convexity that is determined relative to a user-specified reference function $h(cdot)$ (that should be computationally tractable for algorithms), and we show that many differentiable convex functions are relatively smooth with respect to a correspondingly fairly-simple reference function $h(cdot)$. We extend two standard algorithms -- the primal gradient scheme and the dual averaging scheme -- to our new setting, with associated computational guarantees. We apply our new approach to develop a new first-order method for the $D$-optimal design problem, with associated computational complexity analysis. Some of our results have a certain overlap with the recent work cite{bbt}.

Download