No Arabic abstract
Optimization models with non-convex constraints arise in many tasks in machine learning, e.g., learning with fairness constraints or Neyman-Pearson classification with non-convex loss. Although many efficient methods have been developed with theoretical convergence guarantees for non-convex unconstrained problems, it remains a challenge to design provably efficient algorithms for problems with non-convex functional constraints. This paper proposes a class of subgradient methods for constrained optimization where the objective function and the constraint functions are are weakly convex. Our methods solve a sequence of strongly convex subproblems, where a proximal term is added to both the objective function and each constraint function. Each subproblem can be solved by various algorithms for strongly convex optimization. Under a uniform Slaters condition, we establish the computation complexities of our methods for finding a nearly stationary point.
First-order methods (FOMs) have been widely used for solving large-scale problems. A majority of existing works focus on problems without constraint or with simple constraints. Several recent works have studied FOMs for problems with complicated functional constraints. In this paper, we design a novel augmented Lagrangian (AL) based FOM for solving problems with non-convex objective and convex constraint functions. The new method follows the framework of the proximal point (PP) method. On approximately solving PP subproblems, it mixes the usage of the inexact AL method (iALM) and the quadratic penalty method, while the latter is always fed with estimated multipliers by the iALM. We show a complexity result of $O(varepsilon^{-frac{5}{2}}|logvarepsilon|)$ for the proposed method to achieve an $varepsilon$-KKT point. This is the best known result. Theoretically, the hybrid method has lower iteration-complexity requirement than its counterpart that only uses iALM to solve PP subproblems, and numerically, it can perform significantly better than a pure-penalty-based method. Numerical experiments are conducted on nonconvex linearly constrained quadratic programs and nonconvex QCQP. The numerical results demonstrate the efficiency of the proposed methods over existing ones.
In this paper, acceleration of gradient methods for convex optimization problems with weak levels of convexity and smoothness is considered. Starting from the universal fast gradient method which was designed to be an optimal method for weakly smooth problems whose gradients are Holder continuous, its momentum is modified appropriately so that it can also accommodate uniformly convex and weakly smooth problems. Differently from the existing works, fast gradient methods proposed in this paper do not use the restarting technique but use momentums that are suitably designed to reflect both the uniform convexity and the weak smoothness information of the target energy function. Both theoretical and numerical results that support the superiority of proposed methods are presented.
In this paper, an inexact proximal-point penalty method is studied for constrained optimization problems, where the objective function is non-convex, and the constraint functions can also be non-convex. The proposed method approximately solves a sequence of subproblems, each of which is formed by adding to the original objective function a proximal term and quadratic penalty terms associated to the constraint functions. Under a weak-convexity assumption, each subproblem is made strongly convex and can be solved effectively to a required accuracy by an optimal gradient-based method. The computational complexity of the proposed method is analyzed separately for the cases of convex constraint and non-convex constraint. For both cases, the complexity results are established in terms of the number of proximal gradient steps needed to find an $varepsilon$-stationary point. When the constraint functions are convex, we show a complexity result of $tilde O(varepsilon^{-5/2})$ to produce an $varepsilon$-stationary point under the Slaters condition. When the constraint functions are non-convex, the complexity becomes $tilde O(varepsilon^{-3})$ if a non-singularity condition holds on constraints and otherwise $tilde O(varepsilon^{-4})$ if a feasible initial solution is available.
We introduce a geometrically transparent strict saddle property for nonsmooth functions. This property guarantees that simple proximal algorithms on weakly convex problems converge only to local minimizers, when randomly initialized. We argue that the strict saddle property may be a realistic assumption in applications, since it provably holds for generic semi-algebraic optimization problems.
Conditional gradient methods have attracted much attention in both machine learning and optimization communities recently. These simple methods can guarantee the generation of sparse solutions. In addition, without the computation of full gradients, they can handle huge-scale problems sometimes even with an exponentially increasing number of decision variables. This paper aims to significantly expand the application areas of these methods by presenting new conditional gradient methods for solving convex optimization problems with general affine and nonlinear constraints. More specifically, we first present a new constraint extrapolated condition gradient (CoexCG) method that can achieve an ${cal O}(1/epsilon^2)$ iteration complexity for both smooth and structured nonsmooth function constrained convex optimization. We further develop novel variants of CoexCG, namely constraint extrapolated and dual regularized conditional gradient (CoexDurCG) methods, that can achieve similar iteration complexity to CoexCG but allow adaptive selection for algorithmic parameters. We illustrate the effectiveness of these methods for solving an important class of radiation therapy treatment planning problems arising from healthcare industry. To the best of our knowledge, all the algorithmic schemes and their complexity results are new in the area of projection-free methods.