Difference of convex algorithms for bilevel programs with applications in hyperparameter selection

173 0 0.0 ( 0 )

Download Cite

Added by Jane Ye

Publication date 2021

fields

and research's language is English

Authors Jane J. Ye - Xiaoming Yuan - Shangzhi Zeng

Optimization and Control

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

In this paper, we present difference of convex algorithms for solving bilevel programs in which the upper level objective functions are difference of convex functions, and the lower level programs are fully convex. This nontrivial class of bilevel programs provides a powerful modelling framework for dealing with applications arising from hyperparameter selection in machine learning. Thanks to the full convexity of the lower level program, the value function of the lower level program turns out to be convex and hence the bilevel program can be reformulated as a difference of convex bilevel program. We propose two algorithms for solving the reformulated difference of convex program and show their convergence under very mild assumptions. Finally we conduct numerical experiments to a bilevel model of support vector machine classification.

rate research

Directional necessary optimality conditions for bilevel programs

135 - Kuang Bai , Jane Ye 2020

The bilevel program is an optimization problem where the constraint involves solutions to a parametric optimization problem. It is well-known that the value function reformulation provides an equivalent single-level optimization problem but it results in a nonsmooth optimization problem which never satisfies the usual constraint qualification such as the Mangasarian-Fromovitz constraint qualification (MFCQ). In this paper we show that even the first order sufficient condition for metric subregularity (which is in general weaker than MFCQ) fails at each feasible point of the bilevel program. We introduce the concept of directional calmness condition and show that under {the} directional calmness condition, the directional necessary optimality condition holds. {While the directional optimality condition is in general sharper than the non-directional one,} the directional calmness condition is in general weaker than the classical calmness condition and hence is more likely to hold. {We perform the directional sensitivity analysis of the value function and} propose the directional quasi-normality as a sufficient condition for the directional calmness. An example is given to show that the directional quasi-normality condition may hold for the bilevel program.

Optimization and Control

Bilevel Polynomial Programs and Semidefinite Relaxation Methods

161 - Jiawang Nie , Li Wang , Jane Ye 2015

A bilevel program is an optimization problem whose constraints involve another optimization problem. This paper studies bilevel polynomial programs (BPPs), i.e., all the functions are polynomials. We reformulate BPPs equivalently as semi-infinite polynomial programs (SIPPs), using Fritz John conditions and Jacobian representations. Combining the exchange technique and Lasserre type semidefinite relaxations, we propose numerical methods for solving both simple and general BPPs. For simple BPPs, we prove the convergence to global optimal solutions. Numerical experiments are presented to show the efficiency of proposed algorithms.

Optimization and Control

Augmented Lagrangian based first-order methods for convex-constrained programs with weakly-convex objective

86 - Zichong Li , Yangyang Xu 2020

First-order methods (FOMs) have been widely used for solving large-scale problems. A majority of existing works focus on problems without constraint or with simple constraints. Several recent works have studied FOMs for problems with complicated functional constraints. In this paper, we design a novel augmented Lagrangian (AL) based FOM for solving problems with non-convex objective and convex constraint functions. The new method follows the framework of the proximal point (PP) method. On approximately solving PP subproblems, it mixes the usage of the inexact AL method (iALM) and the quadratic penalty method, while the latter is always fed with estimated multipliers by the iALM. We show a complexity result of $O(varepsilon^{-frac{5}{2}}|logvarepsilon|)$ for the proposed method to achieve an $varepsilon$-KKT point. This is the best known result. Theoretically, the hybrid method has lower iteration-complexity requirement than its counterpart that only uses iALM to solve PP subproblems, and numerically, it can perform significantly better than a pure-penalty-based method. Numerical experiments are conducted on nonconvex linearly constrained quadratic programs and nonconvex QCQP. The numerical results demonstrate the efficiency of the proposed methods over existing ones.

Optimization and Control Numerical Analysis Numerical Analysis

Primal-Dual Frank-Wolfe for Constrained Stochastic Programs with Convex and Non-convex Objectives

104 - Xiaohan Wei , Michael J. Neely 2018

We study constrained stochastic programs where the decision vector at each time slot cannot be chosen freely but is tied to the realization of an underlying random state vector. The goal is to minimize a general objective function subject to linear constraints. A typical scenario where such programs appear is opportunistic scheduling over a network of time-varying channels, where the random state vector is the channel state observed, and the control vector is the transmission decision which depends on the current channel state. We consider a primal-dual type Frank-Wolfe algorithm that has a low complexity update during each slot and that learns to make efficient decisions without prior knowledge of the probability distribution of the random state vector. We establish convergence time guarantees for the case of both convex and non-convex objective functions. We also emphasize application of the algorithm to non-convex opportunistic scheduling and distributed non-convex stochastic optimization over a connected graph.

Optimization and Control

Lifting convex inequalities for bipartite bilinear programs

99 - Xiaoyi Gu , Santanu S. Dey , Jean-Philippe P. Richard 2021

The goal of this paper is to derive new classes of valid convex inequalities for quadratically constrained quadratic programs (QCQPs) through the technique of lifting. Our first main result shows that, for sets described by one bipartite bilinear constraint together with bounds, it is always possible to sequentially lift a seed inequality that is valid for a restriction obtained by fixing variables to their bounds, when the lifting is accomplished using affine functions of the fixed variables. In this setting, sequential lifting involves solving a non-convex nonlinear optimization problem each time a variable is lifted, just as in Mixed Integer Linear Programming. To reduce the computational burden associated with this procedure, we develop a framework based on subadditive approximations of lifting functions that permits sequence-independent lifting of seed inequalities for separable bipartite bilinear sets. In particular, this framework permits the derivation of closed-form valid inequalities. We then study a separable bipartite bilinear set where the coefficients form a minimal cover with respect to the right-hand-side. For this set, we introduce a bilinear cover inequality, which is second-order cone representable. We argue that this bilinear cover inequality is strong by showing that it yields a constant-factor approximation of the convex hull of the original set. We study its lifting function and construct a two-slope subadditive upper bound. Using this subadditive approximation, we lift fixed variable pairs in closed-form, thus deriving a lifted bilinear cover inequality that is valid for general separable bipartite bilinear sets with box constraints.

Optimization and Control