No Arabic abstract
Quadratic optimization problems (QPs) are ubiquitous, and solution algorithms have matured to a reliable technology. However, the precision of solutions is usually limited due to the underlying floating-point operations. This may cause inconveniences when solutions are used for rigorous reasoning. We contribute on three levels to overcome this issue. First, we present a novel refinement algorithm to solve QPs to arbitrary precision. It iteratively solves refined QPs, assuming a floating-point QP solver oracle. We prove linear convergence of residuals and primal errors. Second, we provide an efficient implementation, based on SoPlex and qpOASES that is publicly available in source code. Third, we give precise reference solutions for the Maros and Meszaros benchmark library.
An important method to optimize a function on standard simplex is the active set algorithm, which requires the gradient of the function to be projected onto a hyperplane, with sign constraints on the variables that lie in the boundary of the simplex. We propose a new algorithm to efficiently project the gradient for this purpose. Furthermore, we apply the proposed gradient projection method to quadratic programs (QP) with standard simplex constraints, where gradient projection is used to explore the feasible region and, when we believe the optimal active set is identified, we switch to constrained conjugate gradient to accelerate convergence. Specifically, two different directions of gradient projection are used to explore the simplex, namely, the projected gradient and the reduced gradient. We choose one of the two directions according to the angle between the directions. Moreover, we propose two conditions for guessing the optimal active set heuristically. The first condition is that the working set remains unchanged for many iterations, and the second condition is that the angle between the projected gradient and the reduced gradient is small enough. Based on these strategies, a new active set algorithm for solving quadratic programs on standard simplex is proposed.
Mixed Integer Programming (MIP) solvers rely on an array of sophisticated heuristics developed with decades of research to solve large-scale MIP instances encountered in practice. Machine learning offers to automatically construct better heuristics from data by exploiting shared structure among instances in the data. This paper applies learning to the two key sub-tasks of a MIP solver, generating a high-quality joint variable assignment, and bounding the gap in objective value between that assignment and an optimal one. Our approach constructs two corresponding neural network-based components, Neural Diving and Neural Branching, to use in a base MIP solver such as SCIP. Neural Diving learns a deep neural network to generate multiple partial assignments for its integer variables, and the resulting smaller MIPs for un-assigned variables are solved with SCIP to construct high quality joint assignments. Neural Branching learns a deep neural network to make variable selection decisions in branch-and-bound to bound the objective value gap with a small tree. This is done by imitating a new variant of Full Strong Branching we propose that scales to large instances using GPUs. We evaluate our approach on six diverse real-world datasets, including two Google production datasets and MIPLIB, by training separate neural networks on each. Most instances in all the datasets combined have $10^3-10^6$ variables and constraints after presolve, which is significantly larger than previous learning approaches. Comparing solvers with respect to primal-dual gap averaged over a held-out set of instances, the learning-augmented SCIP is 2x to 10x better on all datasets except one on which it is $10^5$x better, at large time limits. To the best of our knowledge, ours is the first learning approach to demonstrate such large improvements over SCIP on both large-scale real-world application datasets and MIPLIB.
Motivated by a growing list of nontraditional statistical estimation problems of the piecewise kind, this paper provides a survey of known results supplemented with new results for the class of piecewise linear-quadratic programs. These are linearly constrained optimization problems with piecewise linear-quadratic (PLQ) objective functions. Starting from a study of the representation of such a function in terms of a family of elementary functions consisting of squared affine functions, squared plus-composite-affine functions, and affine functions themselves, we summarize some local properties of a PLQ function in terms of their first and second-order directional derivatives. We extend some well-known necessary and sufficient second-order conditions for local optimality of a quadratic program to a PLQ program and provide a dozen such equivalent conditions for strong, strict, and isolated local optimality, showing in particular that a PLQ program has the same characterizations for local minimality as a standard quadratic program. As a consequence of one such condition, we show that the number of strong, strict, or isolated local minima of a PLQ program is finite; this result supplements a recent result about the finite number of directional stationary objective values. Interestingly, these finiteness results can be uncovered by invoking a very powerful property of subanalytic functions; our proof is fairly elementary, however. We discuss applications of PLQ programs in some modern statistical estimation problems. These problems lead to a special class of unconstrained composite programs involving the non-differentiable $ell_1$-function, for which we show that the task of verifying the second-order stationary condition can be converted to the problem of checking the copositivity of certain Schur complement on the nonnegative orthant.
This paper studies a structured compound stochastic program (SP) involving multiple expectations coupled by nonconvex and nonsmooth functions. We present a successive convex-programming based sampling algorithm and establish its subsequential convergence. We describe stationarity properties of the limit points for several classes of the compound SP. We further discuss probabilistic stopping rules based on the computable error-bound for the algorithm. We present several risk measure minimization problems that can be formulated as such a compound stochastic program; these include generalized deviation optimization problems based on optimized certainty equivalent and buffered probability of exceedance (bPOE), a distributionally robust bPOE optimization problem, and a multiclass classification problem employing the cost-sensitive error criteria with bPOE risk measure.
This paper studies a strategy for data-driven algorithm design for large-scale combinatorial optimization problems that can leverage existing state-of-the-art solvers in general purpose ways. The goal is to arrive at new approaches that can reliably outperform existing solvers in wall-clock time. We focus on solving integer programs, and ground our approach in the large neighborhood search (LNS) paradigm, which iteratively chooses a subset of variables to optimize while leaving the remainder fixed. The appeal of LNS is that it can easily use any existing solver as a subroutine, and thus can inherit the benefits of carefully engineered heuristic or complete approaches and their software implementations. We show that one can learn a good neighborhood selector using imitation and reinforcement learning techniques. Through an extensive empirical validation in bounded-time optimization, we demonstrate that our LNS framework can significantly outperform compared to state-of-the-art commercial solvers such as Gurobi.