No Arabic abstract
We present a novel algorithm that allows us to gain detailed insight into the effects of sparsity in linear and nonlinear optimization, which is of great importance in many scientific areas such as image and signal processing, medical imaging, compressed sensing, and machine learning (e.g., for the training of neural networks). Sparsity is an important feature to ensure robustness against noisy data, but also to find models that are interpretable and easy to analyze due to the small number of relevant terms. It is common practice to enforce sparsity by adding the $ell_1$-norm as a weighted penalty term. In order to gain a better understanding and to allow for an informed model selection, we directly solve the corresponding multiobjective optimization problem (MOP) that arises when we minimize the main objective and the $ell_1$-norm simultaneously. As this MOP is in general non-convex for nonlinear objectives, the weighting method will fail to provide all optimal compromises. To avoid this issue, we present a continuation method which is specifically tailored to MOPs with two objective functions one of which is the $ell_1$-norm. Our method can be seen as a generalization of well-known homotopy methods for linear regression problems to the nonlinear case. Several numerical examples - including neural network training - demonstrate our theoretical findings and the additional insight that can be gained by this multiobjective approach.
The paper addresses the problem of low-rank trace norm minimization. We propose an algorithm that alternates between fixed-rank optimization and rank-one updates. The fixed-rank optimization is characterized by an efficient factorization that makes the trace norm differentiable in the search space and the computation of duality gap numerically tractable. The search space is nonlinear but is equipped with a particular Riemannian structure that leads to efficient computations. We present a second-order trust-region algorithm with a guaranteed quadratic rate of convergence. Overall, the proposed optimization scheme converges super-linearly to the global solution while maintaining complexity that is linear in the number of rows and columns of the matrix. To compute a set of solutions efficiently for a grid of regularization parameters we propose a predictor-corrector approach that outperforms the naive warm-restart approach on the fixed-rank quotient manifold. The performance of the proposed algorithm is illustrated on problems of low-rank matrix completion and multivariate linear regression.
Inverse multiobjective optimization provides a general framework for the unsupervised learning task of inferring parameters of a multiobjective decision making problem (DMP), based on a set of observed decisions from the human expert. However, the performance of this framework relies critically on the availability of an accurate DMP, sufficient decisions of high quality, and a parameter space that contains enough information about the DMP. To hedge against the uncertainties in the hypothetical DMP, the data, and the parameter space, we investigate in this paper the distributionally robust approach for inverse multiobjective optimization. Specifically, we leverage the Wasserstein metric to construct a ball centered at the empirical distribution of these decisions. We then formulate a Wasserstein distributionally robust inverse multiobjective optimization problem (WRO-IMOP) that minimizes a worst-case expected loss function, where the worst case is taken over all distributions in the Wasserstein ball. We show that the excess risk of the WRO-IMOP estimator has a sub-linear convergence rate. Furthermore, we propose the semi-infinite reformulations of the WRO-IMOP and develop a cutting-plane algorithm that converges to an approximate solution in finite iterations. Finally, we demonstrate the effectiveness of our method on both a synthetic multiobjective quadratic program and a real world portfolio optimization problem.
In this paper, we propose some new proximal quasi-Newton methods with line search or without line search for a special class of nonsmooth multiobjective optimization problems, where each objective function is the sum of a twice continuously differentiable strongly convex function and a proper convex but not necessarily differentiable function. In these new proximal quasi-Newton methods, we approximate the Hessian matrices by using the well known BFGS, self-scaling BFGS, and the Huang BFGS method. We show that each accumulation point of the sequence generated by these new algorithms is a Pareto stationary point of the multiobjective optimization problem. In addition, we give their applications in robust multiobjective optimization, and we show that the subproblems of proximal quasi-Newton algorithms can be regarded as quadratic programming problems. Numerical experiments are carried out to verify the effectiveness of the proposed method.
In this article we develop a gradient-based algorithm for the solution of multiobjective optimization problems with uncertainties. To this end, an additional condition is derived for the descent direction in order to account for inaccuracies in the gradients and then incorporated in a subdivison algorithm for the computation of global solutions to multiobjective optimization problems. Convergence to a superset of the Pareto set is proved and an upper bound for the maximal distance to the set of substationary points is given. Besides the applicability to problems with uncertainties, the algorithm is developed with the intention to use it in combination with model order reduction techniques in order to efficiently solve PDE-constrained multiobjective optimization problems.
Nonsmooth sparsity constrained optimization captures a broad spectrum of applications in machine learning and computer vision. However, this problem is NP-hard in general. Existing solutions to this problem suffer from one or more of the following limitations: they fail to solve general nonsmooth problems; they lack convergence analysis; they lead to weaker optimality conditions. This paper revisits the Penalty Alternating Direction Method (PADM) for nonsmooth sparsity constrained optimization problems. We consider two variants of the PADM, i.e., PADM based on Iterative Hard Thresholding (PADM-IHT) and PADM based on Block Coordinate Decomposition (PADM-BCD). We show that the PADM-BCD algorithm finds stronger stationary points of the optimization problem than previous methods. We also develop novel theories to analyze the convergence rate for both the PADM-IHT and the PADM-BCD algorithms. Our theoretical bounds can exploit the inherent sparsity of the optimization problem. Finally, numerical results demonstrate the superiority of PADM-BCD to existing sparse optimization algorithms. Keywords: Sparsity Recovery, Nonsmooth Optimization, Non-Convex Optimization, Block Coordinate Decomposition, Iterative Hard Thresholding, Convergence Analysis