This work presents a new algorithm for empirical risk minimization. The algorithm bridges the gap between first- and second-order methods by computing a search direction that uses a second-order-type update in one subspace, coupled with a scaled steepest descent step in the orthogonal complement. To this end, partial curvature information is incorporated to help with ill-conditioning, while simultaneously allowing the algorithm to scale to the large problem dimensions often encountered in machine learning applications. Theoretical results are presented to confirm that the algorithm converges to a stationary point in both the strongly convex and nonconvex cases. A stochastic variant of the algorithm is also presented, along with corresponding theoretical guarantees. Numerical results confirm the strengths of the new approach on standard machine learning problems.
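A minimal sketch of the general idea, not the authors' exact algorithm: on a small regularized logistic regression problem, a curvature-scaled (Newton-type) step is taken in the subspace spanned by the top-k eigenvectors of the Hessian, and a scaled steepest descent step is taken in the orthogonal complement. The subspace size k, the step size alpha, and the test problem below are illustrative assumptions.

```python
# Hedged sketch of a subspace second-order step plus scaled steepest descent
# in the orthogonal complement, on regularized logistic regression.
# This only illustrates the general idea described above, not the paper's algorithm.
import numpy as np

def logistic_loss_grad_hess(w, X, y, reg=1e-3):
    z = X @ w
    s = 0.5 * (1.0 + np.tanh(0.5 * y * z))            # sigma(y_i z_i), numerically stable
    loss = np.mean(np.logaddexp(0.0, -y * z)) + 0.5 * reg * w @ w
    grad = -X.T @ (y * (1.0 - s)) / len(y) + reg * w
    d = s * (1.0 - s)
    hess = (X.T * d) @ X / len(y) + reg * np.eye(len(w))
    return loss, grad, hess

def hybrid_step(grad, hess, k=5, alpha=0.2):
    # Second-order-type update in the top-k curvature subspace,
    # scaled steepest descent in the orthogonal complement.
    vals, vecs = np.linalg.eigh(hess)                 # eigenvalues in ascending order
    V, lam = vecs[:, -k:], vals[-k:]                  # top-k curvature directions
    g_sub = V.T @ grad
    newton_part = V @ (g_sub / lam)                   # curvature-scaled step in the subspace
    grad_perp = grad - V @ g_sub                      # gradient component in the complement
    return -(newton_part + alpha * grad_perp)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 50))
y = np.sign(X @ rng.standard_normal(50) + 0.1 * rng.standard_normal(200))
w = np.zeros(50)
for _ in range(20):
    _, g, H = logistic_loss_grad_hess(w, X, y)
    w = w + hybrid_step(g, H)
print("final loss:", logistic_loss_grad_hess(w, X, y)[0])
```

In large-scale settings the top-k eigenpairs would be approximated from Hessian-vector products (for example by Lanczos iterations) rather than formed from a dense Hessian as in this toy sketch.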
A sequential quadratic optimization algorithm is proposed for solving smooth nonlinear equality-constrained optimization problems in which the objective function is defined by an expectation of a stochastic function. The algorithmic structure of the proposed method is based on a step decomposition strategy that is known in the literature to be widely effective in practice, wherein each search direction is computed as the sum of a normal step (toward linearized feasibility) and a tangential step (toward objective decrease in the null space of the constraint Jacobian). However, the proposed method is distinct from others in the literature in that it both allows the use of stochastic objective gradient estimates and possesses convergence guarantees even in the setting in which the constraint Jacobians may be rank deficient. The results of numerical experiments demonstrate that the algorithm offers superior performance when compared to popular alternatives.
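A hedged illustration of the normal/tangential decomposition on a toy equality-constrained problem with a noisy gradient follows. It mirrors only the structure described above (a normal step from a pseudoinverse solve, which tolerates rank deficiency, and a tangential step in the null space of the constraint Jacobian); the step-size rule and noise model are illustrative assumptions, not the paper's algorithm.

```python
# Hedged sketch: a step-decomposition iteration with a stochastic gradient
# on a toy equality-constrained problem, not the paper's actual method.
import numpy as np

rng = np.random.default_rng(1)
a = np.array([2.0, -1.0, 0.5])

def f_grad(x):                       # gradient of f(x) = 0.5 * ||x - a||^2
    return x - a

def constraint(x):                   # c(x) = 0: a single affine constraint
    return np.array([x[0] + x[1] + x[2] - 1.0])

def jacobian(x):                     # constant here; the pseudoinverse below also covers rank-deficient cases
    return np.array([[1.0, 1.0, 1.0]])

x = np.zeros(3)
for k in range(200):
    g = f_grad(x) + 0.1 * rng.standard_normal(3)   # stochastic objective gradient estimate
    c, J = constraint(x), jacobian(x)
    J_pinv = np.linalg.pinv(J)
    v = -J_pinv @ c                                # normal step: toward linearized feasibility
    P = np.eye(3) - J_pinv @ J                     # projector onto null(J)
    u = -P @ g                                     # tangential step: decrease in the null space
    alpha = 0.5 / (k + 1) ** 0.5                   # diminishing step size for the stochastic part
    x = x + v + alpha * u

print("x:", x, "constraint violation:", constraint(x))
```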
Sparse optimization is a central problem in machine learning and computer vision. However, this problem is inherently NP-hard and thus difficult to solve in general. Combinatorial search methods find the global optimal solution but are confined to small-scale problems, while coordinate descent methods are efficient but often suffer from poor local minima. This paper considers a new block decomposition algorithm that combines the effectiveness of combinatorial search methods with the efficiency of coordinate descent methods. Specifically, we consider a random and/or greedy strategy to select a subset of coordinates as the working set, and then perform a global combinatorial search over the working set based on the original objective function. We show that our method finds stronger stationary points than the coordinate-wise optimization method of Beck et al. In addition, we establish the convergence rate of our algorithm. Our experiments on solving sparse-regularized and sparsity-constrained least squares optimization problems demonstrate that our method achieves state-of-the-art performance in terms of accuracy. For example, our method generally outperforms the well-known greedy pursuit method.
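One way to picture the block decomposition idea is the sketch below for L0-regularized least squares: a small working set of coordinates is chosen at random, and the subproblem restricted to that set is solved by exhaustive enumeration of support patterns while all other coordinates are held fixed. The working-set size, the random selection rule, and the iteration budget are illustrative assumptions, not the paper's exact method (which also covers greedy selection and sparsity-constrained formulations).

```python
# Hedged sketch: block decomposition for min 0.5*||A x - b||^2 + lam*||x||_0,
# with an exhaustive combinatorial search over a small random working set.
import itertools
import numpy as np

def block_decomposition_l0(A, b, lam=0.05, block=3, iters=50, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = np.zeros(n)
    for _ in range(iters):
        S = rng.choice(n, size=block, replace=False)       # random working set
        r = b - A @ x + A[:, S] @ x[S]                      # residual with the block removed
        best_obj, best_xS = None, None
        for pattern in itertools.product([0, 1], repeat=block):
            mask = np.array(pattern, dtype=bool)
            xS = np.zeros(block)
            if mask.any():                                  # least squares on this support pattern
                sol, *_ = np.linalg.lstsq(A[:, S[mask]], r, rcond=None)
                xS[mask] = sol
            obj = 0.5 * np.sum((A[:, S] @ xS - r) ** 2) + lam * np.count_nonzero(xS)
            if best_obj is None or obj < best_obj:
                best_obj, best_xS = obj, xS
        x[S] = best_xS                                      # combinatorial update on the block
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((40, 15))
x_true = np.zeros(15)
x_true[[1, 7, 12]] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.01 * rng.standard_normal(40)
x_hat = block_decomposition_l0(A, b)
print("recovered support:", np.nonzero(np.abs(x_hat) > 1e-6)[0])
```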
A gradient-free deterministic method is developed to solve global optimization problems for Lipschitz continuous functions defined on arbitrary path-connected compact sets in Euclidean spaces. The method can be regarded as granular sieving with synchronous analysis in both the domain and range of the objective function. With a straightforward mathematical formulation applicable to both univariate and multivariate objective functions, the global minimum value and all the global minimizers are located through two decreasing sequences of compact sets in, respectively, the domain and range spaces. The algorithm is easy to implement with moderate computational cost. The method is tested on an extensive set of benchmark functions from the literature. The experimental results show the remarkable effectiveness and applicability of the algorithm.
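The following is a generic Lipschitz-based sieving loop in the spirit of the description above: boxes whose certified lower bound exceeds the best objective value found so far are discarded, and the survivors are bisected, producing shrinking sets in both the domain and the range. The Lipschitz constant, the bisection rule, and the test function are illustrative assumptions rather than the paper's exact formulation.

```python
# Hedged sketch: Lipschitz-based sieving/pruning over boxes.
# Illustrative only; not the paper's exact granular sieving formulation.
import numpy as np

def granular_sieve(f, lower, upper, lipschitz, iters=18):
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    boxes = [(lower, upper)]
    best_val = f((lower + upper) / 2)
    for _ in range(iters):
        new_boxes = []
        for lo, hi in boxes:
            center = (lo + hi) / 2
            radius = np.linalg.norm(hi - lo) / 2
            val = f(center)
            best_val = min(best_val, val)
            if val - lipschitz * radius <= best_val:     # box may still contain the global minimum
                j = int(np.argmax(hi - lo))              # bisect along the longest edge
                mid = (lo[j] + hi[j]) / 2
                hi1, lo2 = hi.copy(), lo.copy()
                hi1[j], lo2[j] = mid, mid
                new_boxes += [(lo, hi1), (lo2, hi)]
        boxes = new_boxes
    return best_val, boxes

# Example: a 2-D test function with global minimum value 0 at (1, 1).
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 1.0) ** 2
val, surviving = granular_sieve(f, [-5, -5], [5, 5], lipschitz=20.0)
print("best value found:", val, "surviving boxes:", len(surviving))
```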
We consider the minimization problem with truncated quadratic regularization of the gradient operator, which is a nonsmooth and nonconvex problem. We incorporate classical preconditioned iterations for linear equations into the nonlinear difference-of-convex-functions algorithm with extrapolation. In particular, our preconditioned framework can deal efficiently with the large linear systems that are usually computationally expensive. Global convergence is guaranteed, and a local linear convergence rate is given based on an analysis of the Kurdyka-Łojasiewicz exponent of the minimization functional. The proposed algorithm with preconditioners turns out to be very efficient for image restoration and is also appealing for image segmentation.
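As a hedged illustration of the combination, the sketch below applies a difference-of-convex (DCA) loop to 1-D signal restoration with a truncated quadratic penalty on the gradient, using the split min(t^2, tau) = t^2 - max(t^2 - tau, 0), and solves each convex subproblem's linear system inexactly with a Jacobi-preconditioned conjugate gradient routine. The specific preconditioner, the omission of extrapolation, and all parameter values are simplifications, not the paper's algorithm.

```python
# Hedged sketch: DCA with an inexact Jacobi-preconditioned CG inner solve for
#     min_u 0.5*||u - f||^2 + lam * sum_i min((D u)_i^2, tau).
# Illustrative only; it does not reproduce the paper's preconditioners or extrapolation.
import numpy as np

def pcg(A, b, x0, M_inv_diag, tol=1e-8, maxiter=50):
    # Preconditioned conjugate gradient for A x = b with a diagonal preconditioner.
    x = x0.copy()
    r = b - A @ x
    z = M_inv_diag * r
    p = z.copy()
    rz = r @ z
    for _ in range(maxiter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

def truncated_quadratic_dca(f, lam=2.0, tau=0.04, iters=30):
    n = len(f)
    D = np.diff(np.eye(n), axis=0)              # forward difference operator
    A = np.eye(n) + 2.0 * lam * D.T @ D         # system matrix of the convex subproblem
    M_inv = 1.0 / np.diag(A)                    # Jacobi preconditioner
    u = f.copy()
    for _ in range(iters):
        Du = D @ u
        active = (Du ** 2 > tau).astype(float)  # where the penalty is truncated
        grad_h = 2.0 * lam * D.T @ (active * Du)  # subgradient of the concave part's negative
        u = pcg(A, f + grad_h, u, M_inv)        # inexact preconditioned solve
    return u

rng = np.random.default_rng(3)
truth = np.concatenate([np.zeros(50), np.ones(50), 0.3 * np.ones(50)])
noisy = truth + 0.05 * rng.standard_normal(150)
restored = truncated_quadratic_dca(noisy)
print("relative restoration error:", np.linalg.norm(restored - truth) / np.linalg.norm(truth))
```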
In this paper, we develop a parameterized proximal point algorithm (P-PPA) for solving a class of separable convex programming problems subject to linear and convex constraints. The proposed algorithm is proven to be globally convergent with a worst-case O(1/t) convergence rate, where t denotes the iteration number. By properly choosing the algorithm parameters, numerical experiments on solving a sparse optimization problem arising from statistical learning show that our P-PPA can perform significantly better than other state-of-the-art methods, such as the alternating direction method of multipliers and the relaxed proximal point algorithm.
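Since the comparison point above is the alternating direction method of multipliers on a sparse learning problem, a compact reference sketch of that standard ADMM baseline on a lasso instance is given below. It shows the separable structure being targeted, min 0.5*||A x - b||^2 + mu*||z||_1 subject to x - z = 0, but it is not the proposed P-PPA; the penalty parameter rho and the problem sizes are arbitrary choices.

```python
# Hedged sketch: textbook ADMM for the lasso, shown only as the baseline
# comparator named in the abstract, not the proposed P-PPA.
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_lasso(A, b, mu=0.1, rho=1.0, iters=200):
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    # Factor (A^T A + rho I) once; each x-update is then a pair of triangular solves.
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    Atb = A.T @ b
    for _ in range(iters):
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))   # x-minimization
        z = soft_threshold(x + u, mu / rho)                  # z-minimization (prox of mu*||.||_1)
        u = u + x - z                                        # scaled multiplier update
    return z

rng = np.random.default_rng(4)
A = rng.standard_normal((60, 100))
x_true = np.zeros(100)
x_true[rng.choice(100, 5, replace=False)] = rng.standard_normal(5)
b = A @ x_true + 0.01 * rng.standard_normal(60)
x_hat = admm_lasso(A, b, mu=0.1)
print("nonzeros recovered:", np.count_nonzero(np.abs(x_hat) > 1e-4))
```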