In a recent joint work, the author has developed a modification of Newton's method, named New Q-Newton's method, which can avoid saddle points and has a quadratic rate of convergence. While a good theoretical convergence guarantee has not been established for this method, experiments on small-scale problems show that the method is very competitive against other well-known modifications of Newton's method such as Adaptive Cubic Regularization and BFGS, as well as first-order methods such as Unbounded Two-way Backtracking Gradient Descent. In this paper, we resolve the convergence guarantee issue by proposing a modification of New Q-Newton's method, named New Q-Newton's method Backtracking, which incorporates a more sophisticated use of hyperparameters and a Backtracking line search. This new method has very good theoretical guarantees, which for a {\bf Morse function} yield the following (which is unknown for New Q-Newton's method): {\bf Theorem.} Let $f:\mathbb{R}^m\rightarrow \mathbb{R}$ be a Morse function, that is, all its critical points have invertible Hessian. Then for a sequence $\{x_n\}$ constructed by New Q-Newton's method Backtracking from a random initial point $x_0$, we have the following two alternatives: i) $\lim_{n\rightarrow\infty}||x_n||=\infty$, or ii) $\{x_n\}$ converges to a point $x_{\infty}$ which is a {\bf local minimum} of $f$, and the rate of convergence is {\bf quadratic}. Moreover, if $f$ has compact sublevels, then only case ii) happens. As far as we know, for Morse functions, this is the best theoretical guarantee for iterative optimization algorithms so far in the literature. We have tested in experiments on small scale, with some further simplifie
We propose in this paper New Q-Newton's method. The update rule is conceptually very simple, for example $x_{n+1}=x_n-w_n$ where $w_n=pr_{A_n,+}(v_n)-pr_{A_n,-}(v_n)$, with $A_n=\nabla ^2f(x_n)+\delta _n||\nabla f(x_n)||^2.Id$ and $v_n=A_n^{-1}.\nabla f(x
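The update rule above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the function name `new_q_newton_step`, the candidate list `deltas`, and the rule "take the first candidate that makes $A_n$ invertible" are hypothetical choices, since the text here does not say how $\delta_n$ is selected. The projections $pr_{A_n,+}$ and $pr_{A_n,-}$ are taken onto the spans of the eigenvectors of $A_n$ with positive and negative eigenvalues, respectively.

```python
import numpy as np

def new_q_newton_step(grad, hess, x, deltas=(0.0, 1.0, 2.0), tol=1e-12):
    """One step x_{n+1} = x_n - w_n (sketch; the choice of delta is an assumption)."""
    g = grad(x)
    H = hess(x)
    gnorm2 = float(g @ g)  # ||grad f(x_n)||^2
    for delta in deltas:
        # A_n = Hessian + delta * ||grad f||^2 * Id (symmetric, so eigh applies)
        A = H + delta * gnorm2 * np.eye(len(x))
        eigvals, eigvecs = np.linalg.eigh(A)
        if np.min(np.abs(eigvals)) > tol:  # A_n is numerically invertible
            break
    else:
        raise ValueError("no candidate delta made A_n invertible")
    # v_n = A_n^{-1} grad f, expressed in the eigenbasis of A_n
    v_coeffs = (eigvecs.T @ g) / eigvals
    # w_n = pr_+(v_n) - pr_-(v_n): flip the sign on the negative eigenspace
    w = eigvecs @ (np.sign(eigvals) * v_coeffs)
    return x - w

# Saddle example f(x, y) = x^2 - y^2, whose only critical point is the origin.
saddle_grad = lambda p: np.array([2.0 * p[0], -2.0 * p[1]])
saddle_hess = lambda p: np.array([[2.0, 0.0], [0.0, -2.0]])
x_next = new_q_newton_step(saddle_grad, saddle_hess, np.array([1.0, 1.0]))
```

On this example the plain Newton step from $(1,1)$ would land exactly on the saddle at the origin, whereas the sign flip on the negative eigenspace moves the iterate away from it along the $y$ direction.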
It has been widely recognized that the 0/1 loss function is one of the most natural choices for modelling classification errors, and it has a wide range of applications including support vector machines and 1-bit compressed sensing. Due to the combin
Nonsmooth optimization problems arising in practice tend to exhibit beneficial smooth substructure: their domains stratify into active manifolds of smooth variation, which common proximal algorithms identify in finite time. Identification then entail
Monotone systems of polynomial equations (MSPEs) are systems of fixed-point equations $X_1 = f_1(X_1, ..., X_n),$ $..., X_n = f_n(X_1, ..., X_n)$ where each $f_i$ is a polynomial with positive real coefficients. The question of computing the least no
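As a small illustration of such a system, the sketch below iterates the map $X \mapsto f(X)$ starting from the zero vector. Because every coefficient is positive, the iterates are componentwise nondecreasing. The two-variable system, the function names, and the fixed iteration count are assumptions chosen for the example, not taken from the paper.

```python
def example_system(x):
    """Hypothetical MSPE: X1 = 0.25*X1^2 + 0.25*X2 + 0.5, X2 = 0.5*X1 + 0.5."""
    x1, x2 = x
    return [0.25 * x1 * x1 + 0.25 * x2 + 0.5, 0.5 * x1 + 0.5]

def fixed_point_iterate(f, n_vars, steps=200):
    """Repeatedly apply f starting from the zero vector; return all iterates."""
    x = [0.0] * n_vars
    iterates = [x]
    for _ in range(steps):
        x = f(x)
        iterates.append(x)
    return iterates

iterates = fixed_point_iterate(example_system, n_vars=2)
solution = iterates[-1]
```

For this particular system the iterates approach the solution $(1, 1)$. Plain iteration of this kind can be slow in general, which is one reason faster schemes are studied for these systems.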
In this paper, we develop convergence analysis of a modified line search method for objective functions whose value is computed with noise and whose gradient estimates are inexact and possibly random. The noise is assumed to be bounded in absolute va