
Sub-sampled Newton Methods with Non-uniform Sampling

Published by Peng Xu
Publication date: 2016
Research field: Mathematical Statistics
Paper language: English

We consider the problem of finding the minimizer of a convex function $F: \mathbb{R}^d \rightarrow \mathbb{R}$ of the form $F(w) := \sum_{i=1}^n f_i(w) + R(w)$, where a low-rank factorization of $\nabla^2 f_i(w)$ is readily available. We consider the regime where $n \gg d$. Since second-order methods are effective at finding the minimizer to high precision, we propose randomized Newton-type algorithms that exploit \textit{non-uniform} sub-sampling of $\{\nabla^2 f_i(w)\}_{i=1}^{n}$, as well as inexact updates, as means to reduce the computational complexity. Two non-uniform sampling distributions, based on \textit{block norm squares} and \textit{block partial leverage scores}, are considered in order to capture important terms among $\{\nabla^2 f_i(w)\}_{i=1}^{n}$. We show that at each iteration non-uniformly sampling at most $\mathcal{O}(d \log d)$ terms from $\{\nabla^2 f_i(w)\}_{i=1}^{n}$ is sufficient to achieve a linear-quadratic convergence rate in $w$ when a suitable initial point is provided. In addition, we show that our algorithms achieve lower computational complexity and exhibit more robustness and better dependence on problem-specific quantities, such as the condition number, compared to similar existing methods, especially those based on uniform sampling. Finally, we empirically demonstrate that our methods are at least twice as fast as Newton's method on ridge logistic regression over several real datasets.
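To make the sampling idea concrete, the following is a minimal sketch (not the authors' code) of one sub-sampled Newton step for ridge-regularized logistic regression with labels in $\{-1,+1\}$. Each $\nabla^2 f_i(w)$ is rank-1, $\sigma_i(w)\,x_i x_i^T$, so the \textit{block norm squares} distribution reduces to sampling index $i$ with probability proportional to $\sigma_i(w)\,\|x_i\|^2$; the function name and argument layout are illustrative assumptions.

```python
import numpy as np

def subsampled_newton_step(w, X, y, lam, s, rng):
    """One sub-sampled Newton step for ridge logistic regression.

    Illustrative sketch of non-uniform (block norm squares) sampling:
    each Hessian term is sigma_i * x_i x_i^T, so we draw s indices with
    probability q_i proportional to sigma_i * ||x_i||^2 and form an
    unbiased sub-sampled Hessian from the drawn terms.
    """
    n, d = X.shape
    p = 1.0 / (1.0 + np.exp(-y * (X @ w)))      # sigmoid(y_i x_i^T w)
    grad = X.T @ ((p - 1.0) * y) / n + lam * w  # full gradient
    sigma = p * (1.0 - p)                       # per-term Hessian weight
    q = sigma * np.einsum('ij,ij->i', X, X)     # block norm squares
    q = q / q.sum()
    idx = rng.choice(n, size=s, p=q)
    # unbiased estimator: (1/s) * sum_j sigma_j x_j x_j^T / (n q_j)
    scale = sigma[idx] / (n * q[idx] * s)
    H = (X[idx].T * scale) @ X[idx] + lam * np.eye(d)
    return w - np.linalg.solve(H, grad)
```

With $s = \mathcal{O}(d \log d)$ sampled terms, forming $H$ costs $\mathcal{O}(s d^2)$ instead of $\mathcal{O}(n d^2)$, which is where the savings over the exact Newton step come from in the $n \gg d$ regime.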


Read also

We consider the problem of minimizing a sum of $n$ functions over a convex parameter set $\mathcal{C} \subset \mathbb{R}^p$, where $n \gg p \gg 1$. In this regime, algorithms that utilize sub-sampling techniques are known to be effective. In this paper, we use sub-sampling techniques together with low-rank approximation to design a new randomized batch algorithm that possesses a convergence rate comparable to Newton's method, yet has much smaller per-iteration cost. The proposed algorithm is robust in terms of starting point and step size, and enjoys a composite convergence rate, namely, quadratic convergence at the start and linear convergence when the iterate is close to the minimizer. We develop its theoretical analysis, which also allows us to select near-optimal algorithm parameters. Our theoretical results can be used to obtain convergence rates of previously proposed sub-sampling based algorithms as well. We demonstrate how our results apply to well-known machine learning problems. Lastly, we evaluate the performance of our algorithm on several datasets under various scenarios.
For solving large-scale non-convex problems, we propose inexact variants of trust region and adaptive cubic regularization methods, which, to increase efficiency, incorporate various approximations. In particular, in addition to approximate sub-problem solves, both the Hessian and the gradient are suitably approximated. Using rather mild conditions on such approximations, we show that our proposed inexact methods achieve similar optimal worst-case iteration complexities as the exact counterparts. Our proposed algorithms, and their respective theoretical analysis, do not require knowledge of any unknowable problem-related quantities, and hence are easily implementable in practice. In the context of finite-sum problems, we then explore randomized sub-sampling methods as ways to construct the gradient and Hessian approximations and examine the empirical performance of our algorithms on some real datasets.
In this paper, we consider stochastic second-order methods for minimizing a finite summation of nonconvex functions. One key is to find an ingenious but cheap scheme to incorporate local curvature information. Since the true Hessian matrix is often a combination of a cheap part and an expensive part, we propose a structured stochastic quasi-Newton method that uses partial Hessian information as much as possible. By further exploiting either the low-rank structure or the Kronecker-product properties of the quasi-Newton approximations, the computation of the quasi-Newton direction is affordable. Global convergence to a stationary point and a local superlinear convergence rate are established under some mild assumptions. Numerical results on logistic regression, deep autoencoder networks, and deep convolutional neural networks show that our proposed method is quite competitive with state-of-the-art methods.
Scalar diffraction calculations such as the angular spectrum method (ASM) and Fresnel diffraction are widely used in the research fields of optics, X-rays, electron beams, and ultrasonics. It is possible to accelerate the calculation using the fast Fourier transform (FFT); unfortunately, acceleration of the calculation on non-uniformly sampled planes is limited by the FFT's requirement of uniform sampling. In addition, uniform sampling produces wasteful data when a plane contains both locally low and high spatial frequencies. In this paper, we develop non-uniformly sampled ASM and Fresnel diffraction methods that address this problem using the non-uniform FFT.
To obtain the initial pressure from the collected data on a planar sensor arrangement in photoacoustic tomography, there exists an exact analytic frequency-domain reconstruction formula. An efficient realization of this formula needs to cope with the evaluation of the data's Fourier transform on a non-equispaced mesh. In this paper, we use the non-uniform fast Fourier transform to handle this issue and show its feasibility in 3D experiments with real and synthetic data. This is done in comparison to the standard approach that uses linear, polynomial, or nearest-neighbor interpolation. Moreover, we investigate the effect and the utility of flexible sensor location to make optimal use of a limited number of sensor points. The computational realization is accomplished by the use of a multi-dimensional non-uniform fast Fourier algorithm, where non-uniform data sampling is performed both in the frequency and the spatial domain. Examples with synthetic and real data show that both approaches improve image quality.
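The last two abstracts both hinge on the fact that the FFT assumes a uniform grid, so evaluation at non-uniform points must fall back to a direct sum (or a NUFFT that approximates it). A minimal sketch of that direct non-uniform inverse DFT, with an illustrative function name, makes the $\mathcal{O}(NM)$ cost explicit:

```python
import numpy as np

def nudft_inverse(F, x):
    """Evaluate f(x_m) = sum_k F[k] * exp(2*pi*i * k * x_m) at
    arbitrary (non-uniform) points x in [0, 1).

    Illustrative sketch only: a direct O(N*M) sum for N frequencies and
    M target points. NUFFT libraries approximate this sum in roughly
    O(N log N + M) time, which is what the papers above exploit.
    """
    k = np.arange(len(F))
    # one row per target point x_m, one column per frequency k
    return np.exp(2j * np.pi * np.outer(x, k)) @ F
```

On the uniform grid $x_m = m/N$ this reduces exactly to $N$ times `np.fft.ifft(F)`, which is the special case the FFT accelerates; the point of the NUFFT-based methods is to retain near-FFT speed when $x$ is not such a grid.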