
Accurate Computation of the Log-Sum-Exp and Softmax Functions

Published by Desmond J. Higham
Publication date: 2019
Research field: Informatics Engineering
Language: English





Evaluating the log-sum-exp function or the softmax function is a key step in many modern data science algorithms, notably in inference and classification. Because of the exponentials that these functions contain, the evaluation is prone to overflow and underflow, especially in low precision arithmetic. Software implementations commonly use alternative formulas that avoid overflow and reduce the chance of harmful underflow, employing a shift or another rewriting. Although mathematically equivalent, these variants behave differently in floating-point arithmetic. We give rounding error analyses of different evaluation algorithms and interpret the error bounds using condition numbers for the functions. We conclude, based on the analysis and numerical experiments, that the shifted formulas are of similar accuracy to the unshifted ones and that the shifted softmax formula is typically more accurate than a division-free variant.
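The shifted formulas mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common choice of shifting by the maximum element, and it assumes the division-free softmax variant takes the form exp(x_j - lse(x)); the abstract does not spell out either formula, and the function names here are illustrative only.

```python
import math

def logsumexp(x):
    """Shifted log-sum-exp: a + log(sum_i exp(x_i - a)) with a = max(x)."""
    a = max(x)
    if math.isinf(a):
        # All entries are -inf, or some entry is +inf: the result equals a.
        return a
    # Every exponent is <= 0, so exp cannot overflow; the largest term is exp(0) = 1.
    return a + math.log(sum(math.exp(xi - a) for xi in x))

def softmax(x):
    """Shifted softmax: exp(x_j - a) / sum_i exp(x_i - a) with a = max(x)."""
    a = max(x)
    e = [math.exp(xi - a) for xi in x]
    s = sum(e)
    return [ei / s for ei in e]

def softmax_division_free(x):
    """Assumed division-free variant: exp(x_j - logsumexp(x))."""
    lse = logsumexp(x)
    return [math.exp(xi - lse) for xi in x]

if __name__ == "__main__":
    x = [1000.0, 1000.5, 999.0]
    # The unshifted formula would fail here: math.exp(1000.0) raises OverflowError.
    print(logsumexp(x))
    print(softmax(x))
    print(softmax_division_free(x))
```

Both softmax variants are mathematically equal, so comparing their outputs on the same input is a simple way to observe the floating-point differences the abstract refers to.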


Read also

Siyu Yang, Dongping Li (2021)
In this paper, we develop efficient and accurate algorithms for evaluating $\varphi(A)$ and $\varphi(A)b$, where $A$ is an $N\times N$ matrix, $b$ is an $N$-dimensional vector and $\varphi$ is the function defined by $\varphi(z)\equiv\sum\limits^{\infty}_{k=0}\frac{z^k}{(1+k)!}$. This matrix function (the so-called $\varphi$-function) plays a key role in a class of numerical methods known as exponential integrators. The algorithms use the scaling and modified squaring procedure combined with truncated Taylor series. A backward error analysis is presented to find the optimal value of the scaling parameter and the degree of the Taylor approximation. Some useful techniques are employed to reduce the computational cost. Numerical comparisons with state-of-the-art algorithms show that the algorithms perform well in both accuracy and efficiency.
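As a scalar illustration only (not the authors' matrix algorithm), scaling and modified squaring with a truncated Taylor series can be sketched as follows. The sketch relies on the identity $\varphi(2z) = \varphi(z)\,(z\varphi(z) + 2)/2$, which follows from $\varphi(z) = (e^z - 1)/z$. The function name `phi1`, the threshold `theta`, and the fixed number of terms are assumptions chosen for the example; the paper selects its parameters through backward error analysis.

```python
import math

def phi1(z, terms=12, theta=0.5):
    """Scalar sketch of phi(z) = sum_{k>=0} z^k / (k+1)! = (e^z - 1) / z."""
    # Scaling: halve the argument until |w| <= theta, counting the halvings.
    w = float(z)
    s = 0
    while abs(w) > theta:
        w /= 2.0
        s += 1
    # Truncated Taylor series: term_k = w^k / (k+1)!, built incrementally.
    term = 1.0
    p = 1.0
    for k in range(1, terms):
        term *= w / (k + 1)
        p += term
    # Modified squaring: undo each halving with phi(2w) = phi(w) * (w*phi(w) + 2) / 2.
    for _ in range(s):
        p = p * (w * p + 2.0) / 2.0
        w *= 2.0
    return p

if __name__ == "__main__":
    for z in (0.0, 1.0, 5.0, -3.0):
        print(z, phi1(z))
```

For nonzero `z` the result can be checked against `math.expm1(z) / z`.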
In this paper a method is presented for evaluating the convolution of the Green's function for the Laplace operator with a specified function $\rho(\vec x)$ at all grid points in a rectangular domain $\Omega \subset {\mathrm R}^{d}$ ($d = 1,2,3$), i.e. a solution of Poisson's equation in an infinite domain. 4th and 6th ord
S.J. Hamilton, J.L. Mueller (2017)
Objective: Absolute images have important applications in medical Electrical Impedance Tomography (EIT) imaging, but the traditional minimization and statistical based computations are very sensitive to modeling errors and noise. In this paper, it is demonstrated that D-bar reconstruction methods for absolute EIT are robust to such errors. Approach: The effects of errors in domain shape and electrode placement on absolute images computed with 2D D-bar reconstruction algorithms are studied on experimental data. Main Results: It is demonstrated with tank data from several EIT systems that these methods are quite robust to such modeling errors, and furthermore the artefacts arising from such modeling errors are similar to those occurring in classic time-difference EIT imaging. Significance: This study is promising for clinical applications where absolute EIT images are desirable, but previously thought impossible.
In this paper we propose a method for computing the Faddeeva function $w(z) := e^{-z^2}\mathrm{erfc}(-i z)$ via truncated modified trapezoidal rule approximations to integrals on the real line. Our starting point is the method due to Matta and Reichel (Math. Comp. 25 (1971), pp. 339-344) and Hunter and Regan (Math. Comp. 26 (1972), pp. 339-541). Addressing shortcomings flagged by Weideman (SIAM J. Numer. Anal. 31 (1994), pp. 1497-1518), we construct approximations which we prove are exponentially convergent as a function of $N+1$, the number of quadrature points, obtaining error bounds which show that accuracies of $2\times 10^{-15}$ in the computation of $w(z)$ throughout the complex plane are achieved with $N = 11$, as confirmed by computations. These approximations, moreover, provably achieve small relative errors throughout the upper complex half-plane where $w(z)$ is non-zero. Numerical tests suggest that this new method is competitive, in accuracy and computation times, with existing methods for computing $w(z)$ for complex $z$.
A piecewise Chebyshevian spline space is good for design when it possesses a B-spline basis and this property is preserved under arbitrary knot insertion. The interest in piecewise Chebyshevian spline spaces that are good for design is justified by the fact that, as with polynomial splines, the related parametric curves exhibit the desired properties of convex hull inclusion, variation diminution and an intuitive relation between the curve shape and the location of the control points. For all good-for-design spaces, in this paper we construct a set of functions, called transition functions, which allow for efficient computation of the B-spline basis, even in the case of nonuniform and multiple knots. Moreover, we show how the spline coefficients of the representations associated with a refined knot partition and with a raised order can conveniently be expressed by means of transition functions. This result allows us to provide effective procedures that generalize the classical knot insertion and degree raising algorithms for polynomial splines. To illustrate the benefits of the proposed computational approaches, we provide several examples dealing with different types of piecewise Chebyshevian spline spaces that are good for design.