No Arabic abstract
We consider three mathematically equivalent variants of the conjugate gradient (CG) algorithm and how they perform in finite precision arithmetic. It was shown in [{em Behavior of slightly perturbed Lanczos and conjugate-gradient recurrences}, Lin.~Alg.~Appl., 113 (1989), pp.~7-63] that under certain conditions the convergence of a slightly perturbed CG computation is like that of exact CG for a matrix with many eigenvalues distributed throughout tiny intervals about the eigenvalues of the given matrix, the size of the intervals being determined by how closely these conditions are satisfied. We determine to what extent each of these variants satisfies the desired conditions, using a set of test problems and show that there is significant correlation between how well these conditions are satisfied and how well the finite precision computation converges before reaching its ultimately attainable accuracy. We show that for problems where the width of the intervals containing the eigenvalues of the associated exact CG matrix makes a significant difference in the behavior of exact CG, the different CG variants behave differently in finite precision arithmetic. For problems where the interval width makes little difference or where the convergence of exact CG is essentially governed by the upper bound based on the square root of the condition number of the matrix, the different CG variants converge similarly in finite precision arithmetic until the ultimate level of accuracy is achieved, although this ultimate level of accuracy may be different for the different variants. This points to the need for testing new CG variants on problems that are especially sensitive to rounding errors.
Results of porting parts of the Lattice Quantum Chromodynamics code to modern FPGA devices are presented. A single-node, double precision implementation of the Conjugate Gradient algorithm is used to invert numerically the Dirac-Wilson operator on a 4-dimensional grid on a Xilinx Zynq evaluation board. The code is divided into two software/hardware parts in such a way that the entire multiplication by the Dirac operator is performed in programmable logic, and the rest of the algorithm runs on the ARM cores. Optimized data blocks are used to efficiently use data movement infrastructure allowing to reach intervals of 1 clock cycle. We show that the FPGA implementation can offer a comparable performance compared to that obtained using Intel Xeon Phi KNL.
The Gaver-Stehfest algorithm is widely used for numerical inversion of Laplace transform. In this paper we provide the first rigorous study of the rate of convergence of the Gaver-Stehfest algorithm. We prove that Gaver-Stehfest approximations converge exponentially fast if the target function is analytic in a neighbourhood of a point and they converge at a rate $o(n^{-k})$ if the target function is $(2k+3)$-times differentiable at a point.
In this paper, we present two algorithms based on the Froidure-Pin Algorithm for computing the structure of a finite semigroup from a generating set. As was the case with the original algorithm of Froidure and Pin, the algorithms presented here produce the left and right Cayley graphs, a confluent terminating rewriting system, and a reduced word of the rewriting system for every element of the semigroup. If $U$ is any semigroup, and $A$ is a subset of $U$, then we denote by $langle Arangle$ the least subsemigroup of $U$ containing $A$. If $B$ is any other subset of $U$, then, roughly speaking, the first algorithm we present describes how to use any information about $langle Arangle$, that has been found using the Froidure-Pin Algorithm, to compute the semigroup $langle Acup Brangle$. More precisely, we describe the data structure for a finite semigroup $S$ given by Froidure and Pin, and how to obtain such a data structure for $langle Acup Brangle$ from that for $langle Arangle$. The second algorithm is a lock-free concurrent version of the Froidure-Pin Algorithm.
The ESA space astrometry mission Gaia, planned to be launched in 2013, has been designed to make angular measurements on a global scale with micro-arcsecond accuracy. A key component of the data processing for Gaia is the astrometric core solution, which must implement an efficient and accurate numerical algorithm to solve the resulting, extremely large least-squares problem. The Astrometric Global Iterative Solution (AGIS) is a framework that allows to implement a range of different iterative solution schemes suitable for a scanning astrometric satellite. In order to find a computationally efficient and numerically accurate iteration scheme for the astrometric solution, compatible with the AGIS framework, we study an adaptation of the classical conjugate gradient (CG) algorithm, and compare it to the so-called simple iteration (SI) scheme that was previously known to converge for this problem, although very slowly. The different schemes are implemented within a software test bed for AGIS known as AGISLab, which allows to define, simulate and study scaled astrometric core solutions. After successful testing in AGISLab, the CG scheme has been implemented also in AGIS. The two algorithms CG and SI eventually converge to identical solutions, to within the numerical noise (of the order of 0.00001 micro-arcsec). These solutions are independent of the starting values (initial star catalogue), and we conclude that they are equivalent to a rigorous least-squares estimation of the astrometric parameters. The CG scheme converges up to a factor four faster than SI in the tested cases, and in particular spatially correlated truncation errors are much more efficiently damped out with the CG scheme.
We study the alternating algorithm for the computation of the metric projection onto the closed sum of two closed subspaces in uniformly convex and uniformly smooth Banach spaces. For Banach spaces which are convex and smooth of power type, we exhibit a condition which implies linear convergence of this method. We show these convergence results for iterates of Bregman projections onto closed linear subspaces. Using an intimate connection between the metric projection onto a closed linear subspace and the Bregman projection onto its annihilator, we deduce the convergence rate results for the alternating algorithm from the corresponding results for the iterated Bregman projection method.