ترغب بنشر مسار تعليمي؟ اضغط هنا

Krylov Subspace Recycling for Sequences of Shifted Linear Systems

271   0   0.0 ( 0 )
 نشر من قبل Kirk Soodhalter
 تاريخ النشر 2013
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

We study the use of Krylov subspace recycling for the solution of a sequence of slowly-changing families of linear systems, where each family consists of shifted linear systems that differ in the coefficient matrix only by multiples of the identity. Our aim is to explore the simultaneous solution of each family of shifted systems within the framework of subspace recycling, using one augmented subspace to extract candidate solutions for all the shifted systems. The ideal method would use the same augmented subspace for all systems and have fixed storage requirements, independent of the number of shifted systems per family. We show that a method satisfying both requirements cannot exist in this framework. As an alternative, we introduce two schemes. One constructs a separate deflation space for each shifted system but solves each family of shifted systems simultaneously. The other builds only one recycled subspace and constructs approximate corrections to the solutions of the shifted systems at each cycle of the iterative linear solver while only minimizing the base system residual. At convergence of the base system solution, we apply the method recursively to the remaining unconverged systems. We present numerical examples involving systems arising in lattice quantum chromodynamics.



قيم البحث

اقرأ أيضاً

200 - Kirk M. Soodhalter 2014
Many Krylov subspace methods for shifted linear systems take advantage of the invariance of the Krylov subspace under a shift of the matrix. However, exploiting this fact in the non-Hermitian case introduces restrictions; e.g., initial residuals must be collinear and this collinearity must be maintained at restart. Thus we cannot simultaneously solve shifted systems with unrelated right-hand sides using this strategy, and all shifted residuals cannot be simultaneously minimized over a Krylov subspace such that collinearity is maintained. It has been shown that this renders them generally incompatible with techniques of subspace recycling [Soodhalter et al. APNUM 14]. This problem, however, can be overcome. By interpreting a family of shifted systems as one Sylvester equation, we can take advantage of the known shift invariance of the Krylov subspace generated by the Sylvester operator. Thus we can simultaneously solve all systems over one block Krylov subspace using FOM or GMRES type methods, even when they have unrelated right-hand sides. Because residual collinearity is no longer a requirement at restart, these methods are fully compatible with subspace recycling techniques. Furthermore, we realize the benefits of block sparse matrix operations which arise in the context of high-performance computing applications. In this paper, we discuss exploiting this Sylvester equation point of view which has yielded methods for shifted systems which are compatible with unrelated right-hand sides. From this, we propose a recycled GMRES method for simultaneous solution of shifted systems.Numerical experiments demonstrate the effectiveness of the methods.
We develop K$omega$, an open-source linear algebra library for the shifted Krylov subspace methods. The methods solve a set of shifted linear equations $(z_k I-H)x^{(k)}=b, (k=0,1,2,...)$ for a given matrix $H$ and a vector $b$, simultaneously. The l eading order of the operational cost is the same as that for a single equation. The shift invariance of the Krylov subspace is the mathematical foundation of the shifted Krylov subspace methods. Applications in materials science are presented to demonstrate the advantages of the algorithm over the standard Krylov subspace methods such as the Lanczos method. We introduce benchmark calculations of (i) an excited (optical) spectrum and (ii) intermediate eigenvalues by the contour integral on the complex plane. In combination with the quantum lattice solver $mathcal{H} Phi$, K$omega$ can realize parallel computation of excitation spectra and intermediate eigenvalues for various quantum lattice models.
We introduce a randomized algorithm, namely RCHOL, to construct an approximate Cholesky factorization for a given Laplacian matrix (a.k.a., graph Laplacian). From a graph perspective, the exact Cholesky factorization introduces a clique in the underl ying graph after eliminating a row/column. By randomization, RCHOL only retains a sparse subset of the edges in the clique using a random sampling developed by Spielman and Kyng. We prove RCHOL is breakdown-free and apply it to solving large sparse linear systems with symmetric diagonally dominant matrices. In addition, we parallelize RCHOL based on the nested dissection ordering for shared-memory machines. We report numerical experiments that demonstrate the robustness and the scalability of RCHOL. For example, our parallel code scaled up to 64 threads on a single node for solving the 3D Poisson equation, discretized with the 7-point stencil on a $1024times 1024 times 1024$ grid, a problem that has one billion unknowns.
We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits th e low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We present various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
Gauss-Seidel (GS) relaxation is often employed as a preconditioner for a Krylov solver or as a smoother for Algebraic Multigrid (AMG). However, the requisite sparse triangular solve is difficult to parallelize on many-core architectures such as graph ics processing units (GPUs). In the present study, the performance of the traditional GS relaxation based on a triangular solve is compared with two-stage variants, replacing the direct triangular solve with a fixed number of inner Jacobi-Richardson (JR) iterations. When a small number of inner iterations is sufficient to maintain the Krylov convergence rate, the two-stage GS (GS2) often outperforms the traditional algorithm on many-core architectures. We also compare GS2 with JR. When they perform the same number of flops for SpMV (e.g. three JR sweeps compared to two GS sweeps with one inner JR sweep), the GS2 iterations, and the Krylov solver preconditioned with GS2, may converge faster than the JR iterations. Moreover, for some problems (e.g. elasticity), it was found that JR may diverge with a damping factor of one, whereas two-stage GS may improve the convergence with more inner iterations. Finally, to study the performance of the two-stage smoother and preconditioner for a practical problem, %(e.g. using tuned damping factors), these were applied to incompressible fluid flow simulations on GPUs.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا