Look-Ahead in the Two-Sided Reduction to Compact Band Forms for Symmetric Eigenvalue Problems and the SVD

Published by Rafael Rodriguez-Sanchez

Publication date: 2017

Research field: Informatics Engineering

Paper language: English

Authors: Rafael Rodriguez-Sanchez, Sandra Catalan, Jose R. Herrero

Mathematical Software; Distributed, Parallel, and Cluster Computing

Abstract

We address the reduction to compact band forms, via unitary similarity transformations, for the solution of symmetric eigenvalue problems and the computation of the singular value decomposition (SVD). Concretely, in the first case we revisit the reduction to symmetric band form while, for the second case, we propose a similar alternative, which transforms the original matrix to (unsymmetric) band form, replacing the conventional reduction method that produces a triangular-band output. In both cases, we describe algorithmic variants of the standard Level-3 BLAS-based procedures, enhanced with look-ahead, to overcome the performance bottleneck imposed by the panel factorization. Furthermore, our solutions employ an algorithmic block size that differs from the target bandwidth, illustrating the important performance benefits of this decision. Finally, we show that our alternative compact band form for the SVD is key to introducing an effective look-ahead strategy into the corresponding reduction procedure.
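To make the symmetric case concrete, the following is a minimal NumPy sketch of a two-sided reduction to symmetric band form in which the panel width `b` is chosen independently of the target bandwidth `w`. It is not the authors' implementation: it has no look-ahead, it forms each orthogonal factor explicitly instead of using blocked Level-3 BLAS updates, and the names `reduce_to_band`, `w`, and `b` are assumptions introduced only for illustration.

```python
import numpy as np


def reduce_to_band(A, w, b):
    """Reduce a real symmetric matrix to symmetric band form.

    w is the target bandwidth and b the panel (block) width, with
    1 <= b <= w. Illustrative sketch only: each Q is built explicitly
    and applied to the whole trailing matrix, with no look-ahead.
    """
    A = np.array(A, dtype=float)  # work on a copy
    n = A.shape[0]
    assert 1 <= b <= w, "panel width must not exceed the bandwidth"
    for k in range(0, n - w - 1, b):
        r = k + w  # first row below the band for column k
        # Panel factorization: QR of the block that must become triangular.
        Q, _ = np.linalg.qr(A[r:, k:k + b], mode="complete")
        # Two-sided update: an orthogonal similarity that keeps the spectrum.
        A[r:, :] = Q.T @ A[r:, :]
        A[:, r:] = A[:, r:] @ Q
    return A


# Usage: the band matrix has the same eigenvalues as the input.
rng = np.random.default_rng(0)
M = rng.standard_normal((10, 10))
S = M + M.T
B = reduce_to_band(S, w=3, b=2)
print(np.max(np.abs(np.tril(B, -4))))  # entries below the band, near zero
```

In this sketch the panel factorization (the QR call) and the trailing update are strictly sequential, which is the bottleneck the abstract refers to; a look-ahead variant would overlap the factorization of the next panel with the remainder of the current trailing update.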

Xia Liao, Shengguo Li, Yutong Lu, 2020

In this paper, a parallel structured divide-and-conquer (PSDC) eigensolver is proposed for symmetric tridiagonal matrices based on ScaLAPACK and a parallel structured matrix multiplication algorithm, called PSMMA. Computing the eigenvectors via matrix-matrix multiplications is the most computationally expensive part of the divide-and-conquer algorithm, and one of the matrices involved in such multiplications is a rank-structured Cauchy-like matrix. By exploiting this particular property, PSMMA constructs the local matrices by using generators of Cauchy-like matrices without any communication, and further reduces the computation costs by using a structured low-rank approximation algorithm. Thus, both the communication and computation costs are reduced. Experimental results show that both PSMMA and PSDC are highly scalable and scale to at least 4096 processes. PSDC has better scalability than PHDC, which was proposed in [J. Comput. Appl. Math. 344 (2018) 512-520] and only scaled to 300 processes for the same matrices. Compared with PDSTEDC in ScaLAPACK, PSDC is always faster and achieves a 1.4x to 1.6x speedup for some matrices with few deflations. PSDC is also comparable with ELPA, being faster than ELPA when using few processes and a little slower when using many processes.

Mathematical Software; Distributed, Parallel, and Cluster Computing

An Optimized and Scalable Eigensolver for Sequences of Eigenvalue Problems

Mario Berljafa, Edoardo Di Napoli, 2014

In many scientific applications the solution of non-linear differential equations is obtained through the set-up and solution of a number of successive eigenproblems. These eigenproblems can be regarded as a sequence whenever the solution of one problem fosters the initialization of the next. In addition, in some eigenproblem sequences there is a connection between the solutions of adjacent eigenproblems. Whenever it is possible to unravel the existence of such a connection, the eigenproblem sequence is said to be correlated. When faced with a sequence of correlated eigenproblems, the current strategy amounts to solving each eigenproblem in isolation. We propose an alternative approach which exploits such correlation through the use of an eigensolver based on subspace iteration and accelerated with Chebyshev polynomials (ChFSI). The resulting eigensolver is optimized by minimizing the number of matrix-vector multiplications and parallelized using the Elemental library framework. Numerical results show that ChFSI achieves excellent scalability and is competitive with current dense linear algebra parallel eigensolvers.

Mathematical Software; Distributed, Parallel, and Cluster Computing; Computational Physics

NEP: a module for the parallel solution of nonlinear eigenvalue problems in SLEPc

Carmen Campos, Jose E. Roman, 2019

SLEPc is a parallel library for the solution of various types of large-scale eigenvalue problems. In recent years we have been developing a module within SLEPc, called NEP, that is intended for solving nonlinear eigenvalue problems. These problems can be defined by means of a matrix-valued function that depends nonlinearly on a single scalar parameter. We do not consider the particular case of polynomial eigenvalue problems (which are implemented in a different module in SLEPc) and focus here on rational eigenvalue problems and other general nonlinear eigenproblems involving square roots or any other nonlinear function. The paper discusses how the NEP module has been designed to fit the needs of applications and provides a description of the available solvers, including some implementation details such as parallelization. Several test problems coming from real applications are used to evaluate the performance and reliability of the solvers.

Mathematical Software; Numerical Analysis

Superconvergent Two-grid Methods For Elliptic Eigenvalue Problems

Hailong Guo, Zhimin Zhang, Ren Zhao, 2014

Some numerical algorithms for elliptic eigenvalue problems are proposed, analyzed, and numerically tested. The methods combine advantages of the two-grid algorithm, the two-space method, the shifted inverse power method, and the polynomial preserving recovery technique. Our new algorithms compare favorably with some existing methods and enjoy a superconvergence property.

Numerical Analysis

Batched QR and SVD Algorithms on GPUs with Applications in Hierarchical Matrix Compression

Wajih Halim Boukaram, George Turkiyyah, Hatem Ltaief, 2017

We present high performance implementations of the QR and the singular value decomposition of a batch of small matrices hosted on the GPU with applications in the compression of hierarchical matrices. The one-sided Jacobi algorithm is used for its simplicity and inherent parallelism as a building block for the SVD of low rank blocks using randomized methods. We implement multiple kernels based on the level of the GPU memory hierarchy in which the matrices can reside and show substantial speedups against streamed cuSOLVER SVDs. The resulting batched routine is a key component of hierarchical matrix compression, opening up opportunities to perform H-matrix arithmetic efficiently on GPUs.
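For readers unfamiliar with the one-sided Jacobi algorithm mentioned above, here is a minimal single-matrix CPU sketch in NumPy (the textbook Hestenes variant). It is not the batched GPU kernel described in the abstract, and the function name `one_sided_jacobi`, the tolerance, and the sweep limit are assumptions chosen for illustration.

```python
import numpy as np


def one_sided_jacobi(A, tol=1e-12, max_sweeps=60):
    """One-sided (Hestenes) Jacobi iteration on a real matrix.

    Returns (W, sigma, V) with W = A @ V, the columns of W mutually
    orthogonal up to tol, and sigma their norms (the singular values,
    unsorted). Sketch only: plain cyclic sweeps, no batching.
    """
    W = np.array(A, dtype=float)  # work on a copy
    n = W.shape[1]
    V = np.eye(n)
    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                alpha = W[:, p] @ W[:, p]
                beta = W[:, q] @ W[:, q]
                gamma = W[:, p] @ W[:, q]
                if abs(gamma) <= tol * np.sqrt(alpha * beta):
                    continue  # columns p and q are already orthogonal
                rotated = True
                # Rotation angle that zeroes the inner product of the pair.
                zeta = (beta - alpha) / (2.0 * gamma)
                if zeta == 0.0:
                    t = 1.0
                else:
                    t = np.sign(zeta) / (abs(zeta) + np.sqrt(1.0 + zeta * zeta))
                c = 1.0 / np.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same plane rotation to the columns of W and V.
                for X in (W, V):
                    xp = X[:, p].copy()
                    X[:, p] = c * xp - s * X[:, q]
                    X[:, q] = s * xp + c * X[:, q]
        if not rotated:
            break  # a full sweep without rotations: converged
    sigma = np.linalg.norm(W, axis=0)
    return W, sigma, V
```

Each rotation touches only two columns, so rotations on disjoint column pairs are independent of one another; this is the source of the parallelism that makes the method attractive for small matrices on GPUs.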

Mathematical Software; Data Structures and Algorithms; Numerical Analysis
