بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems

167 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Gustavo Ch\\'avez

تاريخ النشر 2017

مجال البحث

والبحث باللغة English

تأليف Gustavo Chavez - George Turkiyyah - Stefano Zampini

تحليل PDES التحليل العددي

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We present Accelerated Cyclic Reduction (ACR), a distributed-memory fast direct solver for rank-compressible block tridiagonal linear systems arising from the discretization of elliptic operators, developed here for three dimensions. Algorithmic synergies between Cyclic Reduction and hierarchical matrix arithmetic operations result in a solver that has $O(k~N log N~(log N + k^2))$ arithmetic complexity and $O(k~N log N)$ memory footprint, where $N$ is the number of degrees of freedom and $k$ is the rank of a typical off-diagonal block, and which exhibits substantial concurrency. We provide a baseline for performance and applicability by comparing with the multifrontal method where hierarchical semi-separable matrices are used for compressing the fronts, and with algebraic multigrid. Over a set of large-scale elliptic systems with features of nonsymmetry and indefiniteness, the robustness of the direct solvers extends beyond that of the multigrid solver, and relative to the multifrontal approach ACR has lower or comparable execution time and memory footprint. ACR exhibits good strong and weak scaling in a distributed context and, as with any direct solver, is advantageous for problems that require the solution of multiple right-hand sides.

قيم البحث

190 - Chao Chen , Hadi Pouransari , Sivasankaran Rajamanickam 2017

We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits th e low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We present various numerical results to demonstrate the versatility and scalability of the parallel algorithm.

التحليل العددي البرمجيات الرياضية التحليل العددي

Initial Guesses for Sequences of Linear Systems in a GPU-Accelerated Incompressible Flow Solver

106 - Anthony P. Austin , Noel Chalmers , Tim Warburton 2020

We consider several methods for generating initial guesses when iteratively solving sequences of linear systems, showing that they can be implemented efficiently in GPU-accelerated PDE solvers, specifically solvers for incompressible flow. We propose new initial guess methods based on stabilized polynomial extrapolation and compare them to the projection method of Fischer [15], showing that they are generally competitive with projection schemes despite requiring only half the storage and performing considerably less data movement and communication. Our implementations of these algorithms are freely available as part of the libParanumal collection of GPU-accelerated flow solvers.

التحليل العددي التحليل العددي

A numerical framework for computing steady states of size-structured population models and their stability

59 - Inom Mirzaev , David M. Bortz 2016

Structured population models are a class of general evolution equations which are widely used in the study of biological systems. Many theoretical methods are available for establishing existence and stability of steady states of general evolution eq uations. However, except for very special cases, finding an analytical form of stationary solutions for evolution equations is a challenging task. In the present paper, we develop a numerical framework for computing approximations to stationary solutions of general evolution equations, which can also be used to produce existence and stability regions for steady states. In particular, we use the Trotter-Kato Theorem to approximate the infinitesimal generator of an evolution equation on a finite dimensional space, which in turn reduces the evolution equation into a system of ordinary differential equations. Consequently, we approximate and study the asymptotic behavior of stationary solutions. We illustrate the convergence of our numerical framework by applying it to a linear Sinko-Streifer structured population model for which the exact form of the steady state is known. To further illustrate the utility of our approach, we apply our framework to nonlinear population balance equation, which is an extension of well-known Smoluchowksi coagulation-fragmentation model to biological populations. We also demonstrate that our numerical framework can be used to gain insight about the theoretical stability of the stationary solutions of the evolution equations. Furthermore, the open source Python program that we have developed for our numerical simulations is freely available from our Github repository (github.com/MathBioCU).

تحليل PDES التحليل العددي السكان والتطور

Analysis of a Population Model Structured by the Cells Molecular Content

356 - Marie Doumic Jauffret 2008

We study the mathematical properties of a general model of cell division structured with several internal variables. We begin with a simpler and specific model with two variables, we solve the eigenvalue problem with strong or weak assumptions, and d educe from it the long-time convergence. The main difficulty comes from natural degeneracy of birth terms that we overcome with a regularization technique. We then extend the results to the case with several parameters and recall the link between this simplified model and the one presented in cite{CBBP1}; an application to the non-linear problem is also given, leading to robust subpolynomial growth of the total population.

تحليل PDES التحليل العددي

CLAIRE: A distributed-memory solver for constrained large deformation diffeomorphic image registration

116 - Andreas Mang , Amir Gholami , Christos Davatzikos 2018

With this work, we release CLAIRE, a distributed-memory implementation of an effective solver for constrained large deformation diffeomorphic image registration problems in three dimensions. We consider an optimal control formulation. We invert for a stationary velocity field that parameterizes the deformation map. Our solver is based on a globalized, preconditioned, inexact reduced space Gauss--Newton--Krylov scheme. We exploit state-of-the-art techniques in scientific computing to develop an effective solver that scales to thousands of distributed memory nodes on high-end clusters. We present the formulation, discuss algorithmic features, describe the software package, and introduce an improved preconditioner for the reduced space Hessian to speed up the convergence of our solver. We test registration performance on synthetic and real data. We demonstrate registration accuracy on several neuroimaging datasets. We compare the performance of our scheme against different flavors of the Demons algorithm for diffeomorphic image registration. We study convergence of our preconditioner and our overall algorithm. We report scalability results on state-of-the-art supercomputing platforms. We demonstrate that we can solve registration problems for clinically relevant data sizes in two to four minutes on a standard compute node with 20 cores, attaining excellent data fidelity. With the present work we achieve a speedup of (on average) 5$times$ with a peak performance of up to 17$times$ compared to our former work.

التحسين والتحكم الرؤية الحاسوبية وتمييز الأنماط النظم الموزعة والتوازية والحوسبة العنقودية

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة وهران احمد بن بله

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Accelerated Cyclic Reduction: A Distributed-Memory Fast Solver for Structured Linear Systems

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً