In contrast with classical Schwarz theory, recent results in computational chemistry have shown that, for special domain geometries, the one-level parallel Schwarz method can be scalable. This property does not hold in general, and quantifying the lack of scalability remains an open problem. Even though heuristic explanations are given in the literature, a rigorous and systematic analysis is still missing. In this short manuscript, we provide a first rigorous result that precisely quantifies the lack of scalability of the classical one-level parallel Schwarz method for the solution of the one-dimensional Laplace equation. Our analysis technique provides a possible roadmap for a systematic extension to more realistic problems in higher dimensions.
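For the two-subdomain case the mechanism is easy to see in code. The sketch below is our own illustration (the overlap geometry is chosen for demonstration, not taken from the manuscript): it runs the parallel Schwarz method for $u''=0$ on $(0,1)$ with $u(0)=0$, $u(1)=1$. Each subdomain solution of the 1D Laplace equation is linear, so only the two interface traces need to be tracked.

```python
# Parallel Schwarz for u'' = 0 on (0,1), u(0) = 0, u(1) = 1 (exact solution u(x) = x),
# with two overlapping subdomains (0, beta) and (alpha, 1); the overlap is (alpha, beta).
alpha, beta = 0.4, 0.6   # illustrative geometry, not from the manuscript

# Each subdomain solution of u'' = 0 is linear, so the iteration is fully described
# by two interface traces: a = Dirichlet value used by the left subdomain at x = beta,
# b = Dirichlet value used by the right subdomain at x = alpha.
a, b = 0.0, 0.0

for _ in range(60):
    # both updates use traces from the previous iteration (parallel/Jacobi style)
    a, b = b + (1.0 - b) * (beta - alpha) / (1.0 - alpha), a * alpha / beta

# contraction factor per pair of iterations
rho = (alpha / beta) * ((1.0 - beta) / (1.0 - alpha))
print(a, b, rho)   # traces converge to beta and alpha, i.e. to u(x) = x
```

The factor `rho` tends to 1 as the overlap shrinks; with many subdomains an analogous factor approaches 1 as their number grows, which is the effect the manuscript quantifies.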
In this article, we analyse the convergence behaviour and scalability properties of the one-level parallel Schwarz method (PSM) for domain decomposition problems in which the boundaries of many subdomains lie in the interior of the global domain. Such problems arise, for instance, in solvation models in computational chemistry. Existing results on the scalability of the one-level PSM are limited to situations where each subdomain has access to the external boundary and at most two subdomains share a common overlap. We develop a systematic framework that allows us to bound the norm of the Schwarz iteration operator for domain decomposition problems in which subdomains may be completely embedded in the interior of the global domain and an arbitrary number of subdomains may share a common overlap.
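A minimal 1D illustration of an embedded-subdomain iteration (our own toy geometry, not taken from the article): for the homogeneous Laplace problem on $(0,1)$, every local error is linear, so the PSM error operator reduces to a small map acting on four interface traces, and its contraction can be measured directly.

```python
# Three overlapping subdomains of (0,1): (0, b1), (a2, b2), (a3, 1).
# The middle subdomain (a2, b2) is embedded: its boundary lies entirely
# in the interior of the global domain.
a2, b1, a3, b2 = 0.3, 0.45, 0.55, 0.7   # illustrative geometry

def step(g1, g2l, g2r, g3):
    """One PSM step acting on the interface traces of the (linear) local errors."""
    t = (b1 - a2) / (b2 - a2)
    s = (a3 - a2) / (b2 - a2)
    return ((1 - t) * g2l + t * g2r,       # trace of the middle solution at b1
            (a2 / b1) * g1,                # trace of the left solution at a2
            ((1 - b2) / (1 - a3)) * g3,    # trace of the right solution at b2
            (1 - s) * g2l + s * g2r)       # trace of the middle solution at a3

g = (1.0, 1.0, 1.0, 1.0)                   # initial interface error
norms = []
for _ in range(40):
    g = step(*g)
    norms.append(max(abs(v) for v in g))

rho2 = norms[-1] / norms[-3]               # contraction per two iterations
print(rho2)                                # here 2/3 < 1: convergent, with an explicit rate
```

The measured factor is the norm of (the square of) the Schwarz iteration operator restricted to this geometry; bounding such norms in general settings is what the article's framework provides.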
The discretization of surface-intrinsic PDEs poses challenges that one does not face in flat space. The closest point method (CPM) is an embedding method that represents a surface using the function mapping points in the embedding space to their closest points on the surface. This mapping extends intrinsic data into the embedding space, allowing us to numerically approximate PDEs by standard methods in a tubular neighborhood of the surface. Here, we solve the surface-intrinsic positive Helmholtz equation by the CPM paired with finite differences, which usually yields a large, sparse, non-symmetric linear system. Domain decomposition methods, especially Schwarz methods, are robust algorithms for solving such linear systems. While there has been substantial work on Schwarz methods, Schwarz methods for solving surface differential equations have not been widely analyzed. In this work, we investigate the convergence of the CPM coupled with Schwarz methods on 1-manifolds in $\mathbb{R}^d$.
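The embedding principle can be illustrated on a toy example of our own (not from the paper): for the unit circle in the plane the closest point map is $p \mapsto p/|p|$, and the standard Cartesian Laplacian of a closest-point extension, evaluated on the surface, agrees with the surface Laplacian. Below, the surface function is $f(\theta)=\cos\theta$, whose surface Laplacian on the unit circle is $-\cos\theta$.

```python
import math

def cp(x, y):
    """Closest point on the unit circle to (x, y): the CPM surface map."""
    r = math.hypot(x, y)
    return x / r, y / r

def u(x, y):
    """Closest-point extension of the surface function f(theta) = cos(theta):
    constant in the normal direction, so u(x, y) = f evaluated at cp(x, y)."""
    px, py = cp(x, y)
    return px            # cos(theta) is the x-coordinate on the unit circle

def cartesian_laplacian(g, x, y, h=1e-3):
    """Standard 5-point finite-difference Laplacian in the embedding space."""
    return (g(x + h, y) + g(x - h, y) + g(x, y + h) + g(x, y - h) - 4 * g(x, y)) / h**2

theta = 0.7                                 # arbitrary point on the circle
x, y = math.cos(theta), math.sin(theta)
approx = cartesian_laplacian(u, x, y)
exact = -math.cos(theta)                    # surface Laplacian of cos(theta)
print(approx, exact)
```

Because the extension is constant along normals, flat-space finite differences of the extended data recover intrinsic surface derivatives on the surface; this is the mechanism that lets the CPM reuse standard stencils in the tubular neighborhood.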
The Sinc-Nyström method in time is a high-order spectral method for solving evolutionary differential equations, with wide applications in scientific computing. However, the method requires solving all time steps implicitly in one shot, which may result in a large-scale nonsymmetric dense system that is expensive to solve. In this paper, we propose and analyze a parallel-in-time (PinT) preconditioner for solving such Sinc-Nyström systems, where both parabolic and hyperbolic PDEs are investigated. Owing to the special Toeplitz-like structure of the Sinc-Nyström systems, the proposed PinT preconditioner is in fact a low-rank perturbation of the system matrix, and we show that the spectrum of the preconditioned system is highly clustered around one, especially as the time step size is refined. Such a clustered spectrum matches very well with the numerically observed mesh-independent GMRES convergence rates in various examples. Several linear and nonlinear ODE and PDE examples are presented to illustrate the convergence performance of the proposed PinT preconditioners; the exponential order of accuracy achieved is especially attractive to applications in need of high accuracy.
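The low-rank-perturbation mechanism can be sketched on a toy all-at-once system (a backward-Euler stand-in of our own devising, not the actual Sinc-Nyström matrix; all parameters are illustrative). The circulant-style PinT preconditioner $P$ differs from the Toeplitz system matrix $K$ in a single corner entry, so $P^{-1}K$ equals the identity except in one column, and all but one of its eigenvalues are exactly one.

```python
# Toy all-at-once system for backward Euler on u' = lam*u (a stand-in for the
# Sinc-Nystrom discretization): K is Toeplitz bidiagonal, and the alpha-circulant
# preconditioner P adds a single wrap-around corner entry, so P - K has rank one.
n, dt, lam, alpha = 8, 0.1, -1.0, 0.5     # illustrative parameters

a = 1.0 - dt * lam
K = [[a if i == j else -1.0 if i == j + 1 else 0.0 for j in range(n)] for i in range(n)]
P = [row[:] for row in K]
P[0][n - 1] = -alpha                      # circulant wrap-around corner

def solve(A, b):
    """Plain Gaussian elimination with partial pivoting (no external deps)."""
    A = [row[:] + [bi] for row, bi in zip(A, b)]
    m = len(A)
    for c in range(m):
        p = max(range(c, m), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(c + 1, m):
            f = A[r][c] / A[c][c]
            for k in range(c, m + 1):
                A[r][k] -= f * A[c][k]
    x = [0.0] * m
    for r in range(m - 1, -1, -1):
        x[r] = (A[r][m] - sum(A[r][k] * x[k] for k in range(r + 1, m))) / A[r][r]
    return x

# Columns of M = P^{-1} K: since P and K share all columns but the last,
# M equals the identity except in its last column, so n-1 of its eigenvalues
# (the diagonal entries of this near-triangular matrix) are exactly one.
M = list(zip(*[solve(P, [K[i][j] for i in range(n)]) for j in range(n)]))
off_identity = max(abs(M[i][j] - (1.0 if i == j else 0.0))
                   for i in range(n) for j in range(n - 1))
print(off_identity, M[n - 1][n - 1])
```

A Krylov method such as GMRES applied to a matrix that is a rank-one perturbation of the identity converges in at most two iterations; the paper's clustering result extends this intuition to the genuinely dense Sinc-Nyström setting.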
In this paper, we propose an overlapping additive Schwarz method for total variation minimization based on a dual formulation. The $O(1/n)$ energy convergence of the proposed method is proven, where $n$ is the number of iterations. In addition, we introduce a notable convergence property of the proposed method, which we call pseudo-linear convergence: its energy decreases as fast as that of linearly convergent algorithms until it reaches a particular value. We show that this value depends on the overlap width $\delta$, and that the proposed method becomes as efficient as linearly convergent algorithms when $\delta$ is large. Since the latest domain decomposition methods for total variation minimization are only sublinearly convergent, the proposed method outperforms them in terms of energy decay. Numerical experiments supporting our theoretical results are provided.
In this work we investigate the parallel scalability of the numerical method developed in Guthrey and Rossmanith [The regionally implicit discontinuous Galerkin method: improving the stability of DG-FEM, SIAM J. Numer. Anal. (2019)]. We develop an implementation of the regionally implicit discontinuous Galerkin (RIDG) method in DoGPack, an open-source C++ software package for discontinuous Galerkin methods. Specifically, we develop and test a hybrid OpenMP/MPI parallel implementation of DoGPack with the goal of exploring the efficiency and scalability of RIDG in comparison to the popular strong stability-preserving Runge-Kutta discontinuous Galerkin (SSP-RKDG) method. We demonstrate that RIDG methods are able to hide the communication latency associated with distributed-memory parallelism, because almost all of the work in the method is highly localized to each element, producing a localized prediction for each region. We demonstrate the enhanced efficiency and scalability of the RIDG method compared to SSP-RKDG methods and show its extensibility to very high-order schemes. The two-dimensional scaling study is performed on machines at the Institute for Cyber-Enabled Research at Michigan State University, using up to 1440 cores on Intel Xeon Gold 6148 (2.40 GHz) CPUs. The three-dimensional scaling study is performed on Livermore Computing clusters at Lawrence Livermore National Laboratory, using up to 28672 cores on Intel Xeon CLX-8276L CPUs with Omni-Path interconnects.