Inv-ASKIT: A Parallel Fast Direct Solver for Kernel Matrices

Published by: William March

Publication date: 2016

Research field: Informatics Engineering

Language: English

Authors: Chenhan D. Yu, William B. March, Bo Xiao

Numerical Analysis; Data Structures and Algorithms; Mathematical Software

Abstract

We present a parallel algorithm for computing the approximate factorization of an $N$-by-$N$ kernel matrix. Once this factorization has been constructed (with $N \log^2 N$ work), we can solve linear systems with this matrix with $N \log N$ work. Kernel matrices represent pairwise interactions of points in metric spaces. They appear in machine learning, approximation theory, and computational physics. Kernel matrices are typically dense (matrix multiplication scales quadratically with $N$) and ill-conditioned (solves can require hundreds of Krylov iterations). Thus, fast algorithms for matrix multiplication and factorization are critical for scalability. Recently we introduced ASKIT, a new method for approximating a kernel matrix that resembles N-body methods. Here we introduce INV-ASKIT, a factorization scheme based on ASKIT. We describe the new method, derive complexity estimates, and conduct an empirical study of its accuracy and scalability. We report results on real-world datasets including COVTYPE ($0.5$M points in 54 dimensions), SUSY ($4.5$M points in 8 dimensions) and MNIST (2M points in 784 dimensions) using shared and distributed memory parallelism. In our largest run we approximately factorize a dense matrix of size 32M $\times$ 32M (generated from points in 64 dimensions) on 4,096 Sandy Bridge cores. To our knowledge these results improve the state of the art by several orders of magnitude.
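To make the cost that the abstract argues against concrete, here is a minimal dense baseline in pure Python. It is not INV-ASKIT or ASKIT. It builds the full kernel matrix (quadratic storage and work) and solves a system with an ordinary Cholesky factorization (cubic work). The Gaussian kernel, the `bandwidth` parameter, the diagonal shift `lam`, and all function names are assumptions chosen for illustration; the abstract does not specify them.

```python
import math

def gaussian_kernel_matrix(points, bandwidth):
    """Dense N-by-N matrix with K[i][j] = exp(-||x_i - x_j||^2 / (2 h^2)).

    The Gaussian kernel is an assumed example of a kernel function.
    """
    n = len(points)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            K[i][j] = math.exp(-sq / (2.0 * bandwidth ** 2))
    return K

def cholesky(A):
    """Return lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve L L^T x = b by forward and then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def solve_regularized(points, b, bandwidth=1.0, lam=1e-2):
    """Solve (K + lam * I) x = b densely; lam is an assumed diagonal shift."""
    K = gaussian_kernel_matrix(points, bandwidth)
    for i in range(len(K)):
        K[i][i] += lam
    return solve_cholesky(cholesky(K), b)
```

Calling `solve_regularized([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], [1.0, 2.0, 3.0])` returns the three solution coefficients. At the problem sizes quoted in the abstract this dense approach is infeasible, which is the motivation for an approximate factorization.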

Robert Speck, Daniel Ruprecht, Matthew Emmett (2013)

The paper presents a combination of the time-parallel "parallel full approximation scheme in space and time" (PFASST) with a parallel multigrid method (PMG) in space, resulting in a mesh-based solver for the three-dimensional heat equation with a uniquely high degree of efficient concurrency. Parallel scaling tests are reported on the Cray XE6 machine Monte Rosa on up to 16,384 cores and on the IBM Blue Gene/Q system JUQUEEN on up to 65,536 cores. The efficacy of the combined spatial and temporal parallelization is shown by demonstrating that using PFASST in addition to PMG significantly extends the strong-scaling limit. Implications of using spatial coarsening strategies in PFASST's multi-level hierarchy in large-scale parallel simulations are discussed.

Numerical Analysis; Distributed, Parallel, and Cluster Computing

Fast algorithms for anti-distance matrices as a generalization of Boolean matrices

Michiel de Bondt (2017)

We show that Boolean matrix multiplication, computed as a sum of products of column vectors with row vectors, is essentially the same as Warshall's algorithm for computing the transitive closure matrix of a graph from its adjacency matrix. Warshall's algorithm can be generalized to Floyd's algorithm for computing the distance matrix of a graph with weighted edges. We will generalize Boolean matrices in the same way, keeping matrix multiplication essentially equivalent to the Floyd-Warshall algorithm. This way, we get matrices over a semiring, which are similar to the so-called funny matrices. We discuss our implementation of operations on Boolean matrices and on their generalization, which make use of vector instructions.
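The correspondence described in this abstract can be sketched in a few lines of Python. This is a textbook illustration, not the author's vectorized implementation: step `k` of Warshall's algorithm ORs the outer product of column `k` with row `k` into the matrix, and the Floyd-Warshall variant does the same in the (min, +) semiring. The function names are hypothetical, and the distance version assumes the graph has no negative cycles.

```python
INF = float("inf")

def transitive_closure(adj):
    """Warshall's algorithm on a Boolean adjacency matrix (list of lists)."""
    n = len(adj)
    reach = [row[:] for row in adj]
    for k in range(n):
        col = [reach[i][k] for i in range(n)]  # column k
        row = reach[k][:]                      # row k
        # OR the Boolean outer product (col x row) into the matrix.
        for i in range(n):
            if col[i]:
                for j in range(n):
                    reach[i][j] = reach[i][j] or row[j]
    return reach

def floyd_warshall(weights):
    """Same update in the (min, +) semiring: OR becomes min, AND becomes +.

    Assumes no negative cycles; INF marks a missing edge.
    """
    n = len(weights)
    dist = [row[:] for row in weights]
    for k in range(n):
        col = [dist[i][k] for i in range(n)]
        row = dist[k][:]
        for i in range(n):
            for j in range(n):
                candidate = col[i] + row[j]
                if candidate < dist[i][j]:
                    dist[i][j] = candidate
    return dist
```

For the chain 0 → 1 → 2, `transitive_closure` marks 2 as reachable from 0, and `floyd_warshall` with edge weights 4 and 1 gives a distance of 5 from 0 to 2.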

Discrete Mathematics; Data Structures and Algorithms

Finite Projective Geometry based Fast, Conflict-free Parallel Matrix Computations

Shreeniwas Sapre, Hrishikesh Sharma, Abhishek Patil (2011)

Matrix computations, especially iterative PDE solving (and the sparse matrix-vector multiplication subproblem within) using the conjugate gradient algorithm, and LU/Cholesky decomposition for solving systems of linear equations, form the kernel of many applications, such as circuit simulators, computational fluid dynamics, and structural analysis. The problem of designing approaches for parallelizing these computations, to obtain the best speedups that Amdahl's law allows, has been continuously researched. In this paper, we discuss approaches based on the use of finite projective geometry graphs for these two problems. For the conjugate gradient algorithm, the approach looks at an alternative data distribution based on projective-geometry concepts. It is proved that this data distribution is an optimal data distribution for scheduling the main problem of dense matrix-vector multiplication. For parallel LU/Cholesky decomposition of general matrices, the approach is motivated by the recently published scheme for interconnects of distributed systems, perfect difference networks. We find that projective-geometry based graphs indeed offer an exciting way of parallelizing these computations, and in fact many others. Moreover, their applications range from architectural ones (interconnect choice) to algorithmic ones (data distributions).

Numerical Analysis; Distributed, Parallel, and Cluster Computing

The Normalized Singular Value Decomposition of Non-Symmetric Matrices Using Givens fast Rotations

Ehsan Rohani, Gwan Choi, Mi Lu (2017)

In this paper we introduce the algorithm and the fixed-point hardware to calculate the normalized singular value decomposition of non-symmetric matrices using Givens fast (approximate) rotations. This algorithm only uses basic combinational logic modules such as adders, multiplexers, encoders, barrel shifters (B-shifters), and comparators, and does not use any lookup table. This method in fact combines the iterative properties of the singular value decomposition method and the CORDIC method in one single iteration. The introduced architecture is a systolic architecture that uses two different types of processors, diagonal and non-diagonal processors. The diagonal processor calculates, transmits, and applies the horizontal and vertical rotations, while the non-diagonal processor uses a fully combinational architecture to receive and apply the rotations. The diagonal processor uses priority encoders, barrel shifters, and comparators to calculate the rotation angles. Both processors use a series of adders to apply the rotation angles. The design presented in this work provides $2.83 \sim 649$ times better energy per matrix performance compared to state-of-the-art designs. This performance is achieved without the employment of pipelining; a better performance advantage is expected when pipelining is employed.
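For readers unfamiliar with the building block, the following is a minimal floating-point sketch of an exact Givens rotation. It is a textbook version, not the approximate, fixed-point, shift-and-add rotation the abstract describes, and the function names are hypothetical. It computes the cosine and sine that zero one entry of a 2-vector and applies the rotation to two rows of a matrix.

```python
import math

def givens(a, b):
    """Return (c, s) such that [[c, s], [-s, c]] applied to (a, b) gives (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0
    r = math.hypot(a, b)
    return a / r, b / r

def apply_givens_rows(M, i, k, c, s):
    """Rotate rows i and k of M in place using the rotation (c, s)."""
    for j in range(len(M[0])):
        x, y = M[i][j], M[k][j]
        M[i][j] = c * x + s * y
        M[k][j] = -s * x + c * y
```

Applied to the first column of `[[3.0, 1.0], [4.0, 2.0]]`, `givens(3.0, 4.0)` yields a rotation that moves the column's full norm into row 0 and zeroes the entry in row 1.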

Numerical Analysis; Hardware Architecture

PyDEC: Software and Algorithms for Discretization of Exterior Calculus

Nathan Bell, Anil N. Hirani (2011)

This paper describes the algorithms, features and implementation of PyDEC, a Python library for computations related to the discretization of exterior calculus. PyDEC facilitates inquiry into both physical problems on manifolds as well as purely topological problems on abstract complexes. We describe efficient algorithms for constructing the operators and objects that arise in discrete exterior calculus, lowest order finite element exterior calculus and in related topological problems. Our algorithms are formulated in terms of high-level matrix operations which extend to arbitrary dimension. As a result, our implementations map well to the facilities of numerical libraries such as NumPy and SciPy. The availability of such libraries makes Python suitable for prototyping numerical methods. We demonstrate how PyDEC is used to solve physical and topological problems through several concise examples.

Numerical Analysis; Computational Engineering; Mathematical Software
