ﻻ يوجد ملخص باللغة العربية
Matrix multiplication $A^t A$ appears as intermediate operation during the solution of a wide set of problems. In this paper, we propose a new cache-oblivious algorithm for the $A^t A$ multiplication. Our algorithm, A$scriptstyle mathsf{T}$A, calls classical Strassens algorithm as sub-routine, decreasing the computational cost %(expressed in number of performed products) of the conventional $A^t A$ multiplication to $frac{2}{7}n^{log_2 7}$. It works for generic rectangular matrices and exploits the peculiar symmetry of the resulting product matrix for sparing memory. We used the MPI paradigm to implement A$scriptstyle mathsf{T}$A in parallel, and we tested its performances on a small subset of nodes of the Galileo cluster. Experiments highlight good scalability and speed-up, also thanks to minimal number of exchanged messages in the designed communication system. Parallel overhead and inherently sequential time fraction are negligible in the tested configurations.
We study neural networks whose only non-linear components are multipliers, to test a new training rule in a context where the precise representation of data is paramount. These networks are challenged to discover the rules of matrix multiplication, g
We propose a new algorithm for multiplying dense polynomials with integer coefficients in a parallel fashion, targeting multi-core processor architectures. Complexity estimates and experimental comparisons demonstrate the advantages of this new approach.
Sparse matrix-vector multiplication (spMVM) is the dominant operation in many sparse solvers. We investigate performance properties of spMVM with matrices of various sparsity patterns on the nVidia Fermi class of GPGPUs. A new padded jagged diagonals
In one of the most important methods in Density Functional Theory - the Full-Potential Linearized Augmented Plane Wave (FLAPW) method - dense generalized eigenproblems are organized in long sequences. Moreover each eigenproblem is strongly correlated
We present the submatrix method, a highly parallelizable method for the approximate calculation of inverse p-th roots of large sparse symmetric matrices which are required in different scientific applications. We follow the idea of Approximate Comput