Tensors (also commonly seen as multi-linear operators or as multi-dimensional arrays) are ubiquitous in scientific computing and in data science, and so are the software efforts for tensor operations. Particularly in recent years, we have observed an explosion in libraries, compilers, packages, and toolboxes; unfortunately, these efforts are very much scattered among the different scientific domains, and inevitably suffer from replication, suboptimal implementations, and in many cases, limited visibility. As a first step towards countering these inefficiencies, here we survey and loosely classify software packages related to tensor computations. Our aim is to assemble a comprehensive and up-to-date snapshot of the tensor software landscape, with the intention of helping both users and developers. Aware of the difficulties inherent in any multidisciplinary survey, we very much welcome the readers' help in amending and expanding our software list, which currently features 72 projects.
In this paper, we develop software for decomposing sparse tensors that is portable to and performant on a variety of multicore, manycore, and GPU computing architectures. The result is a single code whose performance matches optimized architecture-sp
Our goal is compression of massive-scale grid-structured data, such as the multi-terabyte output of a high-fidelity computational simulation. For such data sets, we have developed a new software package called TuckerMPI, a parallel C++/MPI software p
The computation of Feynman integrals often involves square roots. One way to obtain a solution in terms of multiple polylogarithms is to rationalize these square roots by a suitable variable change. We present a program that can be used to find such
Tensor contraction (TC) is an important computational kernel widely used in numerous applications. It is a multi-dimensional generalization of matrix multiplication (GEMM). While Strassen's algorithm for GEMM is well studied in theory and practice, ex
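To make the relation between tensor contraction and GEMM concrete, here is a minimal NumPy sketch. It is an illustration only, not the method of the paper excerpted above: the function name `contract_via_gemm`, the chosen index pattern, and the tensor shapes are all assumptions made for the example. The idea is that the free indices of one operand are flattened into matrix rows, the contracted index becomes the shared inner dimension, and a single matrix product does the work.

```python
import numpy as np

def contract_via_gemm(A, B):
    """Compute C[a, b, c] = sum_k A[a, k, b] * B[k, c] with one matrix product.

    Hypothetical helper for illustration; the index pattern is an assumption.
    """
    a, k, b = A.shape
    k2, c = B.shape
    assert k == k2, "contracted dimensions must match"
    # Move the contracted index k to the last axis, then flatten the free
    # indices (a, b) into the rows of an (a*b) x k matrix.
    A_mat = A.transpose(0, 2, 1).reshape(a * b, k)
    # A single GEMM: (a*b) x k times k x c.
    C_mat = A_mat @ B
    # Unflatten the rows back into the free indices (a, b).
    return C_mat.reshape(a, b, c)

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4, 5))
B = rng.standard_normal((4, 6))

C = contract_via_gemm(A, B)
# Reference result from NumPy's general-purpose contraction routine.
C_ref = np.einsum("akb,kc->abc", A, B)
```

The transpose before the reshape forces a copy of `A`; avoiding or hiding that data movement is one reason dedicated tensor contraction kernels exist.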
On modern architectures, 32-bit operations are often at least twice as fast as 64-bit operations. By using a combination of 32-bit and 64-bit floating point arithmetic, the performance of many dense and sparse lin
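One common way to combine the two precisions is iterative refinement: solve in 32-bit arithmetic, then correct the result using residuals computed in 64-bit arithmetic. The excerpt above is truncated, so it is an assumption that this matches the technique the authors have in mind; the sketch below, including the function name `mixed_precision_solve` and its parameters, is illustrative only.

```python
import numpy as np

def mixed_precision_solve(A, b, max_iter=10, tol=1e-12):
    """Solve A x = b using float32 solves plus float64 residual corrections.

    Hypothetical helper for illustration. A production code would factor the
    float32 matrix once (e.g. LU) and reuse the factors; here each correction
    calls np.linalg.solve again for simplicity.
    """
    A32 = A.astype(np.float32)
    # Initial solve in 32-bit arithmetic, promoted to float64 for refinement.
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(max_iter):
        # Residual computed in 64-bit arithmetic.
        r = b - A @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # Correction computed in 32-bit arithmetic, accumulated in float64.
        d = np.linalg.solve(A32, r.astype(np.float32))
        x += d.astype(np.float64)
    return x

rng = np.random.default_rng(0)
n = 50
# A well-conditioned test matrix (assumption: refinement needs this to converge).
A = rng.standard_normal((n, n)) + n * np.eye(n)
x_true = rng.standard_normal(n)
b = A @ x_true

x = mixed_precision_solve(A, b)
rel_err = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
```

The refinement loop only converges when the matrix is not too ill-conditioned relative to 32-bit precision, which is why the test matrix is made diagonally dominant.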