
Combinatorial BLAS 2.0: Scaling combinatorial algorithms on distributed-memory systems

Published by: Aydin Buluc
Publication date: 2021
Research field: Informatics Engineering
Paper language: English





Combinatorial algorithms, such as those that arise in graph analysis, modeling of discrete systems, bioinformatics, and chemistry, are often hard to parallelize. The Combinatorial BLAS library implements key computational primitives for rapid development of combinatorial algorithms on distributed-memory systems. In the decade since its introduction, the Combinatorial BLAS library has evolved and expanded significantly. This paper details many of the key technical features of Combinatorial BLAS version 2.0, such as communication avoidance, hierarchical parallelism via in-node multithreading, accelerator support via GPU kernels, generalized semiring support, implementations of key data structures and functions, and scalable distributed I/O operations for human-readable files. The paper also presents several rules of thumb for choosing the right data structures and functions in Combinatorial BLAS 2.0 under various common application scenarios.
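To make "generalized semiring support" concrete, the following is a minimal single-process sketch of a sparse matrix-vector product in which the usual (+, ×) pair is replaced by user-supplied operations. It is not the Combinatorial BLAS API: the function `semiring_spmv`, the dictionary-based storage, and the example graph are all assumptions made for illustration. With the (min, +) semiring, each product relaxes every edge once, so repeating it propagates shortest-path distances.

```python
from typing import Callable, Dict, Tuple

# Hypothetical toy storage: matrix as {(row, col): value}, vector as {index: value}.
SparseMatrix = Dict[Tuple[int, int], float]
SparseVector = Dict[int, float]


def semiring_spmv(
    A: SparseMatrix,
    x: SparseVector,
    add: Callable[[float, float], float],
    multiply: Callable[[float, float], float],
) -> SparseVector:
    """Compute y = A x, using `add` and `multiply` in place of + and *."""
    y: SparseVector = {}
    for (i, j), a_ij in A.items():
        if j in x:  # only stored entries of x contribute
            term = multiply(a_ij, x[j])
            y[i] = add(y[i], term) if i in y else term
    return y


# Illustrative graph: A[(dst, src)] is the weight of edge src -> dst.
A = {(1, 0): 4.0, (2, 0): 1.0, (1, 2): 2.0}

# Shortest paths from vertex 0 with the (min, +) semiring.
dist = {0: 0.0}
inf = float("inf")
for _ in range(2):  # two relaxation rounds suffice for this 3-vertex graph
    step = semiring_spmv(A, dist, add=min, multiply=lambda a, b: a + b)
    dist = {v: min(dist.get(v, inf), step.get(v, inf)) for v in set(dist) | set(step)}

print(dist)
```

Swapping in a different pair of operations (for example logical OR and AND) turns the same loop into a breadth-first reachability step, which is the general idea behind expressing graph algorithms through semiring primitives.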




Read also

We present a new $4$-approximation algorithm for the Combinatorial Motion Planning problem which runs in $\mathcal{O}(n^2\alpha(n^2,n))$ time, where $\alpha$ is the functional inverse of the Ackermann function, and a fully distributed version for the same in asynchronous message passing systems, which runs in $\mathcal{O}(n\log_2 n)$ time with a message complexity of $\mathcal{O}(n^2)$. This also includes the first fully distributed algorithm in asynchronous message passing systems to perform shortcut operations on paths, a procedure which is important in approximation algorithms for the vehicle routing problem and its variants. We also show that our algorithm gives feasible solutions to the $k$-TSP problem with an approximation factor of $2$ in both centralized and distributed environments. The broad idea of the algorithm is to distribute the set of vertices into two subsets and construct paths for each salesman over each of the two subsets. Finally, we combine these pairwise disjoint paths for each salesman to obtain a set of paths that span the entire graph. This is similar to the algorithm by Yadlapalli et al. \cite{3.66} but differs in that it does not require minimum cost matching as a subroutine, and hence can be easily distributed.
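For readers unfamiliar with the "shortcut operation" mentioned above, the sketch below shows the standard centralized version used in routing approximations: a walk that revisits vertices is turned into a simple path by skipping every vertex already seen. This is an illustration only, not the distributed procedure of the paper; the function name `shortcut` and the sample walk are assumptions.

```python
from typing import Hashable, List, Sequence


def shortcut(walk: Sequence[Hashable]) -> List[Hashable]:
    """Keep the first occurrence of each vertex, preserving the walk order."""
    seen = set()
    path = []
    for v in walk:
        if v not in seen:
            seen.add(v)
            path.append(v)
    return path


# Illustrative walk that passes through "b" and "a" more than once.
walk = ["a", "b", "c", "b", "d", "b", "a"]
print(shortcut(walk))  # ['a', 'b', 'c', 'd']
```

When edge costs satisfy the triangle inequality, jumping directly to the next unvisited vertex never costs more than following the original walk, which is why shortcutting preserves approximation guarantees.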
Haike Xu, Jian Li, 2021
We consider the stochastic combinatorial semi-bandit problem with adversarial corruptions. We provide a simple combinatorial algorithm that can achieve a regret of $\tilde{O}\left(C+d^2K/\Delta_{\min}\right)$, where $C$ is the total amount of corruptions, $d$ is the maximal number of arms one can play in each round, and $K$ is the number of arms. If one selects only one arm in each round, we achieve a regret of $\tilde{O}\left(C+\sum_{\Delta_i>0}(1/\Delta_i)\right)$. Our algorithm is combinatorial and improves on the previous combinatorial algorithm by [Gupta et al., COLT2019] (their bound is $\tilde{O}\left(KC+\sum_{\Delta_i>0}(1/\Delta_i)\right)$), and almost matches the best known bounds obtained by [Zimmert et al., ICML2019] and [Zimmert and Seldin, AISTATS2019] (up to a logarithmic factor). Note that the algorithms in [Zimmert et al., ICML2019] and [Zimmert and Seldin, AISTATS2019] require one to solve complex convex programs, while our algorithm is combinatorial, very easy to implement, requires weaker assumptions, and has very low oracle complexity and running time. We also study the setting where we only get access to an approximation oracle for the stochastic combinatorial semi-bandit problem. Our algorithm achieves an (approximation) regret bound of $\tilde{O}\left(d\sqrt{KT}\right)$. Our algorithm is very simple, only worse than the best known regret bound by a factor of $\sqrt{d}$, and has much lower oracle complexity than previous work.
Thomas Fernique, 2010
A combinatorial substitution is a map over tilings which allows one to define sets of tilings with a strong hierarchical structure. In this paper, we show that such sets of tilings are sofic, that is, can be enforced by finitely many local constraints. This extends some similar previous results (Mozes90, Goodman-Strauss98) in a much shorter presentation.
In its most elementary form, compressed sensing studies the design of decoding algorithms to recover a sufficiently sparse vector or code from a lower dimensional linear measurement vector. Typically it is assumed that the decoder has access to the encoder matrix, which in the combinatorial case is sparse and binary. In this paper we consider the problem of designing a decoder to recover a set of sparse codes from their linear measurements alone, that is, without access to the encoder matrix. To this end we study the matrix factorisation task of recovering both the encoder and sparse coding matrices from the associated linear measurement matrix. The contribution of this paper is a computationally efficient decoding algorithm, Decoder-Expander Based Factorisation, with strong performance guarantees. In particular, under mild assumptions on the sparse coding matrix and by deploying a novel random encoder matrix, we prove that Decoder-Expander Based Factorisation recovers both the encoder and sparse coding matrix at the optimal measurement rate with high probability and from a near optimal number of measurement vectors. In addition, our experiments demonstrate the efficacy and computational efficiency of our algorithm in practice. Beyond compressed sensing, our results may be of interest for researchers working in areas such as linear sketching, coding theory and matrix compression.