بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

Bounds on series-parallel slowdown

308 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Andr\\'as Salamon

تاريخ النشر 2009

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Andras Z. Salamon - Vashti Galpin

النظم الموزعة والتوازية والحوسبة العنقودية التعقيد الحسابي الأداء

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

We use activity networks (task graphs) to model parallel programs and consider series-parallel extensions of these networks. Our motivation is two-fold: the benefits of series-parallel activity networks and the modelling of programming constructs, such as those imposed by current parallel computing environments. Series-parallelisation adds precedence constraints to an activity network, usually increasing its makespan (execution time). The slowdown ratio describes how additional constraints affect the makespan. We disprove an existing conjecture positing a bound of two on the slowdown when workload is not considered. Where workload is known, we conjecture that 4/3 slowdown is always achievable, and prove our conjecture for small networks using max-plus algebra. We analyse a polynomial-time algorithm showing that achieving 4/3 slowdown is in exp-APX. Finally, we discuss the implications of our results.

قيم البحث

136 - Grzegorz Kwasniewski , Marko Kabic , Tal Ben-Nun 2021

Matrix factorizations are among the most important building blocks of scientific computing. State-of-the-art libraries, however, are not communication-optimal, underutilizing current parallel architectures. We present novel algorithms for Cholesky an d LU factorizations that utilize an asymptotically communication-optimal 2.5D decomposition. We first establish a theoretical framework for deriving parallel I/O lower bounds for linear algebra kernels, and then utilize its insights to derive Cholesky and LU schedules, both communicating N^3/(P*sqrt(M)) elements per processor, where M is the local memory size. The empirical results match our theoretical analysis: our implementations communicate significantly less than Intel MKL, SLATE, and the asymptotically communication-optimal CANDMC and CAPITAL libraries. Our code outperforms these state-of-the-art libraries in almost all tested scenarios, with matrix sizes ranging from 2,048 to 262,144 on up to 512 CPU nodes of the Piz Daint supercomputer, decreasing the time-to-solution by up to three times. Our code is ScaLAPACK-compatible and available as an open-source library.

النظم الموزعة والتوازية والحوسبة العنقودية التعقيد الحسابي الأداء

A Note on Parallel Algorithmic Speedup Bounds

401 - Neil J. Gunther 2011

A parallel program can be represented as a directed acyclic graph. An important performance bound is the time to execute the critical path through the graph. We show how this performance metric is related to Amdahl speedup and the degree of average p arallelism. These bounds formally exclude superlinear performance.

النظم الموزعة والتوازية والحوسبة العنقودية الأداء

On Maximum Common Subgraph Problems in Series-Parallel Graphs

65 - Nils Kriege , Florian Kurpicz , Petra Mutzel 2017

The complexity of the maximum common connected subgraph problem in partial $k$-trees is still not fully understood. Polynomial-time solutions are known for degree-bounded outerplanar graphs, a subclass of the partial $2$-trees. On the other hand, the problem is known to be ${bf NP}$-hard in vertex-labeled partial $11$-trees of bounded degree. We consider series-parallel graphs, i.e., partial $2$-trees. We show that the problem remains ${bf NP}$-hard in biconnected series-parallel graphs with all but one vertex of degree $3$ or less. A positive complexity result is presented for a related problem of high practical relevance which asks for a maximum common connected subgraph that preserves blocks and bridges of the input graphs. We present a polynomial time algorithm for this problem in series-parallel graphs, which utilizes a combination of BC- and SP-tree data structures to decompose both graphs.

بنى وهياكل البيانات والخوارزميات التعقيد الحسابي

Topology Dependent Bounds For FAQs

245 - Michael Langberg , Shi Li , Sai Vikneshwar Mani Jayaraman 2020

In this paper, we prove topology dependent bounds on the number of rounds needed to compute Functional Aggregate Queries (FAQs) studied by Abo Khamis et al. [PODS 2016] in a synchronous distributed network under the model considered by Chattopadhyay et al. [FOCS 2014, SODA 2017]. Unlike the recent work on computing database queries in the Massively Parallel Computation model, in the model of Chattopadhyay et al., nodes can communicate only via private point-to-point channels and we are interested in bounds that work over an {em arbitrary} communication topology. This is the first work to consider more practically motivated problems in this distributed model. For the sake of exposition, we focus on two special problems in this paper: Boolean Conjunctive Query (BCQ) and computing variable/factor marginals in Probabilistic Graphical Models (PGMs). We obtain tight bounds on the number of rounds needed to compute such queries as long as the underlying hypergraph of the query is $O(1)$-degenerate and has $O(1)$-arity. In particular, the $O(1)$-degeneracy condition covers most well-studied queries that are efficiently computable in the centralized computation model like queries with constant treewidth. These tight bounds depend on a new notion of `width (namely internal-node-width) for Generalized Hypertree Decompositions (GHDs) of acyclic hypergraphs, which minimizes the number of internal nodes in a sub-class of GHDs. To the best of our knowledge, this width has not been studied explicitly in the theoretical database literature. Finally, we consider the problem of computing the product of a vector with a chain of matrices and prove tight bounds on its round complexity (over the finite field of two elements) using a novel min-entropy based argument.

النظم الموزعة والتوازية والحوسبة العنقودية التعقيد الحسابي قواعد البيانات

Lower bounds for maximal matchings and maximal independent sets

79 - Alkida Balliu , Sebastian Brandt , Juho Hirvonen 2019

There are distributed graph algorithms for finding maximal matchings and maximal independent sets in $O(Delta + log^* n)$ communication rounds; here $n$ is the number of nodes and $Delta$ is the maximum degree. The lower bound by Linial (1987, 1992) shows that the dependency on $n$ is optimal: these problems cannot be solved in $o(log^* n)$ rounds even if $Delta = 2$. However, the dependency on $Delta$ is a long-standing open question, and there is currently an exponential gap between the upper and lower bounds. We prove that the upper bounds are tight. We show that maximal matchings and maximal independent sets cannot be found in $o(Delta + log log n / log log log n)$ rounds with any randomized algorithm in the LOCAL model of distributed computing. As a corollary, it follows that there is no deterministic algorithm for maximal matchings or maximal independent sets that runs in $o(Delta + log n / log log n)$ rounds; this is an improvement over prior lower bounds also as a function of $n$.

النظم الموزعة والتوازية والحوسبة العنقودية التعقيد الحسابي

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

الجامعة الدولية الخاصة للعلوم والتكنولوجيا

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Bounds on series-parallel slowdown

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً