ترغب بنشر مسار تعليمي؟ اضغط هنا

Parallel Write-Efficient Algorithms and Data Structures for Computational Geometry

80   0   0.0 ( 0 )
 نشر من قبل Yan Gu
 تاريخ النشر 2018
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In this paper, we design parallel write-efficient geometric algorithms that perform asymptotically fewer writes than standard algorithms for the same problem. This is motivated by emerging non-volatile memory technologies with read performance being close to that of random access memory but writes being significantly more expensive in terms of energy and latency. We design algorithms for planar Delaunay triangulation, $k$-d trees, and static and dynamic augmented trees. Our algorithms are designed in the recently introduced Asymmetric Nested-Parallel Model, which captures the parallel setting in which there is a small symmetric memory where reads and writes are unit cost as well as a large asymmetric memory where writes are $omega$ times more expensive than reads. In designing these algorithms, we introduce several techniques for obtaining write-efficiency, including DAG tracing, prefix doubling, reconstruction-based rebalancing and $alpha$-labeling, which we believe will be useful for designing other parallel write-efficient algorithms.

قيم البحث

اقرأ أيضاً

121 - Xiaojun Dong , Yan Gu , Yihan Sun 2021
In this paper, we study the single-source shortest-path (SSSP) problem with positive edge weights, which is a notoriously hard problem in the parallel context. In practice, the $Delta$-stepping algorithm proposed by Meyer and Sanders has been widely adopted. However, $Delta$-stepping has no known worst-case bounds for general graphs. The performance of $Delta$-stepping also highly relies on the parameter $Delta$. There have also been lots of algorithms with theoretical bounds, such as Radius-stepping, but they either have no implementations available or are much slower than $Delta$-stepping in practice. We propose a stepping algorithm framework that generalizes existing algorithms such as $Delta$-stepping and Radius-stepping. The framework allows for similar analysis and implementations of all stepping algorithms. We also propose a new ADT, lazy-batched priority queue (LaB-PQ), that abstracts the semantics of the priority queue needed by the stepping algorithms. We provide two data structures for LaB-PQ, focusing on theoretical and practical efficiency, respectively. Based on the new framework and LaB-PQ, we show two new stepping algorithms, $rho$-stepping and $Delta^*$-stepping, that are simple, with non-trivial worst-case bounds, and fast in practice. The stepping algorithm framework also provides almost identical implementations for three algorithms: Bellman-Ford, $Delta^*$-stepping, and $rho$-stepping. We compare our code with four state-of-the-art implementations. On five social and web graphs, $rho$-stepping is 1.3--2.5x faster than all the existing implementations. On two road graphs, our $Delta^*$-stepping is at least 14% faster than existing implementations, while $rho$-stepping is also competitive. The almost identical implementations for stepping algorithms also allow for in-depth analyses and comparisons among the stepping algorithms in practice.
Hierarchical clustering is a fundamental task often used to discover meaningful structures in data, such as phylogenetic trees, taxonomies of concepts, subtypes of cancer, and cascades of particle decays in particle physics. Typically approximate alg orithms are used for inference due to the combinatorial number of possible hierarchical clusterings. In contrast to existing methods, we present novel dynamic-programming algorithms for emph{exact} inference in hierarchical clustering based on a novel trellis data structure, and we prove that we can exactly compute the partition function, maximum likelihood hierarchy, and marginal probabilities of sub-hierarchies and clusters. Our algorithms scale in time and space proportional to the powerset of $N$ elements which is super-exponentially more efficient than explicitly considering each of the (2N-3)!! possible hierarchies. Also, for larger datasets where our exact algorithms become infeasible, we introduce an approximate algorithm based on a sparse trellis that compares well to other benchmarks. Exact methods are relevant to data analyses in particle physics and for finding correlations among gene expression in cancer genomics, and we give examples in both areas, where our algorithms outperform greedy and beam search baselines. In addition, we consider Dasguptas cost with synthetic data.
The data-driven computing paradigm initially introduced by Kirchdoerfer and Ortiz (2016) enables finite element computations in solid mechanics to be performed directly from material data sets, without an explicit material model. From a computational effort point of view, the most challenging task is the projection of admissible states at material points onto their closest states in the material data set. In this study, we compare and develop several possible data structures for solving the nearest-neighbor problem. We show that approximate nearest-neighbor (ANN) algorithms can accelerate material data searches by several orders of magnitude relative to exact searching algorithms. The approximations are suggested by--and adapted to--the structure of the data-driven iterative solver and result in no significant loss of solution accuracy. We assess the performance of the ANN algorithm with respect to material data set size with the aid of a 3D elasticity test case. We show that computations on a single processor with up to one billion material data points are feasible within a few seconds execution time with a speedup of more than 106 with respect to exact k-d trees.
80 - Jan van den Brand 2020
Many algorithms use data structures that maintain properties of matrices undergoing some changes. The applications are wide-ranging and include for example matchings, shortest paths, linear programming, semi-definite programming, convex hull and volu me computation. Given the wide range of applications, the exact property these data structures must maintain varies from one application to another, forcing algorithm designers to invent them from scratch or modify existing ones. Thus it is not surprising that these data structures and their proofs are usually tailor-made for their specific application and that maintaining more complicated properties results in more complicated proofs. In this paper we present a unifying framework that captures a wide range of these data structures. The simplicity of this framework allows us to give short proofs for many existing data structures regardless of how complicated the to be maintained property is. We also show how the framework can be used to speed up existing iterative algorithms, such as the simplex algorithm. More formally, consider any rational function $f(A_1,...,A_d)$ with input matrices $A_1,...,A_d$. We show that the task of maintaining $f(A_1,...,A_d)$ under updates to $A_1,...,A_d$ can be reduced to the much simpler problem of maintaining some matrix inverse $M^{-1}$ under updates to $M$. The latter is a well studied problem called dynamic matrix inverse. By applying our reduction and using known algorithms for dynamic matrix inverse we can obtain fast data structures and iterative algorithms for much more general problems.
Over the past decade, there has been increasing interest in distributed/parallel algorithms for processing large-scale graphs. By now, we have quite fast algorithms -- usually sublogarithmic-time and often $poly(loglog n)$-time, or even faster -- for a number of fundamental graph problems in the massively parallel computation (MPC) model. This model is a widely-adopted theoretical abstraction of MapReduce style settings, where a number of machines communicate in an all-to-all manner to process large-scale data. Contributing to this line of work on MPC graph algorithms, we present $poly(log k) in poly(loglog n)$ round MPC algorithms for computing $O(k^{1+{o(1)}})$-spanners in the strongly sublinear regime of local memory. To the best of our knowledge, these are the first sublogarithmic-time MPC algorithms for spanner construction. As primary applications of our spanners, we get two important implications, as follows: -For the MPC setting, we get an $O(log^2log n)$-round algorithm for $O(log^{1+o(1)} n)$ approximation of all pairs shortest paths (APSP) in the near-linear regime of local memory. To the best of our knowledge, this is the first sublogarithmic-time MPC algorithm for distance approximations. -Our result above also extends to the Congested Clique model of distributed computing, with the same round complexity and approximation guarantee. This gives the first sub-logarithmic algorithm for approximating APSP in weighted graphs in the Congested Clique model.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا