Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Monotonically relaxing concurrent data-structure semantics for performance: An efficient 2D design framework

185 0 0.0 ( 0 )

Download Cite

Added by Adones Rukundo Mr

Publication date 2019

fields Informatics Engineering

and research's language is English

Authors Adones Rukundo - Aras Atalar - Philippas Tsigas

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

There has been a significant amount of work in the literature proposing semantic relaxation of concurrent data structures for improving scalability and performance. By relaxing the semantics of a data structure, a bigger design space, that allows weaker synchronization and more useful parallelism, is unveiled. Investigating new data structure designs, capable of trading semantics for achieving better performance in a monotonic way, is a major challenge in the area. We algorithmically address this challenge in this paper. We present an efficient, lock-free, concurrent data structure design framework for out-of-order semantic relaxation. Our framework introduces a new two dimensional algorithmic design, that uses multiple instances of a given data structure. The first dimension of our design is the number of data structure instances operations are spread to, in order to benefit from parallelism through disjoint memory access. The second dimension is the number of consecutive operations that try to use the same data structure instance in order to benefit from data locality. Our design can flexibly explore this two-dimensional space to achieve the property of monotonically relaxing concurrent data structure semantics for achieving better throughput performance within a tight deterministic relaxation bound, as we prove in the paper. We show how our framework can instantiate lock-free out-of-order queues, stacks, counters and dequeues. We provide implementations of these relaxed data structures and evaluate their performance and behaviour on two parallel architectures. Experimental evaluation shows that our two-dimensional data structures significantly outperform the respected previous proposed ones with respect to scalability and throughput performance. Moreover, their throughput increases monotonically as relaxation increases.

rate research

Efficient data streaming multiway aggregation through concurrent algorithmic designs and new abstract data types

57 - Vincenzo Gulisano , Yiannis Nikolakopoulos , Daniel Cederman 2016

Data streaming relies on continuous queries to process unbounded streams of data in a real-time fashion. It is commonly demanding in computation capacity, given that the relevant applications involve very large volumes of data. Data structures act as articulation points and maintain the state of data streaming operators, potentially supporting high parallelism and balancing the work between them. Prompted by this fact, in this work we study and analyze parallelization needs of these articulation points, focusing on the problem of streaming multiway aggregation, where large data volumes are received from multiple input streams. The analysis of the parallelization needs, as well as of the use and limitations of existing aggregate designs and their data structures, leads us to identify needs for proper shared objects that can achieve low-latency and high throughput multiway aggregation. We present the requirements of such objects as abstract data types and we provide efficient lock-free linearizable algorithmic implementations of them, along with new multiway aggregate algorithmic designs that leverage them, supporting both deterministic order-sensitive and order-insensitive aggregate functions. Furthermore, we point out future directions that open through these contributions. The paper includes an extensive experimental study, based on a variety of aggregation continuous queries on two large datasets extracted from SoundCloud, a music social network, and from a Smart Grid network. In all the experiments, the proposed data structures and the enhanced aggregate operators improved the processing performance significantly, up to one order of magnitude, in terms of both throughput and latency, over the commonly-used techniques based on queues.

Data Structures and Algorithms Distributed Parallel and Cluster Computing

An Efficient Data Structure for Dynamic Two-Dimensional Reconfiguration

70 - Sandor P. Fekete , Jan-Marc Reinhardt , Christian Scheffer 2017

In the presence of dynamic insertions and deletions into a partially reconfigurable FPGA, fragmentation is unavoidable. This poses the challenge of developing efficient approaches to dynamic defragmentation and reallocation. One key aspect is to develop efficient algorithms and data structures that exploit the two-dimensional geometry of a chip, instead of just one. We propose a new method for this task, based on the fractal structure of a quadtree, which allows dynamic segmentation of the chip area, along with dynamically adjusting the necessary communication infrastructure. We describe a number of algorithmic aspects, and present different solutions. We also provide a number of basic simulations that indicate that the theoretical worst-case bound may be pessimistic.

Data Structures and Algorithms

Communication-Efficient Jaccard Similarity for High-Performance Distributed Genome Comparisons

289 - Maciej Besta , Raghavendra Kanakagiri , Harun Mustafa 2019

The Jaccard similarity index is an important measure of the overlap of two sets, widely used in machine learning, computational genomics, information retrieval, and many other areas. We design and implement SimilarityAtScale, the first communication-efficient distributed algorithm for computing the Jaccard similarity among pairs of large datasets. Our algorithm provides an efficient encoding of this problem into a multiplication of sparse matrices. Both the encoding and sparse matrix product are performed in a way that minimizes data movement in terms of communication and synchronization costs. We apply our algorithm to obtain similarity among all pairs of a set of large samples of genomes. This task is a key part of modern metagenomics analysis and an evergrowing need due to the increasing availability of high-throughput DNA sequencing data. The resulting scheme is the first to enable accurate Jaccard distance derivations for massive datasets, using largescale distributed-memory systems. We package our routines in a tool, called GenomeAtScale, that combines the proposed algorithm with tools for processing input sequences. Our evaluation on real data illustrates that one can use GenomeAtScale to effectively employ tens of thousands of processors to reach new frontiers in large-scale genomic and metagenomic analysis. While GenomeAtScale can be used to foster DNA research, the more general underlying SimilarityAtScale algorithm may be used for high-performance distributed similarity computations in other data analytics application domains.

Computational Engineering Distributed Parallel and Cluster Computing Performance

Linearizable Iterators for Concurrent Sets

81 - Archita Agarwal , Zhiyu Liu , Eli Rosenthal 2017

This paper proposes a general framework for adding linearizable iterators to a class of data structures that implement set operations. We introduce a condition on set operations, called local consistency, which informally states that set operations never make elements unreachable to a sequential iterators traversal. We show that sets with locally consistent operations can be augmented with a linearizable iterator via the framework. Our technique is broadly applicable to a variety of data structures, including hash tables and binary search trees. We apply the technique to sets taken from existing literature, prove their operations are locally consistent, and demonstrate that iterators do not significantly affect the performance of concurrent set operations.

Data Structures and Algorithms Distributed Parallel and Cluster Computing

Dolha - an Efficient and Exact Data Structure for Streaming Graphs

357 - Fan Zhang , Lei Zou , Li Zeng 2019

A streaming graph is a graph formed by a sequence of incoming edges with time stamps. Unlike static graphs, the streaming graph is highly dynamic and time related. In the real world, the high volume and velocity streaming graphs such as internet traffic data, social network communication data and financial transfer data are bringing challenges to the classic graph data structures. We present a new data structure: double orthogonal list in hash table (Dolha) which is a high speed and high memory efficiency graph structure applicable to streaming graph. Dolha has constant time cost for single edge and near linear space cost that we can contain billions of edges information in memory size and process an incoming edge in nanoseconds. Dolha also has linear time cost for neighborhood queries, which allow it to support most algorithms in graphs without extra cost. We also present a persistent structure based on Dolha that has the ability to handle the sliding window update and time related queries.

Data Structures and Algorithms

comments

Fetching comments

Institut National d'Administration

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Monotonically relaxing concurrent data-structure semantics for performance: An efficient 2D design framework

Ask ChatGPT about the research

No Arabic abstract

Read More