Advanced search powered by artificial intelligence

New community

Subscribe to the gold package and get unlimited access to Shamra Academy

Petascale Computational Systems

128 0 0.0 ( 0 )

Download Cite

Added by Jim Gray

Publication date 2007

fields Informatics Engineering

and research's language is English

Authors Gordon Bell - Jim Gray - Alex Szalay

Databases Hardware Architecture

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

Computational science is changing to be data intensive. Super-Computers must be balanced systems; not just CPU farms but also petascale IO and networking arrays. Anyone building CyberInfrastructure should allocate resources to support a balanced Tier-1 through Tier-3 design.

rate research

A Massive Data Parallel Computational Framework for Petascale/Exascale Hybrid Computer Systems

525 - Marek Blazewicz , Steven R. Brandt , Peter Diener 2012

Heterogeneous systems are becoming more common on High Performance Computing (HPC) systems. Even using tools like CUDA and OpenCL it is a non-trivial task to obtain optimal performance on the GPU. Approaches to simplifying this task include Merge (a library based framework for heterogeneous multi-core systems), Zippy (a framework for parallel execution of codes on multiple GPUs), BSGP (a new programming language for general purpose computation on the GPU) and CUDA-lite (an enhancement to CUDA that transforms code based on annotations). In addition, efforts are underway to improve compiler tools for automatic parallelization and optimization of affine loop nests for GPUs and for automatic translation of OpenMP parallelized codes to CUDA. In this paper we present an alternative approach: a new computational framework for the development of massively data parallel scientific codes applications suitable for use on such petascale/exascale hybrid systems built upon the highly scalable Cactus framework. As the first non-trivial demonstration of its usefulness, we successfully developed a new 3D CFD code that achieves improved performance.

Distributed Parallel and Cluster Computing

A Datalog-based Computational Model for Coordination-free, Data-Parallel Systems

397 - Matteo Interlandi , Letizia Tanca 2014

Cloud computing refers to maximizing efficiency by sharing computational and storage resources, while data-parallel systems exploit the resources available in the cloud to perform parallel transformations over large amounts of data. In the same line, considerable emphasis has been recently given to two apparently disjoint research topics: data-parallel, and eventually consistent, distributed systems. Declarative networking has been recently proposed to ease the task of programming in the cloud, by allowing the programmer to express only the desired result and leave the implementation details to the responsibility of the run-time system. In this context, we propose a study on a logic-programming-based computational model for eventually consistent, data-parallel systems, the keystone of which is provided by the recent finding that the class of programs that can be computed in an eventually consistent, coordination-free way is that of monotonic programs. This principle is called CALM and has been proven by Ameloot et al. for distributed, asynchronous settings. We advocate that CALM should be employed as a basic theoretical tool also for data-parallel systems, wherein computation usually proceeds synchronously in rounds and where communication is assumed to be reliable. It is general opinion that coordination-freedom can be seen as a major discriminant factor. In this work we make the case that the current form of CALM does not hold in general for data-parallel systems, and show how, using novel techniques, the satisfiability of the CALM principle can still be obtained although just for the subclass of programs called connected monotonic queries. We complete the study with considerations on the relationships between our model and the one employed by Ameloot et al., showing that our techniques subsume the latter when the synchronization constraints imposed on the system are loosened.

Databases

Towards Loosely-Coupled Programming on Petascale Systems

554 - Ioan Raicu , Zhao Zhang , Mike Wilde 2008

We have extended the Falkon lightweight task execution framework to make loosely coupled programming on petascale systems a practical and useful programming model. This work studies and measures the performance factors involved in applying this approach to enable the use of petascale systems by a broader user community, and with greater ease. Our work enables the execution of highly parallel computations composed of loosely coupled serial jobs with no modifications to the respective applications. This approach allows a new-and potentially far larger-class of applications to leverage petascale systems, such as the IBM Blue Gene/P supercomputer. We present the challenges of I/O performance encountered in making this model practical, and show results using both microbenchmarks and real applications from two domains: economic energy modeling and molecular dynamics. Our benchmarks show that we can scale up to 160K processor-cores with high efficiency, and can achieve sustained execution rates of thousands of tasks per second.

Distributed Parallel and Cluster Computing

Efficiently Reclaiming Space in a Log Structured Store

167 - David Lomet 2020

A log structured store uses a single write I/O for a number of diverse and non-contiguous pages within a large buffer instead of using a write I/O for each page separately. This requires that pages be relocated on every write, because pages are never updated in place. Instead, pages are dynamically remapped on every write. Log structuring was invented for and used initially in file systems. Today, a form of log structuring is used in SSD controllers because an SSD requires the erasure of a large block of pages before flash storage can be reused. No update-in-place requires that the storage for out-of-date pages be reclaimed (garbage collected or cleaned). We analyze cleaning performance and introduce a cleaning strategy that uses a new way to prioritize the order in which stale pages are garbage collected. Our cleaning strategy approximates an optimal cleaning strategy. Simulation studies confirm the results of the analysis. This strategy is a significant improvement over previous cleaning strategies.

Databases Hardware Architecture Performance

G-TADOC: Enabling Efficient GPU-Based Text Analytics without Decompression

207 - Feng Zhang , Zaifeng Pan , Yanliang Zhou 2021

Text analytics directly on compression (TADOC) has proven to be a promising technology for big data analytics. GPUs are extremely popular accelerators for data analytics systems. Unfortunately, no work so far shows how to utilize GPUs to accelerate TADOC. We describe G-TADOC, the first framework that provides GPU-based text analytics directly on compression, effectively enabling efficient text analytics on GPUs without decompressing the input data. G-TADOC solves three major challenges. First, TADOC involves a large amount of dependencies, which makes it difficult to exploit massive parallelism on a GPU. We develop a novel fine-grained thread-level workload scheduling strategy for GPU threads, which partitions heavily-dependent loads adaptively in a fine-grained manner. Second, in developing G-TADOC, thousands of GPU threads writing to the same result buffer leads to inconsistency while directly using locks and atomic operations lead to large synchronization overheads. We develop a memory pool with thread-safe data structures on GPUs to handle such difficulties. Third, maintaining the sequence information among words is essential for lossless compression. We design a sequence-support strategy, which maintains high GPU parallelism while ensuring sequence information. Our experimental evaluations show that G-TADOC provides 31.1x average speedup compared to state-of-the-art TADOC.

Databases Hardware Architecture

comments

Fetching comments

King AbdulAziz University

Additional details More universities

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

Petascale Computational Systems

Ask ChatGPT about the research

No Arabic abstract

Computational science is changing to be data intensive. Super-Computers must be balanced systems; not just CPU farms but also petascale IO and networking arrays. Anyone building CyberInfrastructure should allocate resources to support a balanced Tier-1 through Tier-3 design.

Read More