ترغب بنشر مسار تعليمي؟ اضغط هنا

Speeding simulation analysis up with yt and Intel Distribution for Python

80   0   0.0 ( 0 )
 نشر من قبل Salvatore Cielo
 تاريخ النشر 2019
والبحث باللغة English




اسأل ChatGPT حول البحث

As modern scientific simulations grow ever more in size and complexity, even their analysis and post-processing becomes increasingly demanding, calling for the use of HPC resources and methods. yt is a parallel, open source post-processing python package for numerical simulations in astrophysics, made popular by its cross-format compatibility, its active community of developers and its integration with several other professional Python instruments. The Intel Distribution for Python enhances yts performance and parallel scalability, through the optimization of lower-level libraries Numpy and Scipy, which make use of the optimized Intel Math Kernel Library (Intel-MKL) and the Intel MPI library for distributed computing. The library package yt is used for several analysis tasks, including integration of derived quantities, volumetric rendering, 2D phase plots, cosmological halo analysis and production of synthetic X-ray observation. In this paper, we provide a brief tutorial for the installation of yt and the Intel Distribution for Python, and the execution of each analysis task. Compared to the Anaconda python distribution, using the provided solution one can achieve net speedups up to 4.6x on Intel Xeon Scalable processors (codename Skylake).



قيم البحث

اقرأ أيضاً

We perform a detailed analysis of the C++ implementation of the Cluster Affiliation Model for Big Networks (BigClam) on the Stanford Network Analysis Project (SNAP). BigClam is a popular graph mining algorithm that is capable of finding overlapping c ommunities in networks containing millions of nodes. Our analysis shows a key stage of the algorithm - determining if a node belongs to a community - dominates the runtime of the implementation, yet the computation is not parallelized. We show that by parallelizing computations across multiple threads using OpenMP we can speed up the algorithm by 5.3 times when solving large networks for communities, while preserving the integrity of the program and the result.
Datacenters provide the infrastructure for cloud computing services used by millions of users everyday. Many such services are distributed over multiple datacenters at geographically distant locations possibly in different continents. These datacente rs are then connected through high speed WAN links over private or public networks. To perform data backups or data synchronization operations, many transfers take place over these networks that have to be completed before a deadline in order to provide necessary service guarantees to end users. Upon arrival of a transfer request, we would like the system to be able to decide whether such a request can be guaranteed successful delivery. If yes, it should provide us with transmission schedule in the shortest time possible. In addition, we would like to avoid packet reordering at the destination as it affects TCP performance. Previous work in this area either cannot guarantee that admitted transfers actually finish before the specified deadlines or use techniques that can result in packet reordering. In this paper, we propose DCRoute, a fast and efficient routing and traffic allocation technique that guarantees transfer completion before deadlines for admitted requests. It assigns each transfer a single path to avoid packet reordering. Through simulations, we show that DCRoute is at least 200 times faster than other traffic allocation techniques based on linear programming (LP) while admitting almost the same amount of traffic to the system.
Python has become the de facto language for scientific computing. Programming in Python is highly productive, mainly due to its rich science-oriented software ecosystem built around the NumPy module. As a result, the demand for Python support in High Performance Computing (HPC) has skyrocketed. However, the Python language itself does not necessarily offer high performance. In this work, we present a workflow that retains Pythons high productivity while achieving portable performance across different architectures. The workflows key features are HPC-oriented language extensions and a set of automatic optimizations powered by a data-centric intermediate representation. We show performance results and scaling across CPU, GPU, FPGA, and the Piz Daint supercomputer (up to 23,328 cores), with 2.47x and 3.75x speedups over previous-best solutions, first-ever Xilinx and Intel FPGA results of annotated Python, and up to 93.16% scaling efficiency on 512 nodes.
127 - Matthew J. Turk 2011
The usage of the high-level scripting language Python has enabled new mechanisms for data interrogation, discovery and visualization of scientific data. We present yt, an open source, community-developed astrophysical analysis and visualization toolk it for data generated by high-performance computing (HPC) simulations of astrophysical phenomena. Through a separation of responsibilities in the underlying Python code, yt allows data generated by incompatible, and sometimes even directly competing, astrophysical simulation platforms to be analyzed in a consistent manner, focusing on physically relevant quantities rather than quantities native to astrophysical simulation codes. We present on its mechanisms for data access, capabilities for MPI-parallel analysis, and its implementation as an in situ analysis and visualization tool.
Data engineering is becoming an increasingly important part of scientific discoveries with the adoption of deep learning and machine learning. Data engineering deals with a variety of data formats, storage, data extraction, transformation, and data m ovements. One goal of data engineering is to transform data from original data to vector/matrix/tensor formats accepted by deep learning and machine learning applications. There are many structures such as tables, graphs, and trees to represent data in these data engineering phases. Among them, tables are a versatile and commonly used format to load and process data. In this paper, we present a distributed Python API based on table abstraction for representing and processing data. Unlike existing state-of-the-art data engineering tools written purely in Python, our solution adopts high performance compute kernels in C++, with an in-memory table representation with Cython-based Python bindings. In the core system, we use MPI for distributed memory computations with a data-parallel approach for processing large datasets in HPC clusters.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا