On the Scalability of Data Reduction Techniques in Current and Upcoming HPC Systems from an Application Perspective

Submitted by Axel Huebl
Publication date: 2017
Paper language: English





We implement and benchmark parallel I/O methods for the fully-manycore driven particle-in-cell code PIConGPU. Identifying throughput and overall I/O size as a major challenge for applications on today's and future HPC systems, we present a scaling law characterizing performance bottlenecks in state-of-the-art approaches for data reduction. Consequently, we propose, implement and verify multi-threaded data-transformations for the I/O library ADIOS as a feasible way to trade underutilized host-side compute potential on heterogeneous systems for reduced I/O latency.
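Neither the paper's exact transform code nor the ADIOS-internal API is reproduced here; the following is only a minimal sketch, assuming zlib as the compressor and std::thread for host-side parallelism, of the general idea behind a multi-threaded data transformation: otherwise idle host cores compress independent slices of an output buffer concurrently, so that less data has to cross the file-system bottleneck.

    #include <algorithm>
    #include <cstdint>
    #include <thread>
    #include <vector>
    #include <zlib.h>

    // Illustrative helper: compress one slice of the output buffer on a host core.
    // zlib level 1 favors speed over ratio, matching the latency-oriented use case.
    static void compressChunk(const unsigned char* src, uLong srcLen,
                              std::vector<unsigned char>& dst)
    {
        uLongf dstLen = compressBound(srcLen);
        dst.resize(dstLen);
        if (compress2(dst.data(), &dstLen, src, srcLen, /*level=*/1) == Z_OK)
            dst.resize(dstLen);                  // shrink to the actual compressed size
        else
            dst.assign(src, src + srcLen);       // fall back to storing uncompressed
    }

    // Compress `buffer` with `nThreads` host threads; each thread handles one slice.
    std::vector<std::vector<unsigned char>>
    compressParallel(const std::vector<unsigned char>& buffer, unsigned nThreads)
    {
        std::vector<std::vector<unsigned char>> out(nThreads);
        std::vector<std::thread> workers;
        const std::size_t chunk = (buffer.size() + nThreads - 1) / nThreads;
        for (unsigned t = 0; t < nThreads; ++t)
        {
            const std::size_t begin = std::min(t * chunk, buffer.size());
            const std::size_t end   = std::min(begin + chunk, buffer.size());
            workers.emplace_back(compressChunk, buffer.data() + begin,
                                 static_cast<uLong>(end - begin), std::ref(out[t]));
        }
        for (auto& w : workers) w.join();
        return out;   // compressed chunks would then be handed to the I/O layer (e.g. ADIOS)
    }

A real I/O library would additionally record per-chunk offsets and original sizes in its metadata so the data can be decompressed transparently on read.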




Read also

We evaluate the performance of Devito, a domain specific language (DSL) for finite differences, on Arm ThunderX2 processors. Experiments with two common seismic computational kernels demonstrate that Arm processors can deliver competitive performance compared to Intel Xeon processors.
As the rate of data collection continues to grow rapidly, developing visualization tools that scale to immense data sets is a serious and ever-increasing challenge. Existing approaches generally seek to decouple storage and visualization systems, performing just-in-time data reduction to transparently avoid overloading the visualizer. We present a new architecture in which the visualizer and data store are tightly coupled. Unlike systems that read raw data from storage, the performance of our system scales linearly with the size of the final visualization, essentially independent of the size of the data. Thus, it scales to massive data sets while supporting interactive performance (sub-100 ms query latency). This enables a new class of visualization clients that automatically manage data, quickly and transparently requesting data from the underlying database without requiring the user to explicitly initiate queries. It lays the groundwork for supporting truly interactive exploration of big data and opens new directions for research on scalable information visualization systems.
Hardware platforms in high performance computing are constantly getting more complex to handle even when considering multicore CPUs alone. Numerous features and configuration options in the hardware and the software environment that are relevant for performance are not even known to most application users or developers. Microbenchmarks, i.e., simple codes that fathom a particular aspect of the hardware, can help to shed light on such issues, but only if they are well understood and if the results can be reconciled with known facts or performance models. The insight gained from microbenchmarks may then be applied to real applications for performance analysis or optimization. In this paper we investigate two modern Intel x86 server CPU architectures in depth: Broadwell EP and Cascade Lake SP. We highlight relevant hardware configuration settings that can have a decisive impact on code performance and show how to properly measure on-chip and off-chip data transfer bandwidths. The new victim L3 cache of Cascade Lake and its advanced replacement policy receive due attention. Finally we use DGEMM, sparse matrix-vector multiplication, and the HPCG benchmark to make a connection to relevant application scenarios.
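The abstract does not spell out the measurement code; as a rough illustration of how off-chip data-transfer bandwidth is commonly probed, the sketch below times a STREAM-triad-style loop (array size and repetition count are placeholders, not values from the paper) and reports an effective bandwidth. Shrinking the arrays until they fit into a given cache level turns the same kernel into an on-chip bandwidth probe.

    #include <chrono>
    #include <cstddef>
    #include <cstdio>
    #include <vector>

    int main()
    {
        // Placeholder size, assumed to be far larger than the L3 cache so the
        // triad streams from and to main memory (~512 MiB per array here).
        const std::size_t N = std::size_t(1) << 26;
        const int reps = 10;
        std::vector<double> a(N, 1.0), b(N, 2.0), c(N, 0.0);
        const double s = 3.0;

        auto t0 = std::chrono::steady_clock::now();
        for (int r = 0; r < reps; ++r)
        {
            #pragma omp parallel for        // optional: use all cores of a socket
            for (long long i = 0; i < (long long)N; ++i)
                c[i] = a[i] + s * b[i];     // STREAM "triad": 2 loads + 1 store
        }
        auto t1 = std::chrono::steady_clock::now();
        const double sec = std::chrono::duration<double>(t1 - t0).count();

        // 3 arrays * 8 bytes per element cross the memory interface per iteration;
        // write-allocate traffic would add a fourth stream on many x86 CPUs.
        const double gbytes = 3.0 * 8.0 * double(N) * reps / 1e9;
        std::printf("triad bandwidth: %.1f GB/s\n", gbytes / sec);
        return 0;
    }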
Recently Arm introduced a new instruction set called Scalable Vector Extension (SVE), which supports vector lengths up to 2048 bits. While SVE hardware will not be generally available until about 2021, we believe that future SVE-based architectures will have great potential for Lattice QCD. In this contribution we discuss key aspects of SVE and describe how we implemented SVE in the Grid Lattice QCD framework.
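Grid's actual SVE code is not shown in the abstract; the snippet below is merely a generic illustration of the vector-length-agnostic style that ACLE SVE intrinsics enable, using an axpy kernel as a stand-in (not a Grid routine): the same binary runs unchanged on any hardware vector length from 128 up to the 2048 bits mentioned above.

    #include <arm_sve.h>   // needs an SVE-enabled compiler, e.g. -march=armv8-a+sve
    #include <cstdint>

    // y[i] += a * x[i], written without assuming a fixed vector length.
    void axpy_sve(double a, const double* x, double* y, std::uint64_t n)
    {
        for (std::uint64_t i = 0; i < n; i += svcntd())   // svcntd(): doubles per vector
        {
            svbool_t pg = svwhilelt_b64_u64(i, n);        // predicate masks the loop tail
            svfloat64_t xv = svld1_f64(pg, x + i);
            svfloat64_t yv = svld1_f64(pg, y + i);
            yv = svmla_n_f64_x(pg, yv, xv, a);            // yv + xv * a
            svst1_f64(pg, y + i, yv);
        }
    }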
Performance tests and analyses are critical to effective HPC software development and are central components in the design and implementation of computational algorithms for achieving faster simulations on existing and future computing architectures for large-scale application problems. In this paper, we explore performance and space-time trade-offs for important compute-intensive kernels of large-scale numerical solvers for PDEs that govern a wide range of physical applications. We consider a sequence of PDE-motivated bake-off problems designed to establish best practices for efficient high-order simulations across a variety of codes and platforms. We measure peak performance (degrees of freedom per second) on a fixed number of nodes and identify effective code optimization strategies for each architecture. In addition to peak performance, we identify the minimum time to solution at 80% parallel efficiency. The performance analysis is based on spectral and p-type finite elements but is equally applicable to a broad spectrum of numerical PDE discretizations, including finite difference, finite volume, and h-type finite elements.
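The two headline metrics of that comparison can be made concrete with a short helper (the function names below are illustrative, not taken from the paper): throughput as degrees of freedom advanced per second, and strong-scaling parallel efficiency relative to a baseline run, from which the 80% threshold behind the minimum time to solution can be read off.

    #include <cstdio>

    // Degrees of freedom advanced per second: n_dof unknowns, n_steps time steps,
    // t_solve seconds of solver wall-clock time.
    double dofPerSecond(double n_dof, double n_steps, double t_solve)
    {
        return n_dof * n_steps / t_solve;
    }

    // Strong-scaling parallel efficiency of a run on `nodes` nodes, measured
    // against a baseline run (`nodes0`, `time0`) of the same problem.
    double parallelEfficiency(double nodes0, double time0, double nodes, double time)
    {
        return (nodes0 * time0) / (nodes * time);
    }

    int main()
    {
        // Hypothetical numbers: doubling the node count ideally halves the runtime;
        // here 10 s only drops to 6 s, i.e. about 83% efficiency (still above 80%).
        std::printf("efficiency: %.2f\n", parallelEfficiency(1, 10.0, 2, 6.0));
    }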