بحث متقدم مدعوم من الذكاء الصنعي

مساحة جديدة

اشترك بالحزمة الذهبية واحصل على وصول غير محدود شمرا أكاديميا

تسجيل مستخدم جديد

BSMBench: a flexible and scalable supercomputer benchmark from computational particle physics

449 0 0.0 ( 0 )

تحميل البحث استخدام كمرجع

نشر من قبل Ed Bennett

تاريخ النشر 2014

مجال البحث الهندسة المعلوماتية

والبحث باللغة English

تأليف Ed Bennett - Luigi Del Debbio - Kirk Jordan

النظم الموزعة والتوازية والحوسبة العنقودية فيزياء الطاقة العالية - شعرية

قم بزيارة صفحتنا على فيسبوك

‎Shamra Academia - شمرا أكاديميا‎

اسأل ChatGPT حول البحث

الملخص بالعربية الملخص بالإنكليزية

Lattice Quantum ChromoDynamics (QCD), and by extension its parent field, Lattice Gauge Theory (LGT), make up a significant fraction of supercomputing cycles worldwide. As such, it would be irresponsible not to evaluate machines suitability for such applications. To this end, a benchmark has been developed to assess the performance of LGT applications on modern HPC platforms. Distinct from previous QCD-based benchmarks, this allows probing the behaviour of a variety of theories, which allows varying the ratio of demands between on-node computations and inter-node communications. The results of testing this benchmark on various recent HPC platforms are presented, and directions for future development are discussed.

قيم البحث

66 - Mark Hamilton , Sudarshan Raghunathan , Akshaya Annavajhala 2018

In this work we detail a novel open source library, called MMLSpark, that combines the flexible deep learning library Cognitive Toolkit, with the distributed computing framework Apache Spark. To achieve this, we have contributed Java Language binding s to the Cognitive Toolkit, and added several new components to the Spark ecosystem. In addition, we also integrate the popular image processing library OpenCV with Spark, and present a tool for the automated generation of PySpark wrappers from any SparkML estimator and use this tool to expose all work to the PySpark ecosystem. Finally, we provide a large library of tools for working and developing within the Spark ecosystem. We apply this work to the automated classification of Snow Leopards from camera trap images, and provide an end to end solution for the non-profit conservation organization, the Snow Leopard Trust.

النظم الموزعة والتوازية والحوسبة العنقودية التعلم الآلي

BigDataBench: A Scalable and Unified Big Data and AI Benchmark Suite

140 - Wanling Gao , Jianfeng Zhan , Lei Wang 2018

Several fundamental changes in technology indicate domain-specific hardware and software co-design is the only path left. In this context, architecture, system, data management, and machine learning communities pay greater attention to innovative big data and AI algorithms, architecture, and systems. Unfortunately, complexity, diversity, frequently-changed workloads, and rapid evolution of big data and AI systems raise great challenges. First, the traditional benchmarking methodology that creates a new benchmark or proxy for every possible workload is not scalable, or even impossible for Big Data and AI benchmarking. Second, it is prohibitively expensive to tailor the architecture to characteristics of one or more application or even a domain of applications. We consider each big data and AI workload as a pipeline of one or more classes of units of computation performed on different initial or intermediate data inputs, each class of which we call a data motif. On the basis of our previous work that identifies eight data motifs taking up most of the run time of a wide variety of big data and AI workloads, we propose a scalable benchmarking methodology that uses the combination of one or more data motifs---to represent diversity of big data and AI workloads. Following this methodology, we present a unified big data and AI benchmark suite---BigDataBench 4.0, publicly available from~url{http://prof.ict.ac.cn/BigDataBench}. This unified benchmark suite sheds new light on domain-specific hardware and software co-design: tailoring the system and architecture to characteristics of the unified eight data motifs other than one or more application case by case. Also, for the first time, we comprehensively characterize the CPU pipeline efficiency using the benchmarks of seven workload types in BigDataBench 4.0.

النظم الموزعة والتوازية والحوسبة العنقودية الذكاء الاصطناعي الأداء

RepEx: A Flexible Framework for Scalable Replica Exchange Molecular Dynamics Simulations

111 - Antons Treikalis , Andre Merzky , Haoyuan Chen 2016

Replica Exchange (RE) simulations have emerged as an important algorithmic tool for the molecular sciences. RE simulations involve the concurrent execution of independent simulations which infrequently interact and exchange information. The next set of simulation parameters are based upon the outcome of the exchanges. Typically RE functionality is integrated into the molecular simulation software package. A primary motivation of the tight integration of RE functionality with simulation codes has been performance. This is limiting at multiple levels. First, advances in the RE methodology are tied to the molecular simulation code. Consequently these advances remain confined to the molecular simulation code for which they were developed. Second, it is difficult to extend or experiment with novel RE algorithms, since expertise in the molecular simulation code is typically required. In this paper, we propose the RepEx framework which address these aforementioned shortcomings of existing approaches, while striking the balance between flexibility (any RE scheme) and scalability (tens of thousands of replicas) over a diverse range of platforms. RepEx is designed to use a pilot-job based runtime system and support diverse RE Patterns and Execution Modes. RE Patterns are concerned with synchronization mechanisms in RE simulation, and Execution Modes with spatial and temporal mapping of workload to the CPU cores. We discuss how the design and implementation yield the following primary contributions of the RepEx framework: (i) its ability to support different RE schemes independent of molecular simulation codes, (ii) provide the ability to execute different exchange schemes and replica counts independent of the specific availability of resources, (iii) provide a runtime system that has first-class support for task-level parallelism, and (iv) required scalability along multiple dimensions.

النظم الموزعة والتوازية والحوسبة العنقودية

AiiDA 1.0, a scalable computational infrastructure for automated reproducible workflows and data provenance

58 - Sebastiaan. P. Huber , Spyros Zoupanos , Martin Uhrin 2020

The ever-growing availability of computing power and the sustained development of advanced computational methods have contributed much to recent scientific progress. These developments present new challenges driven by the sheer amount of calculations and data to manage. Next-generation exascale supercomputers will harden these challenges, such that automated and scalable solutions become crucial. In recent years, we have been developing AiiDA (http://www.aiida.net), a robust open-source high-throughput infrastructure addressing the challenges arising from the needs of automated workflow management and data provenance recording. Here, we introduce developments and capabilities required to reach sustained performance, with AiiDA supporting throughputs of tens of thousands processes/hour, while automatically preserving and storing the full data provenance in a relational database making it queryable and traversable, thus enabling high-performance data analytics. AiiDAs workflow language provides advanced automation, error handling features and a flexible plugin model to allow interfacing with any simulation software. The associated plugin registry enables seamless sharing of extensions, empowering a vibrant user community dedicated to making simulations more robust, user-friendly and reproducible.

النظم الموزعة والتوازية والحوسبة العنقودية علم المواد

3D Real-Time Supercomputer Monitoring

165 - Bill Bergeron , Matthew Hubbell , Dylan Sequeira 2021

Supercomputers are complex systems producing vast quantities of performance data from multiple sources and of varying types. Performance data from each of the thousands of nodes in a supercomputer tracks multiple forms of storage, memory, networks, p rocessors, and accelerators. Optimization of application performance is critical for cost effective usage of a supercomputer and requires efficient methods for effectively viewing performance data. The combination of supercomputing analytics and 3D gaming visualization enables real-time processing and visual data display of massive amounts of information that humans can process quickly with little training. Our system fully utilizes the capabilities of modern 3D gaming environments to create novel representations of computing hardware which intuitively represent the physical attributes of the supercomputer while displaying real-time alerts and component utilization. This system allows operators to quickly assess how the supercomputer is being used, gives users visibility into the resources they are consuming, and provides instructors new ways to interactively teach the computing architecture concepts necessary for efficient computing

النظم الموزعة والتوازية والحوسبة العنقودية

سجل دخول لتتمكن من نشر تعليقات

التعليقات

جاري جلب التعليقات

سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها

جامعة الشرق الأوسط - الأردن

تفاصيل إضافية المزيد من الجامعات

يمكنك البدء بجني المال وتحقيق ربح مادي من أبحاثك العلمية، المزيد

BSMBench: a flexible and scalable supercomputer benchmark from computational particle physics

اسأل ChatGPT حول البحث

ﻻ يوجد ملخص باللغة العربية

اقرأ أيضاً