
Concurrent Cuba

Published by Thomas Hahn
Publication date: 2014
Paper language: English
Author: T. Hahn





The parallel version of the multidimensional numerical integration package Cuba is presented and achievable speed-ups discussed.
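For orientation, Cuba's integration routines (Vegas, Suave, Divonne, Cuhre) are driven through a C interface, and the parallel version distributes integrand samples over worker processes. The following is a minimal sketch of such a call, assuming the Cuba 4 C API (cuba.h, Vegas, cubacores); the integrand, the tolerances, and the request for four workers are illustrative choices, not taken from the paper.

/* Sketch: integrate exp(-x^2-y^2-z^2) over the unit cube with the
 * parallel Cuba library (assumed Cuba 4 C API).  Build with e.g.
 *   cc demo.c -lcuba -lm                                            */
#include <stdio.h>
#include <math.h>
#include "cuba.h"

/* Integrand: Cuba calls this for every sample point x in [0,1]^ndim. */
static int integrand(const int *ndim, const cubareal x[],
                     const int *ncomp, cubareal f[], void *userdata) {
  (void)ndim; (void)ncomp; (void)userdata;
  f[0] = exp(-(x[0]*x[0] + x[1]*x[1] + x[2]*x[2]));
  return 0;
}

int main(void) {
  const int ndim = 3, ncomp = 1, nvec = 1;
  const cubareal epsrel = 1e-4, epsabs = 1e-12;
  const int flags = 0, seed = 0, mineval = 0, maxeval = 1000000;
  const int nstart = 1000, nincrease = 500, nbatch = 1000, gridno = 0;
  int neval, fail;
  cubareal integral[1], error[1], prob[1];

  /* Request 4 worker processes with at least 10000 samples per worker;
   * the CUBACORES environment variable can override this at run time. */
  cubacores(4, 10000);

  Vegas(ndim, ncomp, integrand, NULL, nvec,
        epsrel, epsabs, flags, seed,
        mineval, maxeval, nstart, nincrease, nbatch,
        gridno, NULL, NULL,
        &neval, &fail, integral, error, prob);

  printf("Vegas: %.8f +- %.8f  (neval = %d, fail = %d)\n",
         (double)integral[0], (double)error[0], neval, fail);
  return 0;
}

The same Vegas call runs serially when the worker count is zero, so the serial and parallel paths share one integrand; whether this sketch matches a given Cuba release exactly should be checked against its manual.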


Read also

Marek Blazewicz 2013
Starting from a high-level problem description in terms of partial differential equations using abstract tensor notation, the Chemora framework discretizes, optimizes, and generates complete high-performance codes for a wide range of compute architectures. Chemora extends the capabilities of Cactus, facilitating the usage of large-scale CPU/GPU systems in an efficient manner for complex applications, without low-level code tuning. Chemora achieves parallelism through MPI and multi-threading, combining OpenMP and CUDA. Optimizations include high-level code transformations, efficient loop traversal strategies, dynamically selected data and instruction cache usage strategies, and JIT compilation of GPU code tailored to the problem characteristics. The discretization is based on higher-order finite differences on multi-block domains. Chemora's capabilities are demonstrated by simulations of black hole collisions. This problem provides an acid test of the framework, as the Einstein equations contain hundreds of variables and thousands of terms.
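As an illustration only (not code generated by Chemora), the kind of CPU kernel such a framework emits for a single higher-order finite-difference operator might look like the OpenMP-parallelised fourth-order x-derivative below; the grid layout, the spacing h, and all names are assumptions of this sketch.

/* Illustrative sketch, not Chemora output: an OpenMP-parallelised
 * fourth-order centred finite-difference x-derivative on a 2-D grid.
 * Build with e.g.  cc -O2 -fopenmp stencil.c -lm                     */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

/* du = d(u)/dx on the interior of an nx*ny grid, 4th-order accurate. */
static void deriv_x_4th(const double *u, double *du,
                        int nx, int ny, double h) {
  const double c = 1.0 / (12.0 * h);
  #pragma omp parallel for collapse(2)
  for (int j = 0; j < ny; j++)
    for (int i = 2; i < nx - 2; i++) {
      const int k = j * nx + i;
      du[k] = c * (-u[k + 2] + 8.0*u[k + 1] - 8.0*u[k - 1] + u[k - 2]);
    }
}

int main(void) {
  const int nx = 256, ny = 256;
  const double pi = acos(-1.0), h = 1.0 / (nx - 1);
  double *u  = malloc((size_t)nx * ny * sizeof *u);
  double *du = calloc((size_t)nx * ny, sizeof *du);

  for (int j = 0; j < ny; j++)                 /* u(x,y) = sin(2*pi*x) */
    for (int i = 0; i < nx; i++)
      u[j * nx + i] = sin(2.0 * pi * i * h);

  deriv_x_4th(u, du, nx, ny, h);

  const int k = (ny / 2) * nx + nx / 2;        /* exact: 2*pi*cos(2*pi*x) */
  printf("du/dx at centre: %.6f (exact %.6f)\n",
         du[k], 2.0 * pi * cos(2.0 * pi * (nx / 2) * h));

  free(u); free(du);
  return 0;
}

In Chemora itself such loops are generated and tuned automatically (and compiled to CUDA for GPU targets), which is what removes the need for hand-written low-level code.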
Lukas Einkemmer 2019
In this paper, our goal is to efficiently solve the Vlasov equation on GPUs. A semi-Lagrangian discontinuous Galerkin scheme is used for the discretization. Such kinetic computations are extremely expensive due to the high-dimensional phase space. The SLDG code, which is publicly available under the MIT license, abstracts the number of dimensions and uses a shared codebase for both GPU- and CPU-based simulations. We investigate the performance of the implementation on a range of both Tesla (V100, Titan V, K80) and consumer (GTX 1080 Ti) GPUs. Our implementation is typically able to achieve a performance of approximately 470 GB/s on a single GPU and 1600 GB/s on four V100 GPUs connected via NVLink. This results in a speedup of about a factor of ten (comparing a single GPU with a dual-socket Intel Xeon Gold node) and approximately a factor of 35 (comparing a single node with and without GPUs). In addition, we investigate the effect of single-precision computation on the performance of the SLDG code and demonstrate that a template-based, dimension-independent implementation can achieve good performance regardless of the dimensionality of the problem.
Full detector simulation was among the largest CPU consumers in all CERN experiment software stacks for the first two runs of the Large Hadron Collider (LHC). In the early 2010s, the projections were that simulation demands would scale linearly with luminosity increase, compensated only partially by an increase of computing resources. The extension of fast simulation approaches to more use cases, covering a larger fraction of the simulation budget, is only part of the solution due to intrinsic precision limitations. The remainder corresponds to speeding up the simulation software by several factors, which is out of reach using simple optimizations on the current code base. In this context, the GeantV R&D project was launched, aiming to redesign the legacy particle transport codes in order to make them benefit from fine-grained parallelism features such as vectorization, but also from increased code and data locality. This paper presents extensively the results and achievements of this R&D, as well as the conclusions and lessons learnt from the beta prototype.
In recent years, promising deep learning based interatomic potential energy surface (PES) models have been proposed that can potentially allow us to perform molecular dynamics simulations for large scale systems with quantum accuracy. However, making these models truly reliable and practically useful is still a very non-trivial task. A key component in this task is the generation of datasets used in model training. In this paper, we introduce the Deep Potential GENerator (DP-GEN), an open-source software platform that implements the recently proposed on-the-fly learning procedure [Phys. Rev. Materials 3, 023804] and is capable of generating uniformly accurate deep learning based PES models in a way that minimizes human intervention and the computational cost for data generation and model training. DP-GEN automatically and iteratively performs three steps: exploration, labeling, and training. It supports various popular packages for these three steps: LAMMPS for exploration, Quantum Espresso, VASP, CP2K, etc. for labeling, and DeePMD-kit for training. It also allows automatic job submission and result collection on different types of machines, such as high performance clusters and cloud machines, and is adaptive to different job management tools, including Slurm, PBS, and LSF. As a concrete example, we illustrate the details of the process for generating a general-purpose PES model for Cu using DP-GEN.
Conventional refocusing pulses are optimised for a single spin without considering any type of coupling. However, despite the fact that most couplings will result in undesired distortions, refocusing in delay-pulse-delay-type sequences with desired heteronuclear coherence transfer might be enhanced considerably by including coupling evolution in the pulse design. We provide a proof-of-principle study for a hydrogen-carbon refocusing pulse sandwich with inherent J-evolution, following the previously reported ICEBERG principle, with improved refocusing performance and/or overall effective coherence transfer time. Pulses are optimised using optimal control theory with a newly derived quality factor and z-controls as an efficient tool to speed up calculations. Pulses are characterised in detail and compared to conventional concurrent refocusing pulses, clearly showing an improvement for the J-evolving pulse sandwich. As a side product, efficient J-compensated refocusing pulse sandwiches -- termed BUBU pulses following the nomenclature of the previous J-compensated BUBI and BEBE(tr) pulse sandwiches -- have also been optimised.