Report of the HPC Correctness Summit, Jan 25--26, 2017, Washington, DC


Published by Ganesh Gopalakrishnan

Publication date: 2017

Research field: Informatics Engineering

Paper language: English

Authors: Ganesh Gopalakrishnan, Paul D. Hovland, Costin Iancu

Distributed, Parallel, and Cluster Computing



Maintaining leadership in HPC requires the ability to support simulations at large scales and high fidelity. In this study, we detail one of the most significant productivity challenges in achieving this goal, namely the increasing susceptibility to bugs, especially in the face of growing hardware and software heterogeneity and sheer system scale. We identify key areas where timely new research must be proactively begun to address these challenges and to create new correctness tools, which should ideally play a significant role even during the ramp-up toward exascale. We close with a proposal for a two-day workshop in which the problems identified in this report can be discussed more broadly and specific plans to launch these new research thrusts can be identified.


107 - Rafael Ferreira da Silva, Henri Casanova, Kyle Chard 2021

Scientific workflows have been used almost universally across scientific domains, and have underpinned some of the most significant discoveries of the past several decades. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale high-performance computing (HPC) platforms. These executions must be managed using some software infrastructure. Due to the popularity of workflows, workflow management systems (WMSs) have been developed to provide abstractions for creating and executing workflows conveniently, efficiently, and portably. While these efforts are all worthwhile, there are now hundreds of independent WMSs, many of which are moribund. As a result, the WMS landscape is segmented and presents significant barriers to entry due to the hundreds of seemingly comparable, yet incompatible, systems that exist. Consequently, many teams, small and large, still elect to build their own custom workflow solution rather than adopt, or build upon, existing WMSs. This current state of the WMS landscape negatively impacts workflow users, developers, and researchers. The Workflows Community Summit was held online on January 13, 2021. The overarching goal of the summit was to develop a view of the state of the art and identify crucial research challenges in the workflow community. Prior to the summit, a survey sent to stakeholders in the workflow community (including both developers of WMSs and users of workflows) helped to identify key challenges in this community. These challenges were translated into 6 broad themes for the summit, each the subject of a focused discussion led by a volunteer member of the community. This report documents and organizes the wealth of information provided by the participants before, during, and after the summit.

Distributed, Parallel, and Cluster Computing

Intelligent colocation of HPC workloads

115 - Felippe V. Zacarias 2021

Many HPC applications suffer from a bottleneck in the shared caches, instruction execution units, I/O or memory bandwidth, even though the remaining resources may be underutilized. It is hard for developers and runtime systems to ensure that all critical resources are fully exploited by a single application, so an attractive technique for increasing HPC system utilization is to colocate multiple applications on the same server. When applications share critical resources, however, contention for those resources may reduce application performance. In this paper, we show that server efficiency can be improved by first modeling the expected performance degradation of colocated applications based on measured hardware performance counters, and then exploiting the model to determine an optimized mix of colocated applications. This paper presents a new intelligent resource manager and makes the following contributions: (1) a new machine learning model to predict the performance degradation of colocated applications based on hardware counters and (2) an intelligent scheduling scheme deployed on an existing resource manager to enable application co-scheduling with minimum performance degradation. Our results show that our approach achieves performance improvements of 7% (avg) and 12% (max) compared to the standard policy commonly used by existing job managers.

Distributed, Parallel, and Cluster Computing; Machine Learning; Performance

Beta-decay properties of $^{25}$Si and $^{26}$P

62 - J.-C. Thomas, L. Achouri, J. Aysto 2004

The $\beta$-decay properties of the neutron-deficient nuclei $^{25}$Si and $^{26}$P have been investigated at the GANIL/LISE3 facility by means of charged-particle and $\gamma$-ray spectroscopy. The decay schemes obtained and the Gamow-Teller strength distributions are compared to shell-model calculations based on the USD interaction. B(GT) values derived from the absolute measurement of the $\beta$-decay branching ratios give rise to a quenching factor of the Gamow-Teller strength of 0.6. A precise half-life of 43.7(6) ms was determined for $^{26}$P, the $\beta$-(2)p decay mode of which is described.

Nuclear Experiment

Workflows Community Summit: Advancing the State-of-the-art of Scientific Workflows Management Systems Research and Development

73 - Rafael Ferreira da Silva, Henri Casanova, Kyle Chard 2021

Scientific workflows are a cornerstone of modern scientific computing, and they have underpinned some of the most significant discoveries of the last decade. Many of these workflows have high computational, storage, and/or communication demands, and thus must execute on a wide range of large-scale platforms, from large clouds to upcoming exascale HPC platforms. Workflows will play a crucial role in the data-oriented and post-Moore's computing landscape as they democratize the application of cutting-edge research techniques, computationally intensive methods, and use of new computing platforms. As workflows continue to be adopted by scientific projects and user communities, they are becoming more complex. Workflows are increasingly composed of tasks that perform computations such as short machine learning inference, multi-node simulations, and long-running machine learning model training, among others, and thus increasingly rely on heterogeneous architectures that include not only CPUs but also GPUs and accelerators. The workflow management system (WMS) technology landscape is currently segmented and presents significant barriers to entry due to the hundreds of seemingly comparable, yet incompatible, systems that exist. Another fundamental problem is that there are conflicting theoretical bases and abstractions for a WMS. Workflows can likely be translated between systems that use the same underlying abstractions, which is not the case for systems that use different abstractions. More information: https://workflowsri.org/summits/technical

Distributed, Parallel, and Cluster Computing

Study of the $^{25}$Mg(d,p)$^{26}$Mg reaction to constrain the $^{25}$Al(p,$\gamma$)$^{26}$Si resonant reaction rates in nova burning conditions

71 - C. B. Hamill, P. J. Woods, D. Kahl 2020

The rate of the $^{25}$Al($p$,$\gamma$)$^{26}$Si reaction is one of the few key remaining nuclear uncertainties required for predicting the production of the cosmic $\gamma$-ray emitter $^{26}$Al in explosive burning in novae. This reaction rate is dominated by three key resonances ($J^{\pi}=0^{+}$, $1^{+}$ and $3^{+}$) in $^{26}$Si. Only the $3^{+}$ resonance strength has been directly constrained by experiment. A high resolution measurement of the $^{25}$Mg($d$,$p$) reaction was used to determine spectroscopic factors for analog states in the mirror nucleus, $^{26}$Mg. A first spectroscopic factor value is reported for the $0^{+}$ state at 6.256 MeV, and a strict upper limit is set on the value for the $1^{+}$ state at 5.691 MeV, that is incompatible with an earlier ($^{4}$He,$^{3}$He) study. These results are used to estimate proton partial widths, and resonance strengths of analog states in $^{26}$Si contributing to the $^{25}$Al($p$,$\gamma$)$^{26}$Si reaction rate in nova burning conditions.

Nuclear Experiment; Solar and Stellar Astrophysics; Nuclear Theory
