
ProMC: Input-output data format for HEP applications using varint encoding

Published by Sergei Chekanov V.
Publication date: 2013
Language: English





A new data format for Monte Carlo (MC) events, or any structural data, including experimental data, is discussed. The format is designed to store data in a compact binary form using variable-size integer encoding as implemented in Google's Protocol Buffers package. This approach is implemented in the ProMC library, which produces smaller file sizes for MC records compared to the existing input-output libraries used in high-energy physics (HEP). Other important features of the proposed format are a separation of abstract data layouts from concrete programming implementations, self-description and random access. Data stored in ProMC files can be written, read and manipulated in a number of programming languages, such as C++, Java, Fortran and Python.
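As a rough illustration of why variable-size integer encoding gives compact records, the sketch below implements the base-128 varint scheme that Protocol Buffers uses for unsigned integers: each byte carries 7 bits of payload, and the high bit signals that more bytes follow. This is not ProMC code; the function names `encode_varint` and `decode_varint` are hypothetical and chosen only for this example.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (least significant group first)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # set the continuation bit: more bytes follow
        else:
            out.append(byte)         # last byte: continuation bit stays clear
            return bytes(out)


def decode_varint(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one varint starting at `pos`; return (value, position after the varint)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # continuation bit clear: this was the last byte
            return result, pos
        shift += 7


if __name__ == "__main__":
    for n in (1, 127, 128, 300, 10**6):
        encoded = encode_varint(n)
        print(n, encoded.hex(), len(encoded), "byte(s)")
```

Small values occupy a single byte instead of the four or eight bytes of a fixed-width integer, which is where the size reduction comes from when most stored numbers are small.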




Read also

64 - R. Sobie 2003
A Grid testbed has been established using resources at 12 sites across Canada involving researchers from particle physics as well as other fields of science. We describe our use of the testbed with the BaBar Monte Carlo production and the ATLAS data challenge software. In each case the remote sites have no application-specific software stored locally and instead access the software and data via AFS and/or GridFTP from servers located in Victoria. In the case of BaBar, an Objectivity database server was used for data storage. We present the results of a series of initial tests of the Grid testbed using both BaBar and ATLAS applications. The initial results demonstrate the feasibility of using generic Grid resources for HEP applications.
This draft report summarizes and details the findings, results, and recommendations derived from the ASCR/HEP Exascale Requirements Review meeting held in June, 2015. The main conclusions are as follows. 1) Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude -- and in some cases greater -- than that available currently. 2) The growth rate of data produced by simulations is overwhelming the current ability, of both facilities and researchers, to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. 3) Data rates and volumes from HEP experimental facilities are also straining the ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. 4) A close integration of HPC simulation and data analysis will aid greatly in interpreting results from HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. 5) Long-range planning between HEP and ASCR will be required to meet HEP's research needs. To best use ASCR HPC resources the experimental HEP program needs a) an established long-term plan for access to ASCR computational and data resources, b) an ability to map workflows onto HPC resources, c) the ability for ASCR facilities to accommodate workflows run by collaborations that can have thousands of individual members, d) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, e) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.
112 - David Lange 2018
There are numerous approaches to building analysis applications across the high-energy physics community. Among them are Python-based, or at least Python-driven, analysis workflows. We aim to ease the adoption of a Python-based analysis toolkit by making it easier for non-expert users to gain access to Python tools for scientific analysis. Experimental software distributions and individual user analysis have quite different requirements. Distributions tend to worry most about stability, usability and reproducibility, while the users usually strive to be fast and nimble. We discuss how we built and now maintain a Python distribution for analysis while satisfying the requirements of both a large software distribution (in our case, that of CMSSW) and user-level, or laptop-level, analysis. We pursued the integration of tools used by the broader data science community as well as HEP-developed (e.g., histogrammar, root_numpy) Python packages. We discuss concepts we investigated for package integration and testing, as well as issues we encountered through this process. Distribution and platform support are important topics. We discuss our approach and progress towards a sustainable infrastructure for supporting this Python stack for the CMS user community and for the broader HEP user community.
85 - Eduardo Rodrigues 2019
The Scikit-HEP project is a community-driven and community-oriented effort with the aim of providing Particle Physics at large with a Python scientific toolset containing core and common tools. The project builds on five pillars that embrace the major topics involved in a physicist's analysis work: datasets, data aggregations, modelling, simulation and visualisation. The vision is to build a user and developer community engaging collaboration across experiments, to emulate scikit-learn's unified interface with Astropy's embrace of third-party packages, and to improve discoverability of relevant tools.
Norm-conserving pseudopotentials are used by a significant number of electronic-structure packages, but the practical differences among codes in the handling of the associated data hinder their interoperability and make it difficult to compare their results. At the same time, existing formats lack provenance data, which makes it difficult to track and document computational workflows. To address these problems, we first propose a file format (PSML) that maps the basic concepts of the norm-conserving pseudopotential domain in a flexible form and supports the inclusion of provenance information and other important metadata. Second, we provide a software library (libPSML) that can be used by electronic-structure codes to transparently extract the information in the file and adapt it to their own data structures, or to create converters for other formats. Support for the new file format has already been implemented in several pseudopotential generator programs (including ATOM and ONCVPSP), and the library has been linked with Siesta and Abinit, allowing them to work with the same pseudopotential operator (with the same local part and fully non-local projectors), thus easing the comparison of their results for the structural and electronic properties, as shown for several example systems. This methodology can be easily transferred to any other package that uses norm-conserving pseudopotentials, and offers a proof-of-concept for a general approach to interoperability.