A Grid testbed has been established using resources at 12 sites across Canada involving researchers from particle physics as well as other fields of science. We describe our use of the testbed with the BaBar Monte Carlo production and the ATLAS data challenge software. In each case the remote sites have no application-specific software stored locally and instead access the software and data via AFS and/or GridFTP from servers located in Victoria. In the case of BaBar, an Objectivity database server was used for data storage. We present the results of a series of initial tests of the Grid testbed using both BaBar and ATLAS applications. The initial results demonstrate the feasibility of using generic Grid resources for HEP applications.
Workpackage 8 of the European Datagrid project was formed in January 2001 with representatives from the four LHC experiments, and with experiment-independent people from five of the six main EDG partners. In September 2002 WP8 was strengthened by the addition of effort from BaBar and D0. The original mandate of WP8 was, following the definition of short- and long-term requirements, to port experiment software to the EDG middleware and testbed environment. A major additional activity has been testing the basic functionality and performance of this environment. This paper reviews experiences and evaluations in the areas of job submission, data management, mass storage handling, information systems and monitoring. It also comments on the problems of remote debugging, the portability of code, and scaling problems with increasing numbers of jobs, sites and nodes. Reference is made to the pioneering work of ATLAS and CMS in integrating the use of the EDG Testbed into their data challenges. A forward look is made to essential software developments within EDG and to the necessary cooperation between EDG and LCG for the LCG prototype due in mid-2003.
The CMS Integration Grid Testbed (IGT) comprises USCMS Tier-1 and Tier-2 hardware at the following sites: the California Institute of Technology, Fermi National Accelerator Laboratory, the University of California at San Diego, and the University of Florida at Gainesville. The IGT runs jobs using the Globus Toolkit with a DAGMan and Condor-G front end. The virtual organization (VO) is managed using VO management scripts from the European Data Grid (EDG). Grid-wide monitoring is accomplished using local tools such as Ganglia interfaced into the Globus Metadata Directory Service (MDS) and the agent-based Mona Lisa. Domain-specific software is packaged and installed using the Distribution After Release (DAR) tool of CMS, while middleware under the auspices of the Virtual Data Toolkit (VDT) is distributed using Pacman. During a continuous two-month span in the fall of 2002, over 1 million official CMS GEANT-based Monte Carlo events were generated and returned to CERN for analysis while being demonstrated at SC2002. In this paper, we describe the process that led to one of the world's first continuously available, functioning grids.
A new data format for Monte Carlo (MC) events, or any structural data, including experimental data, is discussed. The format is designed to store data in a compact binary form using variable-size integer encoding as implemented in Google's Protocol Buffers package. This approach is implemented in the ProMC library, which produces smaller file sizes for MC records compared to the existing input-output libraries used in high-energy physics (HEP). Other important features of the proposed format are a separation of abstract data layouts from concrete programming implementations, self-description and random access. Data stored in ProMC files can be written, read and manipulated in a number of programming languages, such as C++, Java, Fortran and Python.
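The variable-size integer encoding mentioned above can be illustrated with a minimal sketch of the base-128 "varint" scheme used by Protocol Buffers, in which each byte carries 7 payload bits and the high bit signals that more bytes follow. This is not ProMC code; the function names `encode_varint` and `decode_varint` are hypothetical and chosen here only for illustration, and the sketch handles non-negative integers only.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (illustrative helper)."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # lowest 7 bits form the payload of this byte
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single base-128 varint from the start of `data` (illustrative helper)."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:          # high bit clear marks the final byte
            return result
        shift += 7
    raise ValueError("truncated varint")
```

Small values, which are common in event records, occupy a single byte under this scheme, while a fixed-width 32-bit or 64-bit field would always occupy 4 or 8 bytes; this is the kind of saving the abstract attributes to the compact binary form.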
The Scikit-HEP project is a community-driven and community-oriented effort with the aim of providing Particle Physics at large with a Python scientific toolset containing core and common tools. The project builds on five pillars that embrace the major topics involved in a physicist's analysis work: datasets, data aggregations, modelling, simulation and visualisation. The vision is to build a user and developer community engaging collaboration across experiments, to emulate scikit-learn's unified interface with Astropy's embrace of third-party packages, and to improve discoverability of relevant tools.
WorldGrid is an intercontinental testbed spanning Europe and the US, integrating architecturally different Grid implementations based on the Globus toolkit. It has been developed in the context of the DataTAG and iVDGL projects, and was successfully demonstrated during the WorldGrid demos at IST2002 (Copenhagen) and SC2002 (Baltimore). Two HEP experiments, ATLAS and CMS, successfully exploited the WorldGrid testbed for executing jobs simulating the response of their detectors to physics events produced by real collisions expected at the LHC accelerator starting from 2007. This data-intensive activity has been run for many years on local dedicated computing farms consisting of hundreds of nodes and Terabytes of disk and tape storage. Within the WorldGrid testbed, for the first time HEP simulation jobs were submitted and run interchangeably on US and European resources, despite their different underlying Grid implementations, and produced data which could be retrieved and further analysed on the submitting machine, or simply stored on the remote resources and registered in a Replica Catalogue which made them available to the Grid for further processing. In this contribution we describe the job submission from Europe for both ATLAS and CMS applications, performed through the GENIUS portal operating on top of an EDG User Interface submitting to an EDG Resource Broker. We point out the chosen interoperability solutions which made US and European resources equivalent from the applications' point of view, the data management in the WorldGrid environment, and the CMS-specific production tools which were interfaced to the GENIUS portal.