ATLAS event data processing requires access to non-event data (detector conditions, calibrations, etc.) stored in relational databases. The database-resident data are crucial for the event reconstruction processing steps and often required for user analysis. A main focus of ATLAS database operations is the worldwide distribution of the Conditions DB data, which are necessary for every ATLAS data processing job. Since Conditions DB access is critical for operations with real data, we have developed a system in which a different technology can serve as a redundant backup. This redundant database operations infrastructure fully satisfies the requirements of ATLAS reprocessing, as proven at the scale of one billion database queries during two reprocessing campaigns over 0.5 PB of single-beam and cosmics data on the Grid. To gather experience and provide input for the best choice of technologies, we successfully evaluated several promising options for efficient database access in user analysis. We present the ATLAS experience with scalable database access technologies and describe our approach to preventing database access bottlenecks in a Grid computing environment.
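A minimal sketch of the failover idea behind such redundancy, assuming a primary database server plus a file-based backup replica; the connection helper, replica names, and file name are illustrative, not the actual ATLAS interfaces:

```python
import sqlite3

def connect_primary(service: str):
    """Stand-in for the primary server connection; assumed unreachable here."""
    raise ConnectionError(f"cannot reach primary service {service!r}")

# Ordered replica list: try the primary technology first, then the backup.
REPLICAS = [
    ("primary", lambda: connect_primary("atlas_condb")),
    ("sqlite-backup", lambda: sqlite3.connect("condb_replica.db")),
]

def open_conditions_db():
    """Try each replica in order, falling through on connection failure."""
    for name, factory in REPLICAS:
        try:
            return name, factory()
        except Exception as exc:
            print(f"replica {name} unavailable: {exc}")
    raise RuntimeError("no Conditions DB replica reachable")

name, conn = open_conditions_db()  # falls through to the SQLite backup here
```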
The ATLAS experiment at the Large Hadron Collider has implemented a new system for recording information on detector status and data quality, and for transmitting this information to users performing physics analysis. This system revolves around the concept of defects: well-defined, fine-grained, unambiguous occurrences affecting the quality of recorded data. The motivation, implementation, and operation of this system are described.
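A hedged sketch of what a fine-grained defect record might look like, and how serious defects could be folded into a good-data selection; the field names, the `intolerable` flag, and the toy run number are assumptions for illustration, not the actual ATLAS schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Defect:
    """Illustrative defect record; fields are assumptions, not the ATLAS schema."""
    name: str          # e.g. a hypothetical "SCT_GLOBAL_STANDBY"
    run: int
    first_lb: int      # first affected luminosity block
    last_lb: int       # last affected luminosity block (inclusive)
    intolerable: bool  # if True, the affected data are rejected for analysis

def good_lumiblocks(run: int, lbs: range, defects: list[Defect]) -> set[int]:
    """Luminosity blocks of `run` not covered by any intolerable defect."""
    bad = {
        lb
        for d in defects
        if d.run == run and d.intolerable
        for lb in range(d.first_lb, d.last_lb + 1)
    }
    return set(lbs) - bad

defects = [Defect("SCT_GLOBAL_STANDBY", 155073, first_lb=5, last_lb=7, intolerable=True)]
print(sorted(good_lumiblocks(155073, range(1, 11), defects)))  # [1, 2, 3, 4, 8, 9, 10]
```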
For the efficiency of large production tasks distributed worldwide, it is essential to provide shared production management tools composed of integrable and interoperable services. To enhance the ATLAS DC1 production toolkit, we introduced and tested a Virtual Data services component. For each major data transformation step identified in the ATLAS data processing pipeline (event generation, detector simulation, background pile-up and digitization, etc.), the Virtual Data Cookbook (VDC) catalogue encapsulates the specific data transformation knowledge and the validated parameter settings that must be provided before the data transformation is invoked. To provide local-remote transparency during DC1 production, the VDC database server delivered, in a controlled way, both the validated production parameters and the templated production recipes for thousands of event generation and detector simulation jobs around the world, simplifying the production management solutions.
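A hypothetical sketch of the catalogue idea: one VDC entry pairs a templated production recipe with its validated parameter settings, and a job is built by filling the template so that only per-job values (here the random seed) vary. The step name, command template, and parameters are illustrative, not the actual DC1 schema:

```python
# Catalogue: transformation step -> templated recipe + validated parameters.
VDC = {
    "detector-simulation": {
        "recipe": "atlsim -macro {macro} -seed {seed} -events {n_events}",
        "validated_params": {"macro": "dc1.sim.mac", "n_events": 1000},
    },
}

def make_job(step: str, **overrides: object) -> str:
    """Fill the step's template with validated defaults plus per-job values."""
    entry = VDC[step]
    params = {**entry["validated_params"], **overrides}
    return entry["recipe"].format(**params)

print(make_job("detector-simulation", seed=42))
# atlsim -macro dc1.sim.mac -seed 42 -events 1000
```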
A large class of traditional graph and data mining algorithms can be concisely expressed in Datalog and other logic-based languages once aggregates are allowed in recursion. In fact, for most BigData algorithms, the difficult semantic issues raised by the use of non-monotonic aggregates in recursion are solved by Pre-Mappability (PreM), a property that assures that for a program with aggregates in recursion there is an equivalent aggregate-stratified program. In this paper we show that, by bringing together the formal abstract semantics of stratified programs with the efficient operational semantics of unstratified programs, PreM can also facilitate and improve their parallel execution. We prove that PreM-optimized lock-free and decomposable parallel semi-naive evaluations produce the same results as the single-executor programs. Therefore, PreM can be assimilated into the data-parallel computation plans of different distributed systems, irrespective of whether these follow bulk synchronous parallel (BSP) or asynchronous computing models. In addition, we show that non-linear recursive queries can be evaluated using a hybrid stale synchronous parallel (SSP) model in distributed environments. After providing a formal correctness proof for the recursive query evaluation with PreM under this relaxed synchronization model, we present experimental evidence of its benefits. This paper is under consideration for acceptance in Theory and Practice of Logic Programming (TPLP).
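As a concrete single-executor illustration, consider the classic shortest-path query, where PreM lets the min aggregate be applied eagerly inside each semi-naive iteration rather than after stratification. A minimal Python sketch on toy data (the graph and names are ours, not from the paper):

```python
from collections import defaultdict

edges = [("a", "b", 1), ("b", "c", 2), ("a", "c", 7), ("c", "d", 1)]

def shortest_paths(edges):
    best = defaultdict(lambda: float("inf"))  # (src, dst) -> least cost found
    delta = {}
    for x, y, c in edges:                     # base case of the recursion
        if c < best[(x, y)]:
            best[(x, y)] = c
            delta[(x, y)] = c
    while delta:                              # semi-naive fixpoint loop
        new_delta = {}
        for (x, y), c in delta.items():
            for u, z, d in edges:
                if u == y and c + d < best[(x, z)]:
                    best[(x, z)] = c + d      # min applied eagerly: the PreM step
                    new_delta[(x, z)] = c + d
        delta = new_delta
    return {k: v for k, v in best.items() if v != float("inf")}

print(shortest_paths(edges))
# {('a','b'): 1, ('b','c'): 2, ('a','c'): 3, ('c','d'): 1, ('b','d'): 3, ('a','d'): 4}
```

Because min is applied within the fixpoint instead of over a completed stratum, each iteration keeps only the best cost per pair, which is exactly the property that makes lock-free, decomposable parallel evaluation produce the same results.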
Modern audience measurement requires combining observations from disparate panel datasets. Connecting and relating such panel datasets is a process termed panel fusion. This paper formalizes the panel fusion problem and presents a novel approach to solve it. We cast panel fusion as a network flow problem, allowing a rich body of existing research to be applied. In the context of digital audience measurement, where panel sizes can grow into the tens of millions, we propose an efficient algorithm to partition the network into sub-problems. While the algorithm solves a relaxed version of the original problem, we provide conditions under which it guarantees optimality. We demonstrate our approach by fusing two real-world panel datasets in a distributed computing environment.
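A hedged sketch of the flow formulation on toy data, using networkx for the min-cost flow solve (an assumption of ours; the paper's own contribution is the partitioning algorithm, which this sketch omits). Panelists, weights, and costs are illustrative:

```python
import networkx as nx

panel_a = {"a1": 2, "a2": 1}                 # panelist -> integer panel weight
panel_b = {"b1": 1, "b2": 2}
cost = {("a1", "b1"): 1, ("a1", "b2"): 3,    # dissimilarity between panelists
        ("a2", "b1"): 2, ("a2", "b2"): 1}

G = nx.DiGraph()
for a, w in panel_a.items():                 # source supplies each A-panelist's weight
    G.add_edge("S", a, capacity=w, weight=0)
for b, w in panel_b.items():                 # sink absorbs each B-panelist's weight
    G.add_edge(b, "T", capacity=w, weight=0)
for (a, b), c in cost.items():               # fusion links, priced by dissimilarity
    G.add_edge(a, b, capacity=min(panel_a[a], panel_b[b]), weight=c)

flow = nx.max_flow_min_cost(G, "S", "T")
links = {(a, b): f for a in panel_a for b, f in flow[a].items() if f > 0}
print(links)  # e.g. {('a1','b1'): 1, ('a1','b2'): 1, ('a2','b2'): 1}
```

The nonzero flows on the A-to-B edges give the fused linkage: how much of each panelist's weight is matched to each counterpart at minimum total dissimilarity.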
The Compact Linear Collider (CLIC) is a high-energy, high-luminosity linear electron-positron collider under development. It is foreseen to be built and operated in three stages, at centre-of-mass energies of 380 GeV, 1.5 TeV and 3 TeV, respectively. It offers a rich physics programme, including direct searches as well as the probing of new physics through a broad set of precision measurements of Standard Model processes, particularly in the Higgs-boson and top-quark sectors. The precision required for such measurements and the specific conditions imposed by the beam dimensions and time structure put strict requirements on the detector design and technology. These include low-mass vertexing and tracking systems with small cells, highly granular imaging calorimeters, as well as precise hit-time resolution and power-pulsed operation for all subsystems. A conceptual design for the CLIC detector system was published in 2012. Since then, ambitious R&D programmes for silicon vertex and tracking detectors, as well as for calorimeters, have been pursued within the CLICdp, CALICE and FCAL collaborations, addressing the challenging detector requirements with innovative technologies. This report introduces the experimental environment and detector requirements at CLIC and reviews the current status and future plans for detector technology R&D.