This note introduces CutLang, a domain-specific language that aims to provide a clear, human-readable way to define analyses in high energy particle physics (HEP), along with an interpretation framework for that language. A proof of principle (PoP) implementation of the CutLang interpreter, written in C++ as a layer over the CERN data analysis framework ROOT, is presently available. This PoP implementation permits writing HEP analyses in an unobfuscated manner, as a set of commands in human-readable text files, which are interpreted by the framework at runtime. We describe the main features of CutLang and illustrate its usage with two analysis examples. Initial experience with CutLang has shown that just-in-time interpretation of a human-readable, HEP-specific language is a practical alternative to writing analyses in compiled languages such as C++.
We present CutLang, an analysis description language and runtime interpreter for high energy collider physics data analyses. An analysis description language is a declarative domain-specific language that can express all elements of a data analysis in an easy and unambiguous way. A full-fledged human-readable analysis description language, incorporating logical and mathematical expressions, would eliminate many programming difficulties and errors, allowing the scientist to focus on the goal rather than on the tool. In this paper, we discuss the guiding principles and scope of the CutLang language, the implementation of the CutLang runtime interpreter and the CutLang framework, and demonstrate an example of top pair reconstruction.
The fifth edition of the Computing Applications in Particle Physics school was held on 3-7 February 2020 at Istanbul University, Turkey. This edition focused on the processing of simulated data from Large Hadron Collider collisions using an Analysis Description Language and its runtime interpreter, CutLang. During the school, 24 undergraduate and 6 graduate students were introduced to collider data analysis. After 3 days of lectures and exercises, the students were grouped into teams of 3 or 4, and each team was assigned an analysis publication from the ATLAS or CMS experiments. After 1.5 days of independent study, each team was able to reproduce the assigned analysis using CutLang.
Though statistical analyses are centered on research questions and hypotheses, current statistical analysis tools are not. Users must first translate their hypotheses into specific statistical tests and then perform API calls with functions and parameters. To do so accurately requires that users have statistical expertise. To lower this barrier to valid, replicable statistical analysis, we introduce Tea, a high-level declarative language and runtime system. In Tea, users express their study design, any parametric assumptions, and their hypotheses. Tea compiles these high-level specifications into a constraint satisfaction problem that determines the set of valid statistical tests, and then executes them to test the hypothesis. We evaluate Tea using a suite of statistical analyses drawn from popular tutorials. We show that Tea generally matches the choices of experts while automatically switching to non-parametric tests when parametric assumptions are not met. We simulate the effect of mistakes made by non-expert users and show that Tea automatically avoids both false negatives and false positives that could be produced by the application of incorrect statistical tests.
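The idea of deriving the set of valid tests from declared assumptions can be illustrated with a minimal Python sketch. This is not Tea's syntax, API, or constraint solver: the catalogue `TESTS`, the assumption labels, and the function `valid_tests` are hypothetical names introduced here, and the assumptions attached to each test are deliberately simplified.

```python
# Hypothetical catalogue (an assumption for illustration, not Tea's internals):
# each two-sample test is mapped to the parametric assumptions it requires.
TESTS = {
    "student_t": {"normal", "equal_variance"},
    "welch_t": {"normal"},
    "mann_whitney_u": set(),  # treated here as needing no parametric assumption
}


def valid_tests(satisfied):
    """Return, sorted by name, the tests whose required assumptions all hold."""
    return sorted(name for name, required in TESTS.items() if required <= satisfied)


# When no parametric assumption holds, only the non-parametric test remains.
print(valid_tests(set()))
# When normality holds but variances differ, Student's t-test is excluded.
print(valid_tests({"normal"}))
```

A real system would encode many more tests and properties (data types, sample independence, study design) and solve for them jointly; the subset check above only conveys the filtering idea.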
The traditional approach in HEP analysis software is to loop over every event and every object via the ROOT framework. This method follows an imperative paradigm, in which the code is tied to the storage format and the steps of execution. A more desirable strategy would be to implement a declarative language, such that the storage medium and execution are not included in the abstraction model. This will become increasingly important for managing the large datasets collected by the LHC and the HL-LHC. A new analysis description language (ADL) inspired by functional programming, FuncADL, was developed using Python as a host language. The expressiveness of this language was tested by implementing example analysis tasks designed to benchmark the functionality of ADLs. Many simple selections are expressible in a declarative way with FuncADL, which can be used as an interface to retrieve filtered data. Some limitations were identified, but the design of the language allows for future extensions to add missing features. FuncADL is part of a suite of analysis software tools being developed by the Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP). These tools will be available to develop highly scalable physics analyses for the LHC.
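The contrast between the imperative and declarative styles can be sketched in plain Python. This is not the FuncADL API: the toy `events` list, the `jet_pt` field, the thresholds, and both function names are assumptions made purely for illustration.

```python
# Hypothetical toy events: each event is a dict holding a list of jet pT values (GeV).
events = [
    {"jet_pt": [55.0, 32.0, 12.0]},
    {"jet_pt": [18.0]},
    {"jet_pt": [41.0, 40.5]},
]


def select_imperative(events, pt_min=30.0, n_min=2):
    """Imperative style: explicit loops over every event and every object."""
    selected = []
    for event in events:
        n_good = 0
        for pt in event["jet_pt"]:
            if pt > pt_min:
                n_good += 1
        if n_good >= n_min:
            selected.append(event)
    return selected


def select_declarative(events, pt_min=30.0, n_min=2):
    """Functional style: state the selection as composed expressions."""
    def good_jets(event):
        return [pt for pt in event["jet_pt"] if pt > pt_min]

    return list(filter(lambda event: len(good_jets(event)) >= n_min, events))


# Both styles select the same events; only the way the selection is expressed differs.
print(select_imperative(events) == select_declarative(events))
```

In the functional form the selection is a description of what to keep, which in principle leaves a backend free to decide how and where the data are read and filtered.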
Today, both particle physics and cosmology are described by few-parameter Standard Models, i.e. it is possible to deduce consequences of particle physics in cosmology and vice versa. The former is examined in this lecture, in light of the recent systematic exploration of the electroweak scale by the LHC experiments. The two main results of the first phase of the LHC, the discovery of a Higgs-like particle and the absence so far of new particles predicted by natural theories beyond the Standard Model (supersymmetry, extra dimensions and composite Higgs), are put in a historical context to highlight their importance and then presented extensively. For completeness, a short review of neutrino physics, which cannot be probed at the LHC, is also given. The ability of all these results to resolve the three fundamental questions of cosmology, concerning the nature of dark energy and dark matter as well as the origin of the matter-antimatter asymmetry, is discussed in each case.