No Arabic abstract
PAX (Physics Analysis Expert) is a novel, C++ based toolkit designed to assist teams in particle physics data analysis issues. The core of PAX are event interpretation containers, holding relevant information about and possible interpretations of a physics event. Providing this new level of abstraction beyond the results of the detector reconstruction programs, PAX facilitates the buildup and use of modern analysis factories. Class structure and user command syntax of PAX are set up to support expert teams as well as newcomers in preparing for the challenges expected to arise in the data analysis at future hadron colliders.
VISPA is a novel development environment for high energy physics analyses, based on a combination of graphical and textual steering. The primary aim of VISPA is to support physicists in prototyping, performing, and verifying a data analysis of any complexity. We present example screenshots, and describe the underlying software concepts.
We propose a new scientific application of unsupervised learning techniques to boost our ability to search for new phenomena in data, by detecting discrepancies between two datasets. These could be, for example, a simulated standard-model background, and an observed dataset containing a potential hidden signal of New Physics. We build a statistical test upon a test statistic which measures deviations between two samples, using a Nearest Neighbors approach to estimate the local ratio of the density of points. The test is model-independent and non-parametric, requiring no knowledge of the shape of the underlying distributions, and it does not bin the data, thus retaining full information from the multidimensional feature space. As a proof-of-concept, we apply our method to synthetic Gaussian data, and to a simulated dark matter signal at the Large Hadron Collider. Even in the case where the background can not be simulated accurately enough to claim discovery, the technique is a powerful tool to identify regions of interest for further study.
We reframe common tasks in jet physics in probabilistic terms, including jet reconstruction, Monte Carlo tuning, matrix element - parton shower matching for large jet multiplicity, and efficient event generation of jets in complex, signal-like regions of phase space. We also introduce Ginkgo, a simplified, generative model for jets, that facilitates research into these tasks with techniques from statistics, machine learning, and combinatorial optimization. We review some of the recent research in this direction that has been enabled with Ginkgo. We show how probabilistic programming can be used to efficiently sample the showering process, how a novel trellis algorithm can be used to efficiently marginalize over the enormous number of clustering histories for the same observed particles, and how dynamic programming, A* search, and reinforcement learning can be used to find the maximum likelihood clustering in this enormous search space. This work builds bridges with work in hierarchical clustering, statistics, combinatorial optmization, and reinforcement learning.
Our predictions for particle physics processes are realized in a chain of complex simulators. They allow us to generate high-fidelity simulated data, but they are not well-suited for inference on the theory parameters with observed data. We explain why the likelihood function of high-dimensional LHC data cannot be explicitly evaluated, why this matters for data analysis, and reframe what the field has traditionally done to circumvent this problem. We then review new simulation-based inference methods that let us directly analyze high-dimensional data by combining machine learning techniques and information from the simulator. Initial studies indicate that these techniques have the potential to substantially improve the precision of LHC measurements. Finally, we discuss probabilistic programming, an emerging paradigm that lets us extend inference to the latent process of the simulator.
The determination of the fundamental parameters of the Standard Model (and its extensions) is often limited by the presence of statistical and theoretical uncertainties. We present several models for the latter uncertainties (random, nuisance, external) in the frequentist framework, and we derive the corresponding $p$-values. In the case of the nuisance approach where theoretical uncertainties are modeled as biases, we highlight the important, but arbitrary, issue of the range of variation chosen for the bias parameters. We introduce the concept of adaptive $p$-value, which is obtained by adjusting the range of variation for the bias according to the significance considered, and which allows us to tackle metrology and exclusion tests with a single and well-defined unified tool, which exhibits interesting frequentist properties. We discuss how the determination of fundamental parameters is impacted by the model chosen for theoretical uncertainties, illustrating several issues with examples from quark flavour physics.