
Model-Independent Detection of New Physics Signals Using Interpretable Semi-Supervised Classifier Tests

Published by Dr. Purvasha Chakravarti
Publication date: 2021
Research field: Mathematical Statistics
Research language: English





A central goal in experimental high energy physics is to detect new physics signals that are not explained by known physics. In this paper, we aim to search for new signals that appear as deviations from known Standard Model physics in high-dimensional particle physics data. To do this, we determine whether there is any statistically significant difference between the distribution of Standard Model background samples and the distribution of the experimental observations, which are a mixture of the background and a potential new signal. Traditionally, one also assumes access to a sample from a model for the hypothesized signal distribution. Here we instead investigate a model-independent method that does not make any assumptions about the signal and uses a semi-supervised classifier to detect the presence of the signal in the experimental data. We construct three test statistics using the classifier: an estimated likelihood ratio test (LRT) statistic, a test based on the area under the ROC curve (AUC), and a test based on the misclassification error (MCE). Additionally, we propose a method for estimating the signal strength parameter and explore active subspace methods to interpret the proposed semi-supervised classifier in order to understand the properties of the detected signal. We investigate the performance of the methods on a data set related to the search for the Higgs boson at the Large Hadron Collider at CERN. We demonstrate that the semi-supervised tests have power competitive with the classical supervised methods for a well-specified signal, but much higher power for an unexpected signal which might be entirely missed by the supervised tests.
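As a rough illustration of the AUC-based test described above, the following is a minimal sketch, not the authors' implementation. It assumes a plain logistic regression as the classifier, a two-dimensional toy data set, and a permutation null distribution computed on held-out scores; the function names (`fit_logistic`, `auc_permutation_test`, `run_demo`) and all parameter values are hypothetical. The idea is that if the experimental sample contains no signal, a classifier trained to separate background from experimental data should achieve a held-out AUC near 0.5.

```python
import math
import random


def predict(w, b, x):
    """Logistic score in (0, 1) for one feature vector."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Logistic regression by batch gradient descent; returns (weights, bias)."""
    d, n = len(X[0]), len(X)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def average_ranks(scores):
    """1-based ranks of the scores, with tied scores sharing their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def auc_from_ranks(ranks, labels):
    """AUC through the Mann-Whitney rank-sum identity (label 1 = experimental)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    rank_sum = sum(r for r, l in zip(ranks, labels) if l == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


def auc_permutation_test(scores, labels, n_perm=200, seed=0):
    """Observed held-out AUC and a one-sided permutation p-value."""
    rng = random.Random(seed)
    ranks = average_ranks(scores)
    observed = auc_from_ranks(ranks, labels)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if auc_from_ranks(ranks, shuffled) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


def draw(n, signal_fraction, rng):
    """Toy data (assumption): N(0, 1) background, signal centred at (2, 2)."""
    points = []
    for _ in range(n):
        if rng.random() < signal_fraction:
            points.append([rng.gauss(2.0, 0.5), rng.gauss(2.0, 0.5)])
        else:
            points.append([rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)])
    return points


def run_demo(seed=0, n=300, signal_fraction=0.3):
    """Train on one half, then test the held-out AUC against the null of 0.5."""
    rng = random.Random(seed)
    bg_train, bg_test = draw(n, 0.0, rng), draw(n, 0.0, rng)
    ex_train = draw(n, signal_fraction, rng)
    ex_test = draw(n, signal_fraction, rng)
    w, b = fit_logistic(bg_train + ex_train, [0] * n + [1] * n)
    scores = [predict(w, b, x) for x in bg_test + ex_test]
    return auc_permutation_test(scores, [0] * n + [1] * n, seed=seed)
```

Calling `run_demo()` returns the held-out AUC and its permutation p-value. Permuting only the held-out labels keeps the trained classifier fixed, so the test is valid conditional on the training split; the paper's own calibration of the AUC, LRT, and MCE statistics may differ from this sketch.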




Read also

This paper introduces new tests of fundamental physics by means of the analysis of disturbances on the GNSS signal propagation. We show how the GNSS signals are sensitive to a space variation of the fine structure constant $\alpha$ in a generic framework of effective scalar field theories beyond the Standard Model. This effective variation may originate from the crossing of the RF signals with dark matter clumps and/or solitonic structures. At the macroscopic scale, the subsequent disturbances are equivalent to those which occur during the propagation in an inhomogeneous medium. We thus propose an interpretation of the measure of the vacuum permeability as a test of fundamental physics. We show the relevance of our approach by a first quantification of the expected signature in a simple model of a variation of $\alpha$ according to a planar geometry. We use a test-bed model of domain walls for that purpose and focus on the measurable time delay in the GNSS signal carrier.
Lucio Anderlini, 2015
Density Estimation Trees can play an important role in exploratory data analysis for multidimensional, multi-modal data models of large samples. I briefly discuss the algorithm, a self-optimization technique based on kernel density estimation, and some applications in High Energy Physics.
A finite mixture model is used to learn trends from the currently available data on coronavirus (COVID-19). Data on the number of confirmed COVID-19 related cases and deaths for European countries and the United States (US) are explored. A semi-supervised clustering approach with positive equivalence constraints is used to incorporate country and state information into the model. The analysis of trends in the rates of cases and deaths is carried out jointly using a mixture of multivariate Gaussian non-linear regression models with a mean trend specified using a generalized logistic function. The optimal number of clusters is chosen using the Bayesian information criterion. The resulting clusters provide insight into different mitigation strategies adopted by US states and European countries. The obtained results help identify the current relative standing of individual states and show a possible future if they continue with the chosen mitigation technique.
We propose a new scientific application of unsupervised learning techniques to boost our ability to search for new phenomena in data, by detecting discrepancies between two datasets. These could be, for example, a simulated standard-model background, and an observed dataset containing a potential hidden signal of New Physics. We build a statistical test upon a test statistic which measures deviations between two samples, using a Nearest Neighbors approach to estimate the local ratio of the density of points. The test is model-independent and non-parametric, requiring no knowledge of the shape of the underlying distributions, and it does not bin the data, thus retaining full information from the multidimensional feature space. As a proof-of-concept, we apply our method to synthetic Gaussian data, and to a simulated dark matter signal at the Large Hadron Collider. Even in the case where the background cannot be simulated accurately enough to claim discovery, the technique is a powerful tool to identify regions of interest for further study.
We reframe common tasks in jet physics in probabilistic terms, including jet reconstruction, Monte Carlo tuning, matrix element - parton shower matching for large jet multiplicity, and efficient event generation of jets in complex, signal-like regions of phase space. We also introduce Ginkgo, a simplified, generative model for jets, that facilitates research into these tasks with techniques from statistics, machine learning, and combinatorial optimization. We review some of the recent research in this direction that has been enabled with Ginkgo. We show how probabilistic programming can be used to efficiently sample the showering process, how a novel trellis algorithm can be used to efficiently marginalize over the enormous number of clustering histories for the same observed particles, and how dynamic programming, A* search, and reinforcement learning can be used to find the maximum likelihood clustering in this enormous search space. This work builds bridges with work in hierarchical clustering, statistics, combinatorial optimization, and reinforcement learning.
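The nearest-neighbors two-sample test mentioned among the related abstracts (the one estimating a local density ratio between a benchmark and a trial sample) can be sketched as follows. This is an illustrative assumption rather than that paper's code: it uses the standard k-nearest-neighbor density estimate, for which the log density ratio at a trial point reduces to a log ratio of neighbor distances, and it calibrates the statistic by permuting the pooled sample. The names `knn_test_statistic` and `permutation_p_value` and the defaults `k=5`, `n_perm=100` are hypothetical.

```python
import math
import random


def kth_nn_distance(x, points, k, skip_index=None):
    """Distance from x to its k-th nearest neighbour in `points`.

    `skip_index` excludes x itself when x belongs to `points`.
    """
    dists = sorted(
        math.dist(x, p) for i, p in enumerate(points) if i != skip_index
    )
    return dists[k - 1]


def knn_test_statistic(trial, benchmark, k=5):
    """Mean log ratio of kNN density estimates (trial over benchmark).

    Assumes continuous data without duplicate points, so no distance is zero.
    """
    dim = len(trial[0])
    n_t, n_b = len(trial), len(benchmark)
    total = 0.0
    for j, x in enumerate(trial):
        r_t = kth_nn_distance(x, trial, k, skip_index=j)
        r_b = kth_nn_distance(x, benchmark, k)
        total += math.log(r_b / r_t)
    return dim * total / n_t + math.log(n_b / (n_t - 1))


def permutation_p_value(trial, benchmark, k=5, n_perm=100, seed=0):
    """Observed statistic and a one-sided permutation p-value."""
    rng = random.Random(seed)
    observed = knn_test_statistic(trial, benchmark, k)
    pooled = list(trial) + list(benchmark)
    n_t = len(trial)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if knn_test_statistic(pooled[:n_t], pooled[n_t:], k) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)
```

Because the statistic uses only distances between points, it needs no binning and no parametric model of either sample, which matches the model-independent, non-parametric framing in that abstract. The brute-force neighbor search here scales quadratically and is only suitable for small samples.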