We describe likelihood-based statistical tests for use in high energy physics for the discovery of new phenomena and for construction of confidence intervals on model parameters. We focus on the properties of the test procedures that allow one to account for systematic uncertainties. Explicit formulae for the asymptotic distributions of test statistics are derived using results of Wilks and Wald. We motivate and justify the use of a representative data set, called the Asimov data set, which provides a simple method to obtain the median experimental sensitivity of a search or measurement as well as fluctuations about this expectation.
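As a minimal sketch of the Asimov idea, assume a single-bin counting experiment with expected signal s and exactly known background b, with no systematic uncertainties. Evaluating the discovery test statistic q0 on the Asimov data set n = s + b then gives an estimate of the median significance. The function name and the numerical inputs are illustrative assumptions, not taken from the abstract.

```python
import math


def asimov_discovery_significance(s, b):
    """Median discovery significance for a counting experiment.

    Assumes a single bin, expected signal s >= 0 and a background b > 0
    that is known exactly (no nuisance parameters).
    """
    if b <= 0 or s < 0:
        raise ValueError("require b > 0 and s >= 0")
    n = s + b  # Asimov data set: the observation is set to its expectation
    # q0 = -2 ln[L(background only) / L(best fit)] evaluated at n = s + b
    q0 = 2.0 * (n * math.log(n / b) - s)
    return math.sqrt(q0)


# Illustrative inputs (hypothetical): s = 10 expected signal events, b = 100
z_asimov = asimov_discovery_significance(10.0, 100.0)
z_simple = 10.0 / math.sqrt(100.0)  # the familiar s / sqrt(b) estimate
print(f"Asimov Z = {z_asimov:.4f}, s/sqrt(b) = {z_simple:.4f}")
```

For s much smaller than b the two estimates agree closely; they diverge as s becomes comparable to b, where s/sqrt(b) overstates the significance.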
The paper "Asymptotic formulae for likelihood-based tests of new physics" presents a mathematical formalism for a new approximation to hypothesis testing in high energy physics. The approximations are designed to greatly reduce the computational burden of such problems. We test the conditions under which these approximations remain valid. To do so, we perform parallel calculations for a range of scenarios and compare the full calculation with the approximations to determine their limits and robustness. As the reference, we use values calculated with the Collie framework, which we assume for this analysis to produce the true values.
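The kind of comparison described above can be sketched for a toy case, assuming a single-bin counting experiment with known background. The asymptotic p-value, taken from the half chi-squared distribution of q0, is compared with a p-value from pseudo-experiments. This does not use the Collie framework; the toy generator stands in for the full calculation, and all names and numbers are hypothetical.

```python
import math
import random


def q0(n, b):
    """Discovery test statistic for n observed events and known background b."""
    if n <= b:
        return 0.0  # downward fluctuations are not evidence for a signal
    return 2.0 * (n * math.log(n / b) - (n - b))


def poisson_sample(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (small means)."""
    limit = math.exp(-mean)
    k = 0
    product = rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def asymptotic_p_value(q_obs):
    """p-value from the asymptotic formula p = 1 - Phi(sqrt(q0))."""
    return 0.5 * math.erfc(math.sqrt(q_obs) / math.sqrt(2.0))


def toy_p_value(q_obs, b, n_toys, seed=12345):
    """Fraction of background-only pseudo-experiments with q0 >= q_obs."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_toys) if q0(poisson_sample(rng, b), b) >= q_obs)
    return hits / n_toys


# Hypothetical scenario: b = 50 expected background events, 65 observed
background, observed = 50.0, 65
q_obs = q0(observed, background)
p_asym = asymptotic_p_value(q_obs)
p_toy = toy_p_value(q_obs, background, n_toys=20000)
print(f"q0 = {q_obs:.3f}, asymptotic p = {p_asym:.4f}, toy p = {p_toy:.4f}")
```

Repeating the comparison while lowering the background shows where the discreteness of the Poisson distribution makes the asymptotic result less reliable.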
We present the asymptotic distribution for two-sided tests based on the profile likelihood ratio with lower and upper boundaries on the parameter of interest. This situation is relevant for branching ratios and the elements of unitary matrices such as the CKM matrix.
GooStats is a software framework that provides a flexible environment and common tools to implement multivariate statistical analyses. The framework is built upon the CERN ROOT, MINUIT and GooFit packages. Running a multivariate analysis in parallel on graphics processing units yields a large gain in performance and opens new possibilities. The design and benchmarks of GooStats are presented in this article, along with illustrations of its application to statistical problems.
We consider whether the asymptotic distributions for the log-likelihood ratio test statistic are expected to be Gaussian or chi-squared. Two straightforward examples provide insight into the difference.
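One way to see the distinction is with a toy simulation, sketched here under the assumption of a single Gaussian measurement x with unit variance; this setup is chosen for illustration and is not necessarily one of the examples in the paper. For two simple hypotheses (mu = 0 versus mu = 1) the statistic is linear in x and therefore Gaussian. When the alternative is composite and mu is fitted, the statistic is x squared, which follows a chi-squared distribution with one degree of freedom.

```python
import random


def simulate_llr(n_toys=20000, seed=1):
    """Generate both statistics under the null hypothesis mu = 0."""
    rng = random.Random(seed)
    q_simple, q_composite = [], []
    for _ in range(n_toys):
        x = rng.gauss(0.0, 1.0)
        # Simple vs simple: -2 ln[L(mu=1) / L(mu=0)] = (x - 1)^2 - x^2 = 1 - 2x
        q_simple.append((x - 1.0) ** 2 - x ** 2)
        # Simple vs composite: -2 ln[L(mu=0) / L(mu_hat)] with mu_hat = x
        q_composite.append(x ** 2)
    return q_simple, q_composite


q_simple, q_composite = simulate_llr()
n = len(q_simple)

# The Gaussian statistic takes negative values; the chi-squared one cannot.
frac_negative = sum(1 for q in q_simple if q < 0.0) / n
# 3.841 is the 95% quantile of a chi-squared distribution with one degree of freedom.
frac_above = sum(1 for q in q_composite if q > 3.841) / n
mean_composite = sum(q_composite) / n

print(f"fraction of negative simple-vs-simple values: {frac_negative:.3f}")
print(f"fraction of composite values above 3.841:     {frac_above:.3f}")
print(f"mean of composite statistic:                  {mean_composite:.3f}")
```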
We present an introduction to some concepts of Bayesian data analysis in the context of atomic physics. Starting from the basic rules of probability, we present Bayes' theorem and its applications. In particular, we discuss how to calculate simple and joint probability distributions and the Bayesian evidence, a model-dependent quantity that allows probabilities to be assigned to different hypotheses from the analysis of the same data set. To give some practical examples, these methods are applied to two concrete cases. In the first example, the presence or absence of a satellite line in an atomic spectrum is investigated. In the second example, we determine the most probable model among a set of possible profiles from the analysis of a low-statistics spectrum. We also show how to calculate the probability distribution of the main spectral component without having to choose a unique model of the spectrum. For these two studies, we use the program Nested fit to calculate the different probability distributions and other related quantities. Nested fit is a Fortran90/Python code developed in recent years for the analysis of atomic spectra. As indicated by the name, it is based on the nested sampling algorithm, which is presented in detail together with the program itself.
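The role of the Bayesian evidence in the first example can be sketched with a toy spectrum. The evidence is computed here by direct numerical integration, not with Nested fit or nested sampling. The counts, the known flat background, the channel of the candidate line and the uniform prior on its amplitude are all hypothetical choices made for illustration.

```python
import math


def log_poisson(n, mu):
    """Logarithm of the Poisson probability of n counts for expectation mu."""
    return n * math.log(mu) - mu - math.lgamma(n + 1)


def log_likelihood(counts, background, amplitude, line_channel):
    """Flat background in every channel plus a line confined to one channel."""
    total = 0.0
    for i, n in enumerate(counts):
        mu = background + (amplitude if i == line_channel else 0.0)
        total += log_poisson(n, mu)
    return total


def evidence_with_line(counts, background, line_channel, a_max, steps=2000):
    """Evidence for the 'line present' model, with a uniform amplitude prior
    on [0, a_max], integrated with the trapezoidal rule."""
    h = a_max / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        like = math.exp(log_likelihood(counts, background, k * h, line_channel))
        total += weight * like
    return total * h / a_max  # the prior density is 1 / a_max


# Hypothetical spectrum: six channels, candidate satellite line in channel 3
counts = [4, 6, 5, 14, 5, 3]
background = 5.0

# The 'no line' model has no free parameter, so its evidence is the likelihood.
z_no_line = math.exp(log_likelihood(counts, background, 0.0, 3))
z_line = evidence_with_line(counts, background, 3, a_max=30.0)

# Posterior model probabilities, assuming equal prior probabilities.
p_line = z_line / (z_line + z_no_line)
p_no_line = z_no_line / (z_line + z_no_line)
print(f"Bayes factor (line / no line) = {z_line / z_no_line:.1f}")
print(f"P(line | data) = {p_line:.3f}")
```

Direct integration is only practical for one or two parameters; nested sampling addresses the same integral when the parameter space is larger.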