The statistical methods used in deriving physics results in the BaBar collaboration are reviewed, with particular emphasis on areas where practice is not uniform across particle physics.
The many ways in which machine learning and deep learning are transforming the analysis and simulation of data in particle physics are reviewed. The main methods, based on boosted decision trees and various types of neural networks, are introduced, and cutting-edge applications in the experimental and theoretical/phenomenological domains are highlighted. After describing the challenges in applying these novel analysis techniques, the review concludes by discussing the interaction between physics and machine learning as a two-way street that enriches both disciplines and helps to meet the present and future challenges of data-intensive science at the energy and intensity frontiers.
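As a minimal illustration of the boosted-decision-tree approach named above (a sketch on a hypothetical two-Gaussian toy dataset, not code from the review), the following trains a gradient-boosted classifier to separate "signal" from "background" and evaluates its discriminating power:

```python
# Minimal sketch: signal/background separation with a boosted decision
# tree (BDT). The two-Gaussian toy dataset is hypothetical.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5000
# Toy "signal" and "background": overlapping Gaussians in two variables.
signal = rng.normal(loc=[1.0, 1.0], scale=1.0, size=(n, 2))
background = rng.normal(loc=[0.0, 0.0], scale=1.0, size=(n, 2))
X = np.vstack([signal, background])
y = np.concatenate([np.ones(n), np.zeros(n)])

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
bdt = GradientBoostingClassifier(n_estimators=200, max_depth=3)
bdt.fit(X_train, y_train)

# The BDT output can be cut on like any other discriminating variable.
scores = bdt.predict_proba(X_test)[:, 1]
print("ROC AUC:", roc_auc_score(y_test, scores))
```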
Event occurrence is not only subject to environmental changes but is also facilitated by events that have already occurred in a system. Here, we develop a method for estimating such extrinsic and intrinsic factors from a single series of event-occurrence times. The analysis is performed using a model that combines the inhomogeneous Poisson process and the Hawkes process, which represent exogenous fluctuations and endogenous chain-reaction mechanisms, respectively. The model is fitted to a given dataset by minimizing the free energy, for which statistical physics and a path-integral method are utilized. Because the process of event occurrence is stochastic, parameter estimation is inevitably accompanied by errors, and exogenous and endogenous factors may ultimately not be captured even with the best estimator. We obtained four regimes, categorized according to whether the respective factors are detected. By applying the analytical method to a real time series of debate on a social-networking service, we observed that the estimated exogenous and endogenous factors correspond closely to the first comments and the follow-up comments, respectively. This method is general and applicable to a variety of data, and we provide an application program with which anyone can analyze any series of event times.
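For concreteness, a standard way to write such a combined model (the exponential triggering kernel below is an assumption for illustration; the paper's exact parameterization may differ) expresses the conditional intensity as the sum of an exogenous inhomogeneous Poisson rate and an endogenous Hawkes term, together with the log-likelihood used in fitting:

```latex
\lambda(t \mid H_t) \;=\; \underbrace{\mu(t)}_{\text{exogenous}}
  \;+\; \underbrace{\sum_{t_i < t} \alpha\, e^{-\beta (t - t_i)}}_{\text{endogenous}},
\qquad
\log L \;=\; \sum_{i=1}^{n} \log \lambda(t_i \mid H_{t_i})
  \;-\; \int_{0}^{T} \lambda(t \mid H_t)\, \mathrm{d}t ,
```

where $H_t$ is the event history up to time $t$, $\mu(t)$ the time-varying exogenous rate, $\alpha$ the strength of self-excitation, and $\beta$ the decay rate of the triggering kernel.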
\texttt{GooStats} is a software framework that provides a flexible environment and common tools for implementing multivariate statistical analyses. The framework is built upon the \texttt{CERN ROOT}, \texttt{MINUIT}, and \texttt{GooFit} packages. Running a multivariate analysis in parallel on graphics processing units yields a large boost in performance and opens new possibilities. The design and benchmarks of \texttt{GooStats} are presented in this article, along with illustrations of its application to statistical problems.
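The \texttt{GooStats} API itself is not shown in this abstract; as a generic CPU stand-in (all names and the toy model below are hypothetical), the following sketch performs the kind of likelihood fit that GooFit-style frameworks parallelize over GPU threads, with SciPy's minimizer playing the role of \texttt{MINUIT}:

```python
# Generic stand-in for a likelihood fit (toy model, not the GooStats
# API): fit a Gaussian signal fraction over a flat background in one
# observable by minimizing the negative log-likelihood.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
lo, hi = 0.0, 10.0
data = np.concatenate([
    rng.normal(5.0, 0.5, 300),   # "signal" events
    rng.uniform(lo, hi, 700),    # "background" events
])

def nll(params):
    f_sig, mean, sigma = params
    pdf = (f_sig * norm.pdf(data, mean, sigma)
           + (1.0 - f_sig) / (hi - lo))  # normalized flat background
    return -np.sum(np.log(pdf))

result = minimize(nll, x0=[0.5, 4.0, 1.0],
                  bounds=[(0.01, 0.99), (lo, hi), (0.05, 5.0)])
print("fitted (f_sig, mean, sigma):", result.x)
```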
Modern analysis of high-energy physics (HEP) data needs advanced statistical tools to separate signal from background. A C++ package has been implemented to provide such tools for the HEP community. The package includes linear and quadratic discriminant analysis, decision trees, bump hunting (PRIM), boosting (AdaBoost), bagging, and random-forest algorithms, as well as interfaces to the standard backpropagation and radial-basis-function neural networks implemented in the Stuttgart Neural Network Simulator. Supplemental tools such as the bootstrap, estimation of data moments, and a test of zero correlation between two variables with a joint elliptical distribution are also provided. The package offers a convenient set of tools for imposing requirements on input data and displaying output. Although integrated in the BaBar computing environment, the package maintains a minimal set of external dependencies and can therefore easily be adapted to other environments. It has been tested on many idealized and realistic examples.
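The package itself is written in C++; purely as an illustration of one of the listed techniques (a hedged sketch on a hypothetical toy dataset, not the package's interface), the following applies AdaBoost with decision-tree weak learners to a problem that a single linear discriminant would miss:

```python
# Illustrative only: AdaBoost, one of the methods listed above, on a
# hypothetical toy dataset. This mirrors the technique, not the C++
# package's interface. The default weak learner in scikit-learn is a
# depth-1 decision tree (a "stump").
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

rng = np.random.default_rng(2)
n = 2000
X = rng.normal(size=(n, 3))
# Nonlinear class boundary: the product term defeats linear discriminants.
y = (X[:, 0] * X[:, 1] + 0.5 * X[:, 2] > 0).astype(int)

ada = AdaBoostClassifier(n_estimators=300)
ada.fit(X, y)
print("training accuracy:", ada.score(X, y))
```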
We present a procedure for reconstructing particle cascades from event data measured in a high energy physics experiment. For evaluating the hypothesis of a specific physics process causing the observed data, all possible reconstructi