
HistFitter software framework for statistical data analysis

Added by Max Baak
Publication date: 2014
Language: English





We present a software framework for statistical data analysis, called HistFitter, that has been used extensively by the ATLAS Collaboration to analyze big datasets originating from proton-proton collisions at the Large Hadron Collider at CERN. Since 2012 HistFitter has been the standard statistical tool in searches for supersymmetric particles performed by ATLAS. HistFitter is a programmable and flexible framework to build, book-keep, fit, interpret and present results of data models of nearly arbitrary complexity. Starting from an object-oriented configuration, defined by users, the framework builds probability density functions that are automatically fitted to data and interpreted with statistical tests. A key innovation of HistFitter is its design, which is rooted in core analysis strategies of particle physics. The concepts of control, signal and validation regions are woven into its very fabric. These are progressively treated with statistically rigorous built-in methods. Being capable of working with multiple data models at once, HistFitter introduces an additional level of abstraction that allows for easy bookkeeping, manipulation and testing of large collections of signal hypotheses. Finally, HistFitter provides a collection of tools to present results with publication-quality style through a simple command-line interface.
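To make the configuration-driven workflow above concrete, the short Python sketch below mimics how control, signal and validation regions and multiple signal hypotheses can be booked from an object-oriented user configuration. All class, method and region names here (FitConfig, add_region, CR_ttbar, and so on) are hypothetical stand-ins invented for this illustration; they are not the actual HistFitter API, which should be taken from the HistFitter documentation.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in classes mirroring the concepts in the abstract
# (regions, samples, fit configurations); not the real HistFitter API.

@dataclass
class Region:
    name: str
    kind: str               # "control", "signal" or "validation"

@dataclass
class Sample:
    name: str
    is_signal: bool = False

@dataclass
class FitConfig:
    signal_hypothesis: str                   # e.g. one point of a signal grid
    regions: list = field(default_factory=list)
    samples: list = field(default_factory=list)

    def add_region(self, name, kind):
        self.regions.append(Region(name, kind))

    def add_sample(self, name, is_signal=False):
        self.samples.append(Sample(name, is_signal))

# One fit configuration per signal hypothesis; the extra level of abstraction
# lets many such configurations be booked and tested in a single run.
hypotheses = ["signal_point_A", "signal_point_B"]
configs = []
for hypo in hypotheses:
    cfg = FitConfig(signal_hypothesis=hypo)
    cfg.add_region("CR_ttbar", "control")      # constrain the background
    cfg.add_region("VR_ttbar", "validation")   # check the extrapolation
    cfg.add_region("SR_meff", "signal")        # test the signal hypothesis
    cfg.add_sample("ttbar")
    cfg.add_sample(hypo, is_signal=True)
    configs.append(cfg)

print(f"booked {len(configs)} fit configurations")
```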



Related research

We report on the status of GNA, a new framework for fitting large-scale physical models. GNA uses the data flow concept, in which a model is represented by a directed acyclic graph. Each node is an operation on an array (matrix multiplication, derivative or cross-section calculation, etc.). The framework enables the user to create flexible and efficient large-scale lazily evaluated models, handle large numbers of parameters, propagate parameter uncertainties while taking possible correlations between them into account, fit models, and perform statistical analysis. The main goal of the paper is to give an overview of the main concepts and methods, as well as the reasons behind their design. Detailed technical information will be published in further works.
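The data flow idea, a directed acyclic graph of array operations that is evaluated lazily, can be pictured with the minimal Python sketch below. The Node class and its methods are invented for this illustration and are not GNA's actual interfaces.

```python
import numpy as np

class Node:
    """One node of a lazily evaluated computation graph (illustrative only)."""
    def __init__(self, func, *inputs):
        self.func = func          # operation on arrays
        self.inputs = inputs      # parent nodes in the graph
        self._cache = None        # result, computed on demand
        self._dirty = True

    def invalidate(self):
        # Mark this node stale, e.g. after a parameter it depends on changed.
        self._dirty = True

    def value(self):
        # Evaluate only when requested, and only if marked stale (laziness).
        if self._dirty:
            args = [node.value() for node in self.inputs]
            self._cache = self.func(*args)
            self._dirty = False
        return self._cache

# Example graph: a matrix product of two constant leaf nodes.
a = Node(lambda: np.eye(3))
b = Node(lambda: np.arange(9.0).reshape(3, 3))
product = Node(lambda x, y: x @ y, a, b)
print(product.value())
```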
Yu Hu, Ling Li, Haolai Tian (2021)
Daisy (Data Analysis Integrated Software System) has been designed for the analysis and visualization of X-ray experiments. To address the extensive range of requirements of the Chinese radiation facilities community, from purely algorithmic problems to scientific computing infrastructure, Daisy sets up a cloud-native platform to support on-site data analysis services with fast feedback and interaction. The plug-in based application is convenient for processing the expected high-throughput data flow in parallel at next-generation facilities such as the High Energy Photon Source (HEPS). The objectives, functionality and architecture of Daisy are described in this article.
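The plug-in based design mentioned above can be pictured as a small registry in which analysis algorithms register under a name and are looked up and run by the platform. The sketch below is a generic Python illustration; the registry, decorator and algorithm names are invented and do not reflect Daisy's real interfaces.

```python
# Illustrative plug-in registry; names are invented, not Daisy's real API.
PLUGINS = {}

def register(name):
    """Decorator that adds an analysis algorithm class to the registry."""
    def wrapper(cls):
        PLUGINS[name] = cls
        return cls
    return wrapper

@register("azimuthal_integration")
class AzimuthalIntegration:
    def run(self, frame):
        # Placeholder for a real algorithm acting on a detector frame.
        return sum(frame)

def run_plugin(name, data):
    # The platform looks the algorithm up by name and executes it, so new
    # algorithms can be added without changing the core code.
    return PLUGINS[name]().run(data)

print(run_plugin("azimuthal_integration", [1, 2, 3]))
```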
J-PET Framework is an open-source software platform for data analysis, written in C++ and based on the ROOT package. It provides a common environment for the implementation of reconstruction, calibration and filtering procedures, as well as for user-level analyses of Positron Emission Tomography data. The library contains a set of building blocks that can be combined, even by users with little programming experience, into chains of processing tasks through a convenient, simple and well-documented API. The generic input-output interface allows processing data from various sources: low-level data from the tomography acquisition system or from diagnostic setups such as digital oscilloscopes, as well as high-level tomography structures, e.g. sinograms or a list of lines of response. Moreover, the environment can be interfaced with Monte Carlo simulation packages such as GEANT and GATE, which are commonly used in the medical scientific community.
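The building-block idea, small tasks combined into a chain of processing steps, can be sketched as follows. This is a minimal Python illustration of the concept only; the task and method names are invented here, and the real J-PET Framework is a C++ library with its own API.

```python
class Task:
    """One processing step in a chain (illustrative sketch of the
    building-block idea, not the J-PET Framework API)."""
    def process(self, event):
        raise NotImplementedError

class Calibrate(Task):
    def __init__(self, offset):
        self.offset = offset
    def process(self, event):
        # Apply a simple time calibration to the event.
        return {**event, "time": event["time"] - self.offset}

class FilterHits(Task):
    def __init__(self, max_time):
        self.max_time = max_time
    def process(self, event):
        # Reject events whose calibrated time exceeds the cut.
        return event if event["time"] <= self.max_time else None

def run_chain(events, tasks):
    # Feed each event through the chain; drop events a task rejects.
    for event in events:
        for task in tasks:
            event = task.process(event)
            if event is None:
                break
        else:
            yield event

events = [{"time": 5.2}, {"time": 1.3}]
print(list(run_chain(events, [Calibrate(0.3), FilterHits(2.0)])))
```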
The Polarimetric and Helioseismic Imager (PHI) is the first deep-space solar spectropolarimeter, on board the Solar Orbiter (SO) space mission. It faces stringent requirements on science data accuracy, a dynamic environment, and severe limitations on telemetry volume. SO/PHI overcomes these restrictions through on-board instrument calibration and science data reduction, using dedicated firmware in FPGAs. This contribution analyses the accuracy of a data processing pipeline by comparing the results obtained with SO/PHI hardware to a reference from a ground computer. The results show that, for the analysed pipeline, the error introduced by the firmware implementation is well below the requirements of SO/PHI.
ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored in a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can choose from a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, ROOT offers packages for complex data modeling and fitting, as well as multivariate classification based on machine learning techniques. A central piece of these analysis tools is the set of histogram classes, which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats such as PostScript and PDF, or in bitmap formats such as JPG or GIF. Results can also be stored as ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks, e.g. data mining in HEP, by using PROOF, which takes care of optimally distributing the work over the available resources in a transparent way.
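As a brief illustration of the histogramming, fitting and graphics-export workflow described above, the following PyROOT sketch books a histogram, fills it with pseudo-data, fits a Gaussian and saves the result. It assumes a working ROOT installation with Python bindings; the object and file names are arbitrary choices for this example.

```python
# Minimal PyROOT sketch of the histogram-and-fit workflow described above.
import ROOT

# Book a one-dimensional histogram and fill it with pseudo-data.
h = ROOT.TH1F("h_example", "Example distribution;x;Entries", 100, -5.0, 5.0)
h.FillRandom("gaus", 10000)          # fill from a built-in Gaussian shape

# Fit a Gaussian; option "S" returns the fit result object.
fit_result = h.Fit("gaus", "S")
print("fitted mean :", fit_result.Parameter(1))
print("fitted sigma:", fit_result.Parameter(2))

# Save publication-quality output and a macro that can recreate the plot.
canvas = ROOT.TCanvas("c", "c", 800, 600)
h.Draw()
canvas.SaveAs("example_fit.pdf")
canvas.SaveAs("example_fit.C")       # ROOT macro for later rework
```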
