When the data do not conform to the hypothesis of a known sampling variance, fitting a constant to a set of measured values is a long-debated problem. Given the data, fitting requires finding the most trustworthy value of the measurand. Bayesian inference is reviewed here as a way to assign probabilities to the possible measurand values. Different hypotheses about the data variance are tested by Bayesian model comparison. Finally, model selection is exemplified by deriving an estimate of the Planck constant.
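As a purely illustrative sketch of the kind of computation this abstract refers to (not the paper's actual analysis), the snippet below compares two hypotheses about the data variance through grid-based marginal likelihoods: in one model the quoted uncertainties are taken as exact, in the other they are all rescaled by an unknown common factor. The data values, prior ranges and grids are made up.

```python
import numpy as np

# Toy data: measured values with their quoted standard uncertainties (hypothetical numbers).
y = np.array([4.98, 5.03, 5.21, 4.91])
s = np.array([0.05, 0.04, 0.06, 0.05])

# Flat prior for the measurand mu, evaluated on a grid.
mu = np.linspace(4.5, 5.5, 2001)
dmu = mu[1] - mu[0]

def log_like(sigma):
    """Gaussian log-likelihood of the whole data set for every trial value of mu."""
    return np.sum(-0.5 * ((y[:, None] - mu) / sigma[:, None]) ** 2
                  - np.log(sigma[:, None] * np.sqrt(2.0 * np.pi)), axis=0)

def log_sum_exp(a):
    m = a.max()
    return m + np.log(np.exp(a - m).sum())

# Model 1: the quoted uncertainties are exact.
logZ1 = log_sum_exp(log_like(s)) + np.log(dmu / (mu[-1] - mu[0]))

# Model 2: every uncertainty is underestimated by a common unknown factor k (flat prior on k).
k = np.linspace(0.5, 5.0, 451)
dk = k[1] - k[0]
ll2 = np.concatenate([log_like(kk * s) for kk in k])
logZ2 = log_sum_exp(ll2) + np.log(dmu * dk / ((mu[-1] - mu[0]) * (k[-1] - k[0])))

print("log Bayes factor, exact vs rescaled uncertainties:", logZ1 - logZ2)
```

A positive log Bayes factor favours the model with exact uncertainties; a clearly negative one signals that the quoted variances are not trustworthy for the data at hand.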
We discuss the problem of extending data mining approaches to cases in which data points arise in the form of individual graphs. Being able to find the intrinsic low-dimensionality in ensembles of graphs can be useful in a variety of modeling contexts, especially when coarse-graining the detailed graph information is of interest. One of the main challenges in mining graph data is the definition of a suitable pairwise similarity metric in the space of graphs. We explore two practical solutions to this problem: one based on finding subgraph densities, and one using spectral information. The approach is illustrated on three test data sets (ensembles of graphs); two of these are obtained from standard graph generating algorithms, while the graphs in the third example are sampled as dynamic snapshots from an evolving network simulation.
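One hedged illustration of these two routes to a pairwise graph metric (the exact features and distance used in the paper may differ): Laplacian eigenvalue vectors as spectral descriptors and a crude subgraph-density descriptor, computed with networkx. The graph ensembles below are generic stand-ins from standard generators, not the paper's data sets.

```python
import numpy as np
import networkx as nx

def spectral_features(g, k=20):
    """Sorted Laplacian eigenvalues, truncated/padded to a fixed length k."""
    vals = np.sort(nx.laplacian_spectrum(g))
    out = np.zeros(k)
    out[:min(k, len(vals))] = vals[:k]
    return out

def subgraph_density_features(g):
    """Crude subgraph-density descriptors: edge density, triangle density, mean clustering."""
    n = g.number_of_nodes()
    triangles = sum(nx.triangles(g).values()) / 3
    tri_density = triangles / max(1, n * (n - 1) * (n - 2) / 6)
    return np.array([nx.density(g), tri_density, nx.average_clustering(g)])

# Ensemble of graphs from two standard generators (stand-ins for the test data sets).
graphs = [nx.erdos_renyi_graph(60, 0.08, seed=i) for i in range(50)] + \
         [nx.watts_strogatz_graph(60, 4, 0.1, seed=i) for i in range(50)]

# Pairwise distance matrix in the chosen feature space; this is what a manifold-learning
# step (e.g. diffusion maps or MDS) would consume to expose the low-dimensional structure.
feats = np.array([spectral_features(g) for g in graphs])
dists = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
print(dists.shape)
```

Swapping `spectral_features` for `subgraph_density_features` gives the density-based variant of the same pipeline.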
The Collaborative Analysis Versioning Environment System (CAVES) project concentrates on the interactions between users performing data- and/or computing-intensive analyses on large data sets, as encountered in many contemporary scientific disciplines. In modern science, increasingly large groups of researchers collaborate on a given topic over extended periods of time. The logging and sharing of knowledge about how analyses are performed or how results are obtained is important throughout the lifetime of a project. This is where virtual data concepts play a major role. The ability to seamlessly log, exchange and reproduce results, together with the methods, algorithms and computer programs used in obtaining them, qualitatively enhances the level of collaboration within a group or between groups in larger organizations. The CAVES project takes a pragmatic approach in assessing the needs of a community of scientists by building a series of prototypes of increasing sophistication. By extending the functionality of existing data analysis packages with virtual data capabilities, these prototypes provide an easy and familiar entry point for researchers to explore virtual data concepts in real-life applications and to provide valuable feedback for refining the system design. The architecture is modular, based on Web, Grid and other services that can be plugged in as desired. As a proof of principle, we build a first system by extending the popular data analysis framework ROOT, widely used in high-energy physics and other fields, making it virtual-data enabled.
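The abstract gives no implementation details, and the sketch below is not the CAVES or ROOT API; it only illustrates the "log, exchange and reproduce" idea with hypothetical helper names: record the commands and input-file hashes of an analysis session as a small provenance entry, so a collaborator can later verify the inputs and replay the commands.

```python
import hashlib, json, subprocess, time
from pathlib import Path

def file_digest(path):
    """Content hash of an input file, so a replay can verify it uses identical data."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()

def log_session(commands, inputs, logbook="analysis_log.json"):
    """Append one virtual-data-style provenance record: what was run, on what, and when."""
    entry = {
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "commands": commands,                       # e.g. macros or shell commands
        "inputs": {p: file_digest(p) for p in inputs},
    }
    log = json.loads(Path(logbook).read_text()) if Path(logbook).exists() else []
    log.append(entry)
    Path(logbook).write_text(json.dumps(log, indent=2))
    return entry

def replay(entry):
    """Re-run the logged commands after checking that the input data are unchanged."""
    for path, digest in entry["inputs"].items():
        assert file_digest(path) == digest, f"input {path} differs from the logged version"
    for cmd in entry["commands"]:
        subprocess.run(cmd, shell=True, check=True)
```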
We review methods for combining several measurements, given in the form of parameter values or $p$-values.
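Two textbook combinations of the kinds such a review typically covers, shown here only as an illustration with made-up numbers: the inverse-variance weighted mean for parameter values, and Fisher's method for $p$-values.

```python
import numpy as np
from scipy import stats

# Combining parameter estimates: inverse-variance weighted mean (toy values).
x = np.array([2.31, 2.46, 2.38])          # measured values
u = np.array([0.05, 0.09, 0.04])          # standard uncertainties
w = 1.0 / u**2
x_comb = np.sum(w * x) / np.sum(w)
u_comb = 1.0 / np.sqrt(np.sum(w))
print(f"combined value: {x_comb:.3f} +/- {u_comb:.3f}")

# Combining p-values: Fisher's method, chi-square with 2n degrees of freedom.
p = np.array([0.04, 0.20, 0.11])
chi2 = -2.0 * np.sum(np.log(p))
p_comb = stats.chi2.sf(chi2, df=2 * len(p))
print(f"Fisher combined p-value: {p_comb:.4f}")
```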
A new data analysis method is developed for the angle-resolving silicon telescope introduced at the neutron time-of-flight facility n_TOF at CERN. The telescope has already been used in measurements of several neutron-induced reactions with charged particles in the exit channel. The development of a highly detailed method is necessitated by the latest joint measurement of the $^{12}$C($n,p$) and $^{12}$C($n,d$) reactions from n_TOF. The reliable analysis of these data must account for the challenging nature of the involved reactions, as they are affected by multiple excited states in the daughter nuclei and characterized by anisotropic angular distributions of the reaction products. The unabridged analysis procedure aims at the separate reconstruction of all relevant reaction parameters - the absolute cross section, the branching ratios and the angular distributions - from the integral number of coincidence counts detected by the separate pairs of silicon strips. This procedure is tested under the specific conditions relevant for the $^{12}$C($n,p$) and $^{12}$C($n,d$) measurements from n_TOF, in order to assess its direct applicability to these experimental data. Based on the conclusions reached, the original method is adapted to the particular level of uncertainties in the input data.
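The abstract does not spell out the reconstruction machinery. As a hedged sketch, assume the expected coincidence counts in each strip pair are linear in the unknown parameters (for instance, Legendre coefficients of the angular distribution for each exit state) through a simulated response matrix, so that they can be recovered by a weighted least-squares fit; the response matrix, parameter values and counts below are placeholders, not the n_TOF data.

```python
import numpy as np

# Placeholder response matrix R[i, j]: expected counts in strip pair i per unit of
# parameter j (e.g. per unit Legendre coefficient of a given exit state); in practice
# this would come from a Monte Carlo simulation of the telescope geometry and efficiency.
rng = np.random.default_rng(0)
n_pairs, n_params = 30, 6
R = rng.uniform(0.1, 1.0, size=(n_pairs, n_params))

# Placeholder measured coincidence counts with Poisson fluctuations.
true_params = np.array([50.0, 10.0, -5.0, 30.0, 8.0, 2.0])
counts = rng.poisson(R @ true_params)

# Weighted least squares: minimise sum_i (counts_i - (R a)_i)^2 / var_i with var_i ~ counts_i.
var = np.maximum(counts, 1.0)
W = np.diag(1.0 / var)
a_hat = np.linalg.solve(R.T @ W @ R, R.T @ W @ counts)
cov = np.linalg.inv(R.T @ W @ R)
print("recovered parameters:", np.round(a_hat, 2))
print("uncertainties:      ", np.round(np.sqrt(np.diag(cov)), 2))
```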
The location-scale model appears in physics and chemistry in connection with the Birge ratio method for the adjustment of fundamental physical constants such as the Planck constant or the Newtonian constant of gravitation, while the random effects model is the commonly used approach for meta-analysis in medicine. These two competing models are used to enlarge the quoted uncertainties of the measurement results so as to make them consistent. The intrinsic Bayes factor (IBF) is derived for the comparison of the random effects model to the location-scale model, and we address the question of which model performs better for the determination of the Newtonian constant of gravitation. The results of the empirical illustration support the application of the Birge ratio method, which is currently used in the adjustment of the CODATA 2018 value for the Newtonian constant of gravitation together with its uncertainty. The results of the simulation study illustrate that the suggested procedure for model selection is decisive even when the data consist of only a few measurement results.
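As an illustration of the two adjustment strategies being compared (not of the intrinsic Bayes factor derivation itself), the sketch below computes the Birge-ratio rescaling and a DerSimonian-Laird random-effects estimate for a set of placeholder measurement results; the numbers are invented, not the CODATA input data for the Newtonian constant of gravitation.

```python
import numpy as np

# Placeholder measurement results x_i with standard uncertainties u_i.
x = np.array([6.6743, 6.6757, 6.6742, 6.6749, 6.6726])
u = np.array([0.0005, 0.0004, 0.0006, 0.0003, 0.0005])

w = 1.0 / u**2
x_bar = np.sum(w * x) / np.sum(w)                     # weighted mean
n = len(x)

# Birge ratio: if R_B > 1, the location-scale model inflates every u_i by the factor R_B.
R_B = np.sqrt(np.sum(w * (x - x_bar) ** 2) / (n - 1))
u_birge = np.sqrt(1.0 / np.sum(w)) * max(R_B, 1.0)

# Random effects model: DerSimonian-Laird estimate of the between-laboratory variance tau^2,
# which is added to each u_i^2 instead of rescaling them.
Q = np.sum(w * (x - x_bar) ** 2)
tau2 = max(0.0, (Q - (n - 1)) / (np.sum(w) - np.sum(w**2) / np.sum(w)))
w_re = 1.0 / (u**2 + tau2)
x_re = np.sum(w_re * x) / np.sum(w_re)
u_re = np.sqrt(1.0 / np.sum(w_re))

print(f"Birge ratio:    {x_bar:.5f} +/- {u_birge:.5f}  (R_B = {R_B:.2f})")
print(f"Random effects: {x_re:.5f} +/- {u_re:.5f}  (tau = {np.sqrt(tau2):.5f})")
```

The two estimates coincide when the data are consistent and diverge as the excess scatter grows, which is precisely the regime in which a model-selection criterion such as the IBF becomes relevant.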