We examine the construction of confidence intervals within the basic single-parameter, single-iteration variant of the method of quasi-optimal weights. Two kinds of distortions of such intervals caused by insufficiently large samples are studied, both amenable to analytical investigation. First, a criterion is developed for the validity of the assumption of asymptotic normality, together with a recipe for the corresponding corrections. Second, a method is derived to take into account the systematic shift of the confidence interval due to the non-linearity of the theoretical mean of the weight as a function of the parameter to be estimated. A numerical example illustrates both corrections.
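As an illustration of the basic procedure whose intervals the paper corrects: one equates the sample mean of a weight to its theoretical expectation and solves for the parameter, then builds a normal-approximation interval. The following minimal Python sketch does this for an assumed exponential model with an illustrative choice of weight; neither the model nor the weight is taken from the paper.

```python
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(0)

# Illustrative model (an assumption of this sketch): exponential with mean theta.
theta_true = 2.0
sample = rng.exponential(theta_true, size=50)

weight = lambda t: t            # a simple choice of weight w(t)
h = lambda theta: theta         # theoretical mean E_theta[w] for this model
h_prime = lambda theta: 1.0     # dh/dtheta, needed for error propagation

# Estimate: solve  mean(w) = h(theta)  for theta.
w_bar = weight(sample).mean()
theta_hat = brentq(lambda th: h(th) - w_bar, 1e-6, 1e3)

# Normal-approximation interval: Var(theta_hat) ~ Var(w) / (n * h'(theta)^2).
sigma = np.sqrt(weight(sample).var(ddof=1) / len(sample)) / abs(h_prime(theta_hat))
print(f"theta_hat = {theta_hat:.3f} +/- {1.96 * sigma:.3f}  (95%, normal approx)")
```

For small samples this interval degrades in exactly the two ways the abstract describes: the distribution of the sample mean of the weight deviates from a Gaussian, and a non-linear h(theta) shifts the interval.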
In a previous article we developed an approach to the optimal (minimum-variance, unbiased) statistical estimation technique for the equilibrium displacement of a damped harmonic oscillator in the presence of thermal noise. Here, we expand that work to include the optimal estimation of several linear parameters from a continuous time series. We show that working in the basis of the thermal driving force both simplifies the calculations and provides additional insight into why various approximate (non-optimal) estimation techniques perform as they do. To illustrate this point, we compare the variance of the optimal estimator that we derive for thermal noise with those of two approximate methods which, like the optimal estimator, suppress the contribution to the variance that would come from the irrelevant, resonant motion of the oscillator. We discuss how these methods fare when the dominant noise process is either white displacement noise or noise with a power spectral density inversely proportional to frequency ($1/f$ noise). We also construct, in the basis of the driving force, an estimator that performs well for a mixture of white noise and thermal noise. To find the optimal multi-parameter estimators for thermal noise, we derive and illustrate a generalization of traditional matrix methods for parameter estimation that can accommodate continuous data. We discuss how this approach may help refine the design of experiments, as it allows an exact, quantitative comparison of the precision of the estimated parameters under various data-acquisition and data-analysis strategies.
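For readers unfamiliar with the matrix methods being generalized, the discrete-data version is generalized least squares: with data $d = A\theta + n$ and noise covariance $C$, the minimum-variance unbiased estimator is $\hat{\theta} = (A^T C^{-1} A)^{-1} A^T C^{-1} d$ with covariance $(A^T C^{-1} A)^{-1}$. The Python sketch below applies this to an invented discretized time series; the design (offset plus drift) and the white-noise covariance are assumptions of the illustration, not the paper's models.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented discretized time series: offset + linear drift.
n = 200
t = np.linspace(0.0, 10.0, n)
A = np.column_stack([np.ones(n), t])   # design matrix for (offset, drift)
theta_true = np.array([0.5, -0.2])

# White displacement noise; a thermal-noise covariance would be non-diagonal.
C = 0.1 * np.eye(n)
d = A @ theta_true + rng.multivariate_normal(np.zeros(n), C)

# Generalized least squares: minimum-variance unbiased estimate and its covariance.
Cinv_A = np.linalg.solve(C, A)
cov = np.linalg.inv(A.T @ Cinv_A)
theta_hat = cov @ (Cinv_A.T @ d)
print("theta_hat =", theta_hat, " errors =", np.sqrt(np.diag(cov)))
```

The continuous-data generalization replaces the matrix sums by integrals against the noise correlation function; the discrete sketch fixes the notation.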
Variants of fluctuation theorems recently discovered in the statistical mechanics of non-equilibrium processes may be used for the efficient determination of high-dimensional integrals such as those typically occurring in Bayesian data analysis. In particular for multimodal distributions, Monte-Carlo procedures that do not rely on perfect equilibration are advantageous. We provide a comprehensive statistical error analysis for the determination of the prior-predictive value in a Bayesian problem, building on a variant of the Jarzynski equation. Special care is devoted to the characterization of the bias intrinsic to the method. We also discuss the determination of averages over multimodal posterior distributions with the help of a variant of the Crooks theorem. All our findings are verified by extensive numerical simulations of two model systems with bimodal likelihoods.
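A Jarzynski-type estimator of the prior-predictive value can be sketched as annealed importance sampling from prior to posterior: each path accumulates a "work" $W = -\sum_k \Delta\beta_k \ln L(x_k)$, and $Z = \langle e^{-W} \rangle$ over paths. The Python sketch below uses an invented one-dimensional bimodal likelihood as a stand-in for the paper's model systems; the finite-sample bias of the exponential average is precisely the effect the paper's error analysis characterizes.

```python
import numpy as np

rng = np.random.default_rng(2)

def log_prior(x):                       # N(0, 5^2) prior (illustrative choice)
    return -0.5 * (x / 5.0) ** 2 - np.log(5.0 * np.sqrt(2.0 * np.pi))

def log_like(x):                        # bimodal likelihood (illustrative choice)
    s = 0.5
    norm = np.log(2.0 * s * np.sqrt(2.0 * np.pi))
    return np.logaddexp(-0.5 * ((x - 3.0) / s) ** 2,
                        -0.5 * ((x + 3.0) / s) ** 2) - norm

betas = np.linspace(0.0, 1.0, 101)      # annealing schedule prior -> posterior
n_paths = 1000
x = rng.normal(0.0, 5.0, n_paths)       # equilibrium draws from the prior
work = np.zeros(n_paths)
db = betas[1] - betas[0]

for b_new in betas[1:]:
    work -= db * log_like(x)            # work increment for the beta switch
    # One Metropolis step at the new beta: imperfect equilibration is allowed.
    prop = x + rng.normal(0.0, 1.0, n_paths)
    log_acc = (log_prior(prop) + b_new * log_like(prop)) \
            - (log_prior(x) + b_new * log_like(x))
    jump = np.log(rng.uniform(size=n_paths)) < log_acc
    x[jump] = prop[jump]

# Jarzynski / AIS estimate of the prior-predictive value Z = <exp(-W)>.
print("log Z estimate:", np.log(np.mean(np.exp(-work))))
```

Because $e^{-W}$ is averaged before taking the logarithm, the estimator is biased for any finite number of paths even though $\langle e^{-W}\rangle$ is exact in expectation.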
Probability Density Estimation (PDE) is a multivariate discrimination technique based on sampling signal and background densities defined by event samples from data or Monte-Carlo (MC) simulations in a multi-dimensional phase space. In this paper, we present a modification of the PDE method that uses a self-adapting binning method to divide the multi-dimensional phase space into a finite number of hyper-rectangles (cells). The binning algorithm adjusts the size and position of a predefined number of cells inside the multi-dimensional phase space, minimising the variance of the signal and background densities inside the cells. The implementation of the binning algorithm, PDE-Foam, is based on the MC event-generation package Foam. We present performance results for representative examples (toy models) and discuss the dependence of the obtained results on the choice of parameters. The new PDE-Foam shows improved classification capability for small training samples and reduced classification time compared to the original PDE method based on range searching.
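To make the underlying discriminant concrete: once per-cell signal and background densities are available, an event is classified by the local density ratio $s/(s+b)$ in its cell. The Python sketch below uses a fixed uniform grid as a stand-in for PDE-Foam's self-adapting cells, and the Gaussian toy samples are assumptions of this illustration rather than the paper's toy models.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy 2D signal and background samples (illustrative only).
sig = rng.normal(loc=[+1.0, +1.0], scale=0.5, size=(5000, 2))
bkg = rng.normal(loc=[-1.0, -1.0], scale=1.0, size=(5000, 2))

# Uniform grid as a stand-in for Foam's adaptive hyper-rectangular cells.
edges = [np.linspace(-4.0, 4.0, 21)] * 2
h_sig, _ = np.histogramdd(sig, bins=edges)
h_bkg, _ = np.histogramdd(bkg, bins=edges)

def discriminant(x):
    """Local density ratio s/(s+b) in the cell containing x."""
    idx = tuple(int(np.clip(np.searchsorted(e, xi) - 1, 0, len(e) - 2))
                for e, xi in zip(edges, x))
    s, b = h_sig[idx], h_bkg[idx]
    return s / (s + b) if (s + b) > 0 else 0.5

print(discriminant([0.8, 1.2]))   # close to 1: signal-like region
```

The point of the adaptive binning is to spend cells where the densities vary, so that the same number of cells yields lower-variance density estimates than this uniform grid.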
The most accurate method of combining measurements from different experiments is to build a combined likelihood function and use it to perform the desired inference. This is not always possible, for various reasons, so approximate methods are often convenient. Among those, the best linear unbiased estimator (BLUE) is the most popular, allowing one to take into account individual uncertainties and their correlations. The method is unbiased by construction if the true uncertainties and their correlations are known, but it may exhibit a bias if uncertainty estimates are used in place of the true ones, in particular if those estimated uncertainties depend on the measured values. In such cases, an iterative application of the BLUE method can reduce the bias of the combined measurement.
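A minimal sketch of the standard BLUE combination (the numbers below are invented for illustration): the weights minimize the variance of a linear, unbiased combination, $w = C^{-1}\mathbf{1} / (\mathbf{1}^T C^{-1}\mathbf{1})$, with combined variance $1/(\mathbf{1}^T C^{-1}\mathbf{1})$. The iterative variant mentioned in the abstract rebuilds $C$ at the current combined value and repeats.

```python
import numpy as np

# Two measurements of the same quantity (invented numbers).
x = np.array([10.2, 11.0])
s = np.array([0.5, 0.7])                  # individual uncertainties
rho = 0.3                                 # assumed correlation
C = np.array([[s[0]**2,       rho*s[0]*s[1]],
              [rho*s[0]*s[1], s[1]**2      ]])

ones = np.ones_like(x)
Cinv_1 = np.linalg.solve(C, ones)
w = Cinv_1 / (ones @ Cinv_1)              # BLUE weights: sum to 1, minimum variance
x_comb = w @ x
sigma_comb = np.sqrt(1.0 / (ones @ Cinv_1))
print(f"combined: {x_comb:.3f} +/- {sigma_comb:.3f}, weights = {w}")
# If the uncertainties depend on the measured value, rebuild C at x_comb
# and repeat until the combined value converges (the iterative BLUE).
```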
In a statistical analysis in Particle Physics, nuisance parameters can be introduced to take into account various types of systematic uncertainties. The best estimate of such a parameter is often modeled as a Gaussian-distributed variable with a given standard deviation (the corresponding systematic error). Although the assigned systematic errors are usually treated as constants, in general they are themselves uncertain. A type of model is presented where the uncertainty in the assigned systematic errors is taken into account. Estimates of the systematic variances are modeled as gamma-distributed random variables. The resulting confidence intervals show interesting and useful properties. For example, when averaging measurements to estimate their mean, the size of the confidence interval increases for decreasing goodness-of-fit, and averages have reduced sensitivity to outliers. The basic properties of the model are presented and several examples relevant for Particle Physics are explored.
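A sketch of the qualitative mechanism behind the reduced outlier sensitivity: profiling over gamma-modeled variances replaces the quadratic residual terms of the Gaussian log-likelihood with logarithmic, heavy-tailed ones. The Python sketch below averages invented measurements with a profile likelihood of this general form; the parameter eps (the assumed relative uncertainty on each assigned error) and the data are illustrative assumptions, not values from the paper.

```python
import numpy as np
from scipy.optimize import minimize_scalar

y = np.array([10.0, 10.4, 9.8, 13.0])   # invented data; the last point is an outlier
v = np.full_like(y, 0.5 ** 2)           # assigned systematic variances
eps = 0.3                               # assumed relative error on each error estimate

def nll(mu):
    # In the Gaussian limit (eps -> 0) this reduces to 0.5 * sum((y-mu)^2 / v).
    # With uncertain variances the quadratic terms become logarithmic,
    # which tames the pull of outliers.
    return 0.5 * np.sum((1 + 1 / (2 * eps**2))
                        * np.log1p(2 * eps**2 * (y - mu)**2 / v))

mu_hat = minimize_scalar(nll, bounds=(8.0, 14.0), method="bounded").x
print(f"heavy-tailed average: {mu_hat:.3f}  vs  plain mean: {y.mean():.3f}")
```

The heavy-tailed average sits near the cluster of consistent points rather than being dragged toward the outlier, and the interval width grows when the points disagree, matching the behavior described in the abstract.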