The experimental search for new particles of unknown mass poses the challenge of exploring a wide interval to look for the usual signature, an excess of events above the background. A side effect of such a broad-range quest is that traditional significance calculations, valid for signals of known location, are no longer applicable when that information is missing. In this note the specific signal-search approach of observation windows sliding over the range of interest is considered; under the assumptions of known background and of fixed width of the exploring windows, the statistical implications of such a search scheme are described, with special emphasis on the correct significance assessment for a claimed discovery.
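The inflation of significance caused by scanning many window positions can be illustrated with a brute-force Monte Carlo sketch. All numbers below (bin count, background rate, window width) are illustrative assumptions, not values from the note:

```python
# Monte Carlo sketch of the look-elsewhere effect in a sliding-window search:
# simulate background-only spectra, slide a fixed-width window over them, and
# record the largest local excess seen anywhere in the search range.
import numpy as np

rng = np.random.default_rng(0)
n_bins, bkg_per_bin, window = 100, 10.0, 5   # assumed search setup
n_trials = 20000

def max_local_z(counts, b_win, window):
    """Largest local Gaussian-equivalent excess over all window positions."""
    sums = np.convolve(counts, np.ones(window, dtype=int), mode="valid")
    return np.max((sums - b_win) / np.sqrt(b_win))

b_win = bkg_per_bin * window
# Distribution of the maximum local significance under background only
z_max = np.array([
    max_local_z(rng.poisson(bkg_per_bin, n_bins), b_win, window)
    for _ in range(n_trials)
])

z_local = 3.0   # a 3-sigma excess somewhere in the scanned range...
p_global = np.mean(z_max >= z_local)
print(f"P(max local z >= {z_local}) = {p_global:.3f}")
```

The global probability of seeing a 3-sigma local fluctuation somewhere in the scanned range comes out far larger than the single-window tail probability of about 0.00135, which is exactly why a fixed-location significance formula cannot be reused unchanged.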
An algorithm for optimization of the signal significance, or any other classification figure of merit suited to the analysis of high-energy physics (HEP) data, is described. The algorithm trains decision trees on many bootstrap replicas of the training data, with each tree required to optimize the chosen figure of merit. New data are then classified by a simple majority vote of the built trees. The performance of this algorithm has been studied using a search for the radiative leptonic decay B->gamma l nu at BaBar and shown to be superior to that of all other attempted classifiers, including such powerful methods as boosted decision trees. In the B->gamma e nu channel, the described algorithm increases the expected signal significance from the 2.4 sigma obtained by the original method designed for the B->gamma l nu analysis to 3.0 sigma.
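The scheme of bootstrap-aggregated classifiers that each maximize a figure of merit such as s/sqrt(s+b) can be sketched as follows. This is not the BaBar implementation: for brevity the trees are reduced to one-dimensional decision stumps, and the toy data are invented for illustration:

```python
# Sketch: bagging classifiers that each optimize the signal significance
# s/sqrt(s+b) on a bootstrap replica, combined by simple majority vote.
import numpy as np

rng = np.random.default_rng(1)

def best_stump(x, y):
    """Threshold on a single feature maximizing s/sqrt(s+b) above the cut."""
    order = np.argsort(x)
    xs, ys = x[order], y[order]
    best_cut, best_fom = xs[0], -np.inf
    for cut in xs[::max(1, len(xs) // 50)]:   # coarse scan of candidate cuts
        sel = xs >= cut
        s, b = np.sum(ys[sel] == 1), np.sum(ys[sel] == 0)
        if s + b == 0:
            continue
        fom = s / np.sqrt(s + b)              # figure of merit being optimized
        if fom > best_fom:
            best_fom, best_cut = fom, cut
    return best_cut

def fit_forest(x, y, n_trees=25):
    """Train each stump on a bootstrap replica of the training data."""
    cuts = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(x), len(x))  # bootstrap resampling
        cuts.append(best_stump(x[idx], y[idx]))
    return np.array(cuts)

def predict(cuts, x):
    votes = (x[:, None] >= cuts[None, :]).sum(axis=1)
    return (votes > len(cuts) / 2).astype(int)  # simple majority vote

# Toy data: signal (label 1) shifted above background (label 0) in one variable
x = np.concatenate([rng.normal(0, 1, 1000), rng.normal(2, 1, 200)])
y = np.concatenate([np.zeros(1000, int), np.ones(200, int)])
cuts = fit_forest(x, y)
pred = predict(cuts, x)
sel_s, sel_b = np.sum(pred[y == 1]), np.sum(pred[y == 0])
print(f"selected: s={sel_s}, b={sel_b}, fom={sel_s/np.sqrt(sel_s+sel_b):.2f}")
```

On this toy sample the voted selection yields a clearly better s/sqrt(s+b) than accepting all events; the real algorithm applies the same bootstrap-and-vote logic with full, multi-variable decision trees.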
We discuss the traditional criterion for discovery in Particle Physics of requiring a significance corresponding to at least 5 sigma, and whether a more nuanced approach might be better.
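The 5 sigma convention corresponds to a one-sided Gaussian tail probability of about 2.87e-7; the correspondence between a significance in sigma and a p-value can be checked directly:

```python
# Convert a Gaussian-equivalent significance (in sigma) to a one-sided p-value.
import math

def p_value(z):
    """One-sided tail probability of a standard normal beyond z sigma."""
    return 0.5 * math.erfc(z / math.sqrt(2))

for z in (3.0, 5.0):
    print(f"{z} sigma -> p = {p_value(z):.3g}")
# 3 sigma -> p ~ 1.35e-3 ("evidence"), 5 sigma -> p ~ 2.87e-7 ("discovery")
```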
The main object of this work is to give a brief overview of the different ways entropy has been used in signal and image processing. After a short introduction of the different quantities related to entropy and to the maximum entropy principle, we will study their use in different fields of signal processing such as: source separation, model order selection, spectral estimation and, finally, general linear inverse problems. Keywords: Entropy, Relative entropy, Kullback distance, Mutual information, Spectral estimation, Inverse problems.
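Two of the quantities this overview builds on, the Shannon entropy and the Kullback (relative entropy) distance, are easily computed for discrete distributions; the distributions below are illustrative:

```python
# Shannon entropy and Kullback relative entropy for discrete distributions.
import math

def entropy(p):
    """Shannon entropy H(p) = -sum p_i log p_i (natural log)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def kullback(p, q):
    """Relative entropy D(p || q); assumes q_i > 0 wherever p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

uniform = [0.25] * 4
peaked = [0.7, 0.1, 0.1, 0.1]
print(entropy(uniform))           # log(4): uniform maximizes entropy
print(entropy(peaked))            # lower entropy for a concentrated law
print(kullback(peaked, uniform))  # >= 0, zero only when p == q
```

The maximum entropy principle then amounts to choosing, among all distributions compatible with the available constraints, the one with the largest entropy (here, the uniform one when nothing but normalization is imposed).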
Identifying frequencies with low signal-to-noise ratios in time series of stellar photometry and spectroscopy, and measuring their amplitude ratios and peak widths accurately, are critical goals for asteroseismology. These are also challenges for time series with gaps or whose data are not sampled at a constant rate, even with modern Discrete Fourier Transform (DFT) software. Moreover, the False-Alarm Probability introduced by Lomb and Scargle is an approximation that becomes less reliable in time series with longer data gaps. A rigorous statistical treatment of how to determine the significance of a peak in a DFT, called SigSpec, is presented here. SigSpec is based on an analytical solution for the probability that a DFT peak of a given amplitude does not arise from white noise in a non-equally spaced data set. The underlying Probability Density Function (PDF) of the amplitude spectrum generated by white noise can be derived explicitly if both frequency and phase are incorporated into the solution. In this paper, I define and evaluate an unbiased statistical estimator, the spectral significance, which depends on frequency, amplitude, and phase in the DFT, and which takes the time-domain sampling into account. I also compare this estimator to results from other well-established techniques and demonstrate the effectiveness of SigSpec with a few examples of ground- and space-based photometric data, illustrating how SigSpec deals with the effects of noise and time-domain sampling in determining significant frequencies.
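The quantity SigSpec evaluates analytically can be approximated by brute force: the probability that white noise on the same (gapped, unevenly spaced) time base produces a DFT amplitude as large as the observed one. The sketch below is a Monte Carlo illustration of that idea, not the SigSpec estimator itself, and the time base and signal are invented:

```python
# Monte Carlo false-alarm probability of a DFT peak on a gapped time base.
import numpy as np

rng = np.random.default_rng(2)

def dft_amplitude(t, x, freq):
    """Amplitude of the DFT of (t, x) at one frequency (cycles/unit time)."""
    phase = 2 * np.pi * freq * t
    c, s = np.dot(x, np.cos(phase)), np.dot(x, np.sin(phase))
    return 2.0 * np.hypot(c, s) / len(t)

# Gapped, non-equally spaced time base with a weak sinusoidal signal
t = np.sort(np.concatenate([rng.uniform(0, 8, 120), rng.uniform(12, 20, 120)]))
freq, sigma = 1.7, 1.0
x = 0.5 * np.sin(2 * np.pi * freq * t) + rng.normal(0, sigma, t.size)

a_obs = dft_amplitude(t, x, freq)
# White-noise simulations on the SAME time sampling as the data
a_noise = np.array([
    dft_amplitude(t, rng.normal(0, sigma, t.size), freq)
    for _ in range(5000)
])
fap = np.mean(a_noise >= a_obs)   # false-alarm probability at this frequency
print(f"observed amplitude {a_obs:.2f}, false-alarm probability {fap:.4f}")
```

Because the noise spectra are generated on the actual sampling, the resulting probability automatically accounts for gaps and uneven spacing, which is the effect the analytical treatment in the paper captures in closed form.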
Several experiments in high-energy physics and astrophysics can be treated as on/off measurements, where an observation potentially containing a new source or effect (on measurement) is contrasted with a background-only observation free of the effect (off measurement). In counting experiments, the significance of the new source or effect can be estimated with a widely used formula from [LiMa], which assumes that both measurements are Poisson random variables. In this paper we study three other cases: i) the ideal case where the background measurement has no uncertainty, which can be used to study the maximum sensitivity that an instrument can achieve; ii) the case where the background estimate $b$ in the off measurement has an additional systematic uncertainty; and iii) the case where $b$ is a Gaussian random variable instead of a Poisson random variable. The latter case applies when $b$ comes from a model fitted on archival or ancillary data, or from the interpolation of a function fitted on data surrounding the candidate new source/effect. Practitioners typically use in this case a formula which is valid only when $b$ is large and its uncertainty very small, while we derive a general formula that can be applied in all regimes. We also develop simple methods that can be used to assess how sensitive an estimate of significance is to systematic uncertainties on the efficiency or on the background. Examples of applications include the detection of short Gamma-Ray Bursts and of new X-ray or $\gamma$-ray sources.
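The baseline case the paper starts from, both on and off counts being Poisson, is handled by the well-known Li & Ma likelihood-ratio significance (their eq. 17). A short implementation, with illustrative toy counts:

```python
# Li & Ma (1983, eq. 17) significance for a Poisson on/off counting experiment;
# alpha is the ratio of the on-source to the off-source exposure.
import math

def li_ma_significance(n_on, n_off, alpha):
    """Gaussian-equivalent significance of the on-source excess."""
    if n_on == 0 or n_off == 0:
        return 0.0  # keep the sketch simple; limits need separate handling
    term_on = n_on * math.log((1 + alpha) / alpha * n_on / (n_on + n_off))
    term_off = n_off * math.log((1 + alpha) * n_off / (n_on + n_off))
    return math.sqrt(2.0 * (term_on + term_off))

# Toy numbers: 130 on-source counts, 500 off-source counts, alpha = 0.2,
# i.e. an expected background of alpha * n_off = 100 counts on source.
print(f"{li_ma_significance(130, 500, 0.2):.2f} sigma")  # -> 2.60 sigma
```

The cases studied in the paper replace the Poisson off measurement by a fixed, a systematically uncertain, or a Gaussian-distributed background estimate, so this formula no longer applies directly.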