No Arabic abstract
In this work we test the most widely used methods for fitting the composition fraction in data, namely maximum likelihood, $chi^2$, mean value of the distributions and mean value of the posterior probability function. We discuss the discrimination power of the four methods in different scenarios: signal to noise discrimination; two signals; and distributions of Xmax for mixed primary mass composition. We introduce a distance parameter, which can be used to estimate, as a rule of thumb, the precision of the discrimination. Finally, we conclude that the most reliable methods in all the studied scenarios are the maximum likelihood and the mean value of the posterior probability function.
Transient and variable phenomena in astrophysical sources are of particular importance to understand the underlying gamma-ray emission processes. In the very-high energy gamma-ray domain, transient and variable sources are related to charged particle acceleration processes that could for instance help understanding the origin of cosmic-rays. The imaging atmospheric Cherenkov technique used for gamma-ray astronomy above $sim 100$ GeV is well suited for detecting such events. However, the standard analysis methods are not optimal for such a goal and more sensitive methods are specifically developed in this publication. The sensitivity improvement could therefore be helpful to detect brief and faint transient sources such as Gamma-Ray Bursts.
Exoplanet research is carried out at the limits of the capabilities of current telescopes and instruments. The studied signals are weak, and often embedded in complex systematics from instrumental, telluric, and astrophysical sources. Combining repeated observations of periodic events, simultaneous observations with multiple telescopes, different observation techniques, and existing information from theory and prior research can help to disentangle the systematics from the planetary signals, and offers synergistic advantages over analysing observations separately. Bayesian inference provides a self-consistent statistical framework that addresses both the necessity for complex systematics models, and the need to combine prior information and heterogeneous observations. This chapter offers a brief introduction to Bayesian inference in the context of exoplanet research, with focus on time series analysis, and finishes with an overview of a set of freely available programming libraries.
In some cases, computational benefit can be gained by exploring the hyper parameter space using a deterministic set of grid points instead of a Markov chain. We view this as a numerical integration problem and make three unique contributions. First, we explore the space using low discrepancy point sets instead of a grid. This allows for accurate estimation of marginals of any shape at a much lower computational cost than a grid based approach and thus makes it possible to extend the computational benefit to a hyper parameter space with higher dimensionality (10 or more). Second, we propose a new, quick and easy method to estimate the marginal using a least squares polynomial and prove the conditions under which this polynomial will converge to the true marginal. Our results are valid for a wide range of point sets including grids, random points and low discrepancy points. Third, we show that further accuracy and efficiency can be gained by taking into consideration the functional decomposition of the integrand and illustrate how this can be done using anchored f-ANOVA on weighted spaces.
This paper describes the Bayesian Technique for Multi-image Analysis (BaTMAn), a novel image-segmentation technique based on Bayesian statistics that characterizes any astronomical dataset containing spatial information and performs a tessellation based on the measurements and errors provided as input. The algorithm iteratively merges spatial elements as long as they are statistically consistent with carrying the same information (i.e. identical signal within the errors). We illustrate its operation and performance with a set of test cases including both synthetic and real Integral-Field Spectroscopic data. The output segmentations adapt to the underlying spatial structure, regardless of its morphology and/or the statistical properties of the noise. The quality of the recovered signal represents an improvement with respect to the input, especially in regions with low signal-to-noise ratio. However, the algorithm may be sensitive to small-scale random fluctuations, and its performance in presence of spatial gradients is limited. Due to these effects, errors may be underestimated by as much as a factor of two. Our analysis reveals that the algorithm prioritizes conservation of all the statistically-significant information over noise reduction, and that the precise choice of the input data has a crucial impact on the results. Hence, the philosophy of BaTMAn is not to be used as a `black box to improve the signal-to-noise ratio, but as a new approach to characterize spatially-resolved data prior to its analysis. The source code is publicly available at http://astro.ft.uam.es/SELGIFS/BaTMAn .
We apply the Bayesian framework to assess the presence of a correlation between two quantities. To do so, we estimate the probability distribution of the parameter of interest, $rho$, characterizing the strength of the correlation. We provide an implementation of these ideas and concepts using python programming language and the pyMC module in a very short ($sim$130 lines of code, heavily commented) and user-friendly program. We used this tool to assess the presence and properties of the correlation between planetary surface gravity and stellar activity level as measured by the log($R_{mathrm{HK}}$) indicator. The results of the Bayesian analysis are qualitatively similar to those obtained via p-value analysis, and support the presence of a correlation in the data. The results are more robust in their derivation and more informative, revealing interesting features such as asymmetric posterior distributions or markedly different credible intervals, and allowing for a deeper exploration. We encourage the reader interested in this kind of problem to apply our code to his/her own scientific problems. The full understanding of what the Bayesian framework is can only be gained through the insight that comes by handling priors, assessing the convergence of Monte Carlo runs, and a multitude of other practical problems. We hope to contribute so that Bayesian analysis becomes a tool in the toolkit of researchers, and they understand by experience its advantages and limitations.