
On the Estimation of Confidence Intervals for Binomial Population Proportions in Astronomy: The Simplicity and Superiority of the Bayesian Approach

Posted by: Dr Ewan Cameron
Publication date: 2010
Research field: Physics
Paper language: English
Author: Ewan Cameron





I present a critical review of techniques for estimating confidence intervals on binomial population proportions inferred from success counts in small-to-intermediate samples. Population proportions arise frequently as quantities of interest in astronomical research; for instance, in studies aiming to constrain the bar fraction, AGN fraction, SMBH fraction, merger fraction, or red sequence fraction from counts of galaxies exhibiting distinct morphological features or stellar populations. However, two of the most widely used techniques for estimating binomial confidence intervals--the normal approximation and the Clopper & Pearson approach--are liable to misrepresent the degree of statistical uncertainty present under sampling conditions routinely encountered in astronomical surveys, leading to an ineffective use of the experimental data (and, worse, an inefficient use of the resources expended in obtaining those data). Hence, I provide here an overview of the fundamentals of binomial statistics with two principal aims: (i) to reveal the ease with which (Bayesian) binomial confidence intervals with more satisfactory behaviour may be estimated from the quantiles of the beta distribution using modern mathematical software packages (e.g. R, MATLAB, Mathematica, IDL, Python); and (ii) to demonstrate convincingly the major flaws of both the normal approximation and the Clopper & Pearson approach for error estimation.
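As a concrete illustration of aim (i), here is a minimal Python sketch, assuming a flat Beta(1, 1) prior, under which the posterior for the proportion after k successes in n trials is Beta(k + 1, n - k + 1); the function name and the default 68.3% level are choices made for this example, not notation from the paper:

```python
from scipy.stats import beta

def binomial_credible_interval(k, n, level=0.683, a_prior=1.0, b_prior=1.0):
    """Equal-tailed Bayesian credible interval for a binomial proportion.

    With a conjugate Beta(a_prior, b_prior) prior (Beta(1, 1) is flat),
    the posterior after observing k successes in n trials is
    Beta(k + a_prior, n - k + b_prior), so the interval follows directly
    from quantiles of the beta distribution.
    """
    tail = (1.0 - level) / 2.0
    posterior = beta(k + a_prior, n - k + b_prior)
    return posterior.ppf(tail), posterior.ppf(1.0 - tail)

# Example: 7 barred galaxies in a sample of 10
print(binomial_credible_interval(7, 10))
```

The same two quantile evaluations translate directly into the other packages listed above (e.g. qbeta in R, or betaincinv in MATLAB and scipy.special).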




Read also

As first suggested by U. Fano in the 1940s, the statistical fluctuation of the number of pairs produced in an ionizing interaction is known to be sub-Poissonian. The dispersion is reduced by the so-called Fano factor, which empirically encapsulates the correlations in the process of ionization. In modelling the energy response of an ionization measurement device, the effect of the Fano factor is commonly folded into the overall energy resolution. While such an approximate treatment is appropriate when a significant number of ionization pairs are expected to be produced, the Fano factor needs to be accounted for directly at the level of pair creation when only a few are expected. To do so, one needs a discrete probability distribution of the number of pairs created $N$ with independent control of both the expectation $\mu$ and Fano factor $F$. Although no distribution $P(N|\mu,F)$ with this convenient form exists, we propose the use of the COM-Poisson distribution together with strategies for utilizing it to effectively fulfill this need. We then use this distribution to assess the impact that the Fano factor may have on the sensitivity of low-mass WIMP search experiments.
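For illustration only, the following Python sketch evaluates the COM-Poisson pmf, $P(N=n) \propto \lambda^n/(n!)^\nu$, on a truncated support and checks its mean and Fano factor numerically; the parameter values here are arbitrary placeholders, and tuning $(\lambda, \nu)$ to hit a target $(\mu, F)$ pair would still require numerical strategies of the kind the abstract alludes to:

```python
import numpy as np
from scipy.special import gammaln

def com_poisson_pmf(n_max, lam, nu):
    """COM-Poisson pmf, P(N=n) proportional to lam**n / (n!)**nu,
    evaluated on n = 0..n_max and normalised by a truncated sum
    (adequate when the tail mass beyond n_max is negligible)."""
    n = np.arange(n_max + 1)
    log_w = n * np.log(lam) - nu * gammaln(n + 1)
    w = np.exp(log_w - log_w.max())  # stabilise before normalising
    return w / w.sum()

def mean_and_fano(pmf):
    """Expectation and Fano factor (variance/mean) of a pmf on 0..n_max."""
    n = np.arange(len(pmf))
    mu = np.sum(n * pmf)
    var = np.sum((n - mu) ** 2 * pmf)
    return mu, var / mu

# nu > 1 gives sub-Poissonian counts (F < 1), as required for ionization pairs
pmf = com_poisson_pmf(200, lam=30.0, nu=2.0)
print(mean_and_fano(pmf))
```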
We apply an empirical, data-driven approach for describing crop yield as a function of monthly temperature and precipitation by employing generative probabilistic models with parameters determined through Bayesian inference. Our approach is applied to state-scale maize yield and meteorological data for the US Corn Belt from 1981 to 2014 as an exemplar, but would be readily transferable to other crops, locations and spatial scales. Experimentation with a number of models shows that maize growth rates can be characterised by a two-dimensional Gaussian function of temperature and precipitation with monthly contributions accumulated over the growing period. This approach accounts for non-linear growth responses to the individual meteorological variables, and allows for interactions between them. Our models correctly identify that temperature and precipitation have the largest impact on yield in the six months prior to the harvest, in agreement with the typical growing season for US maize (April to September). Maximal growth rates occur for monthly mean temperature 18-19$^\circ$C, corresponding to a daily maximum temperature of 24-25$^\circ$C (in broad agreement with previous work) and monthly total precipitation 115 mm. Our approach also provides a self-consistent way of investigating climate change impacts on current US maize varieties in the absence of adaptation measures. Keeping precipitation and growing area fixed, a temperature increase of $2^\circ$C, relative to 1981-2014, results in the mean yield decreasing by 8%, while the yield variance increases by a factor of around 3. We thus provide a flexible, data-driven framework for exploring the impacts of natural climate variability and climate change on globally significant crops based on their observed behaviour. In concert with other approaches, this can help inform the development of adaptation strategies that will ensure food security under a changing climate.
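As a hedged sketch of the model structure described above (not the paper's fitted model), the snippet below accumulates monthly contributions from a two-dimensional Gaussian growth response in Python; the optima (18.5 deg C and 115 mm) are taken from the abstract, while the widths, amplitude, and example meteorology are invented placeholders:

```python
import numpy as np

def growth_rate(temp, precip, t0=18.5, p0=115.0, st=4.0, sp=40.0, g_max=1.0):
    """Illustrative 2D Gaussian growth response to monthly mean
    temperature (deg C) and monthly total precipitation (mm).
    All width/amplitude values are placeholders, not posterior estimates."""
    return g_max * np.exp(-0.5 * (((temp - t0) / st) ** 2
                                  + ((precip - p0) / sp) ** 2))

def season_yield(monthly_temp, monthly_precip):
    """Yield proxy: growth contributions accumulated over the season."""
    return sum(growth_rate(t, p) for t, p in zip(monthly_temp, monthly_precip))

# April-September example with made-up meteorology
temps = [12.0, 17.0, 21.0, 23.0, 22.0, 18.0]
precips = [90.0, 110.0, 120.0, 100.0, 95.0, 85.0]
print(season_yield(temps, precips))
```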
C. R. Jenkins (2011)
We discuss the use of the Bayesian evidence ratio, or Bayes factor, for model selection in astronomy. We treat the evidence ratio as a statistic and investigate its distribution over an ensemble of experiments, considering both simple analytical examples and some more realistic cases, which require numerical simulation. We find that the evidence ratio is a noisy statistic, and thus it may not be sensible to decide to accept or reject a model based solely on whether the evidence ratio reaches some threshold value. The odds suggested by the evidence ratio bear no obvious relationship to the power or Type I error rate of a test based on the evidence ratio. The general performance of such tests is strongly affected by the signal-to-noise ratio in the data, the assumed priors, and the threshold in the evidence ratio that is taken as `decisive'. The comprehensiveness of the model suite under consideration is also very important. The usefulness of the evidence ratio approach in a given problem can be assessed in advance of the experiment, using simple models and numerical approximations. In many cases, this approach can be as informative as a much more costly full-scale Bayesian analysis of a complex problem.
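The noisiness of the evidence ratio is easy to reproduce in a simple analytical example of the kind mentioned above; this toy Python setup (my own construction, not the paper's) simulates an ensemble of Gaussian-mean experiments generated under the simpler model M0 and reports the spread of the resulting log Bayes factors:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)

def bayes_factor_10(xbar, n, sigma0=1.0):
    """Bayes factor B10 for M1 (mu ~ N(0, sigma0^2)) against M0 (mu = 0),
    given n unit-variance Gaussian observations with sample mean xbar.
    The sample mean is sufficient here, so B10 reduces to the ratio of
    its marginal densities under the two models."""
    return (norm.pdf(xbar, 0.0, np.sqrt(sigma0**2 + 1.0 / n))
            / norm.pdf(xbar, 0.0, np.sqrt(1.0 / n)))

# Ensemble of experiments simulated under M0: B10 is itself a noisy statistic
n, trials = 20, 5000
xbars = rng.normal(0.0, np.sqrt(1.0 / n), size=trials)
log_b10 = np.log(bayes_factor_10(xbars, n))
print(np.percentile(log_b10, [5, 50, 95]))  # wide spread across the ensemble
```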
The maximum entropy principle can be used to assign utility values when only partial information is available about the decision maker's preferences. In order to obtain such utility values it is necessary to establish an analogy between probability and utility through the notion of a utility density function. According to some authors [Soofi (1990), Abbas (2006a,b), Sandow et al. (2006), Friedman and Sandow (2006), Darooneh (2006)] the maximum entropy utility solution embeds a large family of utility functions. In this paper we explore the maximum entropy principle to estimate the utility function of a risk-averse decision maker.
After the discovery of gravitational waves and the observation of neutrinos of cosmic origin, we have entered a new and exciting era where cosmic rays, neutrinos, photons and gravitational waves will be used simultaneously to study the highest energy phenomena in the Universe. Here we present a fully Bayesian approach to the challenge of combining and comparing the wealth of measurements from existing and upcoming experimental facilities. We discuss the procedure from a theoretical point of view and, using simulations, we also demonstrate the feasibility of the method by incorporating the use of information provided by different theoretical models and different experimental measurements.