ترغب بنشر مسار تعليمي؟ اضغط هنا

Relevant statistics for Bayesian model choice

469   0   0.0 ( 0 )
 نشر من قبل Christian P. Robert
 تاريخ النشر 2011
  مجال البحث علم الأحياء
والبحث باللغة English




اسأل ChatGPT حول البحث

The choice of the summary statistics used in Bayesian inference and in particular in ABC algorithms has bearings on the validation of the resulting inference. Those statistics are nonetheless customarily used in ABC algorithms without consistency checks. We derive necessary and sufficient conditions on summary statistics for the corresponding Bayes factor to be convergent, namely to asymptotically select the true model. Those conditions, which amount to the expectations of the summary statistics to asymptotically differ under both models, are quite natural and can be exploited in ABC settings to infer whether or not a choice of summary statistics is appropriate, via a Monte Carlo validation.


قيم البحث

اقرأ أيضاً

Approximate Bayesian computation (ABC) methods provide an elaborate approach to Bayesian inference on complex models, including model choice. Both theoretical arguments and simulation experiments indicate, however, that model posterior probabilities may be poorly evaluated by standard ABC techniques. We propose a novel approach based on a machine learning tool named random forests to conduct selection among the highly complex models covered by ABC algorithms. We thus modify the way Bayesian model selection is both understood and operated, in that we rephrase the inferential goal as a classification problem, first predicting the model that best fits the data with random forests and postponing the approximation of the posterior probability of the predicted MAP for a second stage also relying on random forests. Compared with earlier implementations of ABC model choice, the ABC random forest approach offers several potential improvements: (i) it often has a larger discriminative power among the competing models, (ii) it is more robust against the number and choice of statistics summarizing the data, (iii) the computing effort is drastically reduced (with a gain in computation efficiency of at least fifty), and (iv) it includes an approximation of the posterior probability of the selected model. The call to random forests will undoubtedly extend the range of size of datasets and complexity of models that ABC can handle. We illustrate the power of this novel methodology by analyzing controlled experiments as well as genuine population genetics datasets. The proposed methodologies are implemented in the R package abcrf available on the CRAN.
Much is now known about the consistency of Bayesian updating on infinite-dimensional parameter spaces with independent or Markovian data. Necessary conditions for consistency include the prior putting enough weight on the correct neighborhoods of the data-generating distribution; various sufficient conditions further restrict the prior in ways analogous to capacity control in frequentist nonparametrics. The asymptotics of Bayesian updating with mis-specified models or priors, or non-Markovian data, are far less well explored. Here I establish sufficient conditions for posterior convergence when all hypotheses are wrong, and the data have complex dependencies. The main dynamical assumption is the asymptotic equipartition (Shannon-McMillan-Breiman) property of information theory. This, along with Egorovs Theorem on uniform convergence, lets me build a sieve-like structure for the prior. The main statistical assumption, also a form of capacity control, concerns the compatibility of the prior and the data-generating process, controlling the fluctuations in the log-likelihood when averaged over the sieve-like sets. In addition to posterior convergence, I derive a kind of large deviations principle for the posterior measure, extending in some cases to rates of convergence, and discuss the advantages of predicting using a combination of models known to be wrong. An appendix sketches connections between these results and the replicator dynamics of evolutionary theory.
Many areas of agriculture rely on honey bees to provide pollination services and any decline in honey bee numbers can impact on global food security. In order to understand the dynamics of honey bee colonies we present a discrete time marked renewal process model for the size of a colony. We demonstrate that under mild conditions this attains a stationary distribution that depends on the distribution of the numbers of eggs per batch, the probability an egg hatches and the distributions of the times between batches and bee lifetime. This allows an analytic examination of the effect of changing these quantities. We then extend this model to cyclic annual effects where for example the numbers of eggs per batch and {the probability an egg hatches} may vary over the year.
Modern genomic studies are increasingly focused on discovering more and more interesting genes associated with a health response. Traditional shrinkage priors are primarily designed to detect a handful of signals from tens and thousands of predictors . Under diverse sparsity regimes, the nature of signal detection is associated with a tail behaviour of a prior. A desirable tail behaviour is called tail-adaptive shrinkage property where tail-heaviness of a prior gets adaptively larger (or smaller) as a sparsity level increases (or decreases) to accommodate more (or less) signals. We propose a global-local-tail (GLT) Gaussian mixture distribution to ensure this property and provide accurate inference under diverse sparsity regimes. Incorporating a peaks-over-threshold method in extreme value theory, we develop an automated tail learning algorithm for the GLT prior. We compare the performance of the GLT prior to the Horseshoe in two gene expression datasets and numerical examples. Results suggest that varying tail rule is advantageous over fixed tail rule under diverse sparsity domains.
106 - Veronique Letort 2010
Background and Aims: Prediction of phenotypic traits from new genotypes under untested environmental conditions is crucial to build simulations of breeding strategies to improve target traits. Although the plant response to environmental stresses is characterized by both architectural and functional plasticity, recent attempts to integrate biological knowledge into genetics models have mainly concerned specific physiological processes or crop models without architecture, and thus may prove limited when studying genotype x environment interactions. Consequently, this paper presents a simulation study introducing genetics into a functional-structural growth model, which gives access to more fundamental traits for quantitative trait loci (QTL) detection and thus to promising tools for yield optimization. Methods: The GreenLab model was selected as a reasonable choice to link growth model parameters to QTL. Virtual genes and virtual chromosomes were defined to build a simple genetic model that drove the settings of the species-specific parameters of the model. The QTL Cartographer software was used to study QTL detection of simulated plant traits. A genetic algorithm was implemented to define the ideotype for yield maximization based on the model parameters and the associated allelic combination. Key Results and Conclusions: By keeping the environmental factors constant and using a virtual population with a large number of individuals generated by a Mendelian genetic model, results for an ideal case could be simulated. Virtual QTL detection was compared in the case of phenotypic traits - such as cob weight - and when traits were model parameters, and was found to be more accurate in the latter case. The practical interest of this approach is illustrated by calculating the parameters (and the corresponding genotype) associated with yield optimization of a GreenLab maize model. The paper discusses the potentials of GreenLab to represent environment x genotype interactions, in particular through its main state variable, the ratio of biomass supply over demand.
التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا