
Dynamics of Bayesian Updating with Dependent Data and Misspecified Models

Publication date: 2009
Field: Biology
Language: English





Much is now known about the consistency of Bayesian updating on infinite-dimensional parameter spaces with independent or Markovian data. Necessary conditions for consistency include the prior putting enough weight on the correct neighborhoods of the data-generating distribution; various sufficient conditions further restrict the prior in ways analogous to capacity control in frequentist nonparametrics. The asymptotics of Bayesian updating with misspecified models or priors, or non-Markovian data, are far less well explored. Here I establish sufficient conditions for posterior convergence when all hypotheses are wrong and the data have complex dependencies. The main dynamical assumption is the asymptotic equipartition (Shannon-McMillan-Breiman) property of information theory. This, along with Egorov's Theorem on uniform convergence, lets me build a sieve-like structure for the prior. The main statistical assumption, also a form of capacity control, concerns the compatibility of the prior and the data-generating process, controlling the fluctuations in the log-likelihood when averaged over the sieve-like sets. In addition to posterior convergence, I derive a kind of large deviations principle for the posterior measure, extending in some cases to rates of convergence, and discuss the advantages of predicting using a combination of models known to be wrong. An appendix sketches connections between these results and the replicator dynamics of evolutionary theory.
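In symbols, the two central ingredients can be sketched as follows; the notation here is illustrative and simplified relative to the paper. The asymptotic equipartition property asks that, for each hypothesis theta, the per-observation log-likelihood ratio against the data-generating process P converge almost surely to a divergence rate h(theta); the large-deviations result then controls how fast the posterior pi_n discards a set of hypotheses A in terms of how suboptimal A is:

    \[
      \frac{1}{n}\,\log \frac{dF_\theta^{(n)}}{dP^{(n)}}(x_1^n) \longrightarrow -h(\theta)
      \quad P\text{-a.s.},
      \qquad
      \frac{1}{n}\,\log \pi_n(A) \longrightarrow -\bigl(h(A) - h(\Theta)\bigr),
    \]

where h(A) is the essential infimum of h(theta) over A (with respect to the prior) and h(Theta) is the best divergence rate attainable over the whole hypothesis space. Sets of hypotheses whose best attainable rate exceeds the overall optimum thus lose posterior mass exponentially fast, even though every hypothesis is wrong.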

Related research

Tinghui Yu (2015)
The treatment effects of the same therapy observed in multiple clinical trials can often be very different, yet the patient characteristics accounting for these differences may not be identifiable in real-world practice. There needs to be an unbiased way to combine the results from multiple trials and report the overall treatment effect for the general population during the development and validation of a new therapy. The non-linear structure of the maximum partial likelihood estimates for the (log) hazard ratio defined with a Cox proportional hazards model leads to challenges in the statistical analyses for combining such clinical trials. In this paper, we formulate the expected overall treatment effects under various modeling assumptions, and we propose efficient estimates and a version of the Wald test for the combined hazard ratio using only aggregate data. Interpretation of the methods is provided in the framework of robust data analyses involving misspecified models.
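To make the aggregate-data idea concrete, here is a minimal Python sketch of a standard fixed-effect, inverse-variance combination of per-trial log hazard ratios with a Wald test; this is a generic textbook estimator shown for illustration, not necessarily the estimator the paper proposes.

    import numpy as np
    from scipy.stats import norm

    def combined_hazard_ratio(log_hrs, ses):
        """Pool per-trial log hazard ratios by inverse-variance
        weighting and run a two-sided Wald test of HR = 1."""
        log_hrs = np.asarray(log_hrs, dtype=float)
        w = 1.0 / np.asarray(ses, dtype=float) ** 2    # inverse-variance weights
        pooled = np.sum(w * log_hrs) / np.sum(w)       # pooled log hazard ratio
        se = np.sqrt(1.0 / np.sum(w))                  # its standard error
        z = pooled / se                                # Wald statistic
        return np.exp(pooled), z, 2 * norm.sf(abs(z))  # HR, z, two-sided p-value

    # Example with three hypothetical trials reporting (log HR, SE):
    hr, z, p = combined_hazard_ratio([-0.25, -0.10, -0.30], [0.12, 0.15, 0.20])

The sketch only shows the generic shape such a combination takes; the abstract's point is that pooling Cox-model hazard ratios across heterogeneous trials is statistically delicate, which is what motivates its proposed estimates.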
Veronique Letort (2010)
Background and Aims: Prediction of phenotypic traits from new genotypes under untested environmental conditions is crucial to build simulations of breeding strategies to improve target traits. Although the plant response to environmental stresses is characterized by both architectural and functional plasticity, recent attempts to integrate biological knowledge into genetics models have mainly concerned specific physiological processes or crop models without architecture, and thus may prove limited when studying genotype × environment interactions. Consequently, this paper presents a simulation study introducing genetics into a functional-structural growth model, which gives access to more fundamental traits for quantitative trait loci (QTL) detection and thus to promising tools for yield optimization.

Methods: The GreenLab model was selected as a reasonable choice to link growth model parameters to QTL. Virtual genes and virtual chromosomes were defined to build a simple genetic model that drove the settings of the species-specific parameters of the model. The QTL Cartographer software was used to study QTL detection of simulated plant traits. A genetic algorithm was implemented to define the ideotype for yield maximization based on the model parameters and the associated allelic combination (a toy sketch of this search appears below).

Key Results and Conclusions: By keeping the environmental factors constant and using a virtual population with a large number of individuals generated by a Mendelian genetic model, results for an ideal case could be simulated. Virtual QTL detection was compared in the case of phenotypic traits - such as cob weight - and when traits were model parameters, and was found to be more accurate in the latter case. The practical interest of this approach is illustrated by calculating the parameters (and the corresponding genotype) associated with yield optimization of a GreenLab maize model. The paper discusses the potential of GreenLab to represent environment × genotype interactions, in particular through its main state variable, the ratio of biomass supply over demand.
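As flagged above, here is a toy Python sketch of such a genetic-algorithm search; the gene encoding, fitness surrogate, and mutation/selection settings are hypothetical stand-ins, not GreenLab's actual parameterization.

    import numpy as np

    rng = np.random.default_rng(0)

    def yield_model(params):
        # Hypothetical stand-in for a GreenLab yield simulation:
        # fitness peaks when every parameter equals 0.7.
        return -np.sum((params - 0.7) ** 2)

    def evolve(pop_size=50, n_genes=8, n_gen=200, mut_sd=0.05):
        pop = rng.random((pop_size, n_genes))  # each row: one allelic combination
        for _ in range(n_gen):
            fit = np.array([yield_model(p) for p in pop])
            parents = pop[np.argsort(fit)[-pop_size // 2:]]   # truncation selection
            a = parents[rng.integers(len(parents), size=pop_size)]
            b = parents[rng.integers(len(parents), size=pop_size)]
            cut = rng.integers(1, n_genes, size=pop_size)     # one-point crossover
            pop = np.where(np.arange(n_genes) < cut[:, None], a, b)
            pop += rng.normal(0.0, mut_sd, pop.shape)         # Gaussian mutation
        return pop[np.argmax([yield_model(p) for p in pop])]  # best ideotype found

    best = evolve()  # parameter vector (and implicitly genotype) maximizing the surrogate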
Inference of evolutionary trees and rates from biological sequences is commonly performed using continuous-time Markov models of character change. The Markov process evolves along an unknown tree while observations arise only from the tips of the tree. Rate heterogeneity is present in most real data sets and is accounted for by the use of flexible mixture models where each site is allowed its own rate. Very little has been rigorously established concerning the identifiability of the models currently in common use in data analysis, although non-identifiability was proven for a semi-parametric model and an incorrect proof of identifiability was published for a general parametric model (GTR+Gamma+I). Here we prove that one of the most widely used models (GTR+Gamma) is identifiable for generic parameters, and for all parameter choices in the case of 4-state (DNA) models. This is the first proof of identifiability of a phylogenetic model with a continuous distribution of rates.
In this article, I investigate the use of Bayesian updating rules applied to modeling social agents in the case of continuous opinion models. Given another agent's statement about the continuous value of a variable $x$, we will see that interesting dynamics emerge when an agent assigns a likelihood to that value that is a mixture of a Gaussian and a Uniform distribution. This represents the idea that the other agent might have no idea what he is talking about. The effect of updating only the first moments of the distribution will be studied, and we will see that this generates results similar to those of the Bounded Confidence models. By also updating the second moment, several different opinions always survive in the long run. However, depending on the probability of error and the initial uncertainty, those opinions might be clustered around a central value.
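A plausible reconstruction of one such update step in Python follows; the opinion is kept Gaussian by moment matching, and the mixture weight p, noise variance tau2, and Uniform(0, 1) support are assumptions for illustration, not necessarily the paper's exact choices.

    import math

    def update_opinion(mu, var, xj, p=0.8, tau2=0.1):
        """One Bayesian update of a N(mu, var) opinion about x after
        hearing statement xj, under the mixture likelihood
        p * N(xj; x, tau2) + (1 - p) * Uniform(0, 1)."""
        # Evidence for the informative (Gaussian) branch of the mixture.
        m_g = math.exp(-(xj - mu) ** 2 / (2 * (var + tau2))) \
              / math.sqrt(2 * math.pi * (var + tau2))
        q = p * m_g / (p * m_g + (1 - p) * 1.0)        # prob. xj is informative
        mu_g = (tau2 * mu + var * xj) / (var + tau2)   # conjugate Gaussian update
        var_g = var * tau2 / (var + tau2)
        mu_new = q * mu_g + (1 - q) * mu               # first-moment update
        m2 = q * (var_g + mu_g ** 2) + (1 - q) * (var + mu ** 2)
        return mu_new, m2 - mu_new ** 2                # match first two moments

In this reconstruction, updating only mu_new mostly ignores far-away statements (q is small), which is consistent with the bounded-confidence-like behavior the abstract reports; also updating the variance lets agents become confident and stop moving, consistent with several opinions surviving in the long run.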
We consider inference about the history of a sample of DNA sequences, conditional upon the haplotype counts and the number of segregating sites observed at the present time. After deriving some theoretical results in the coalescent setting, we implement rejection sampling and importance sampling schemes to perform the inference. The importance sampling scheme addresses an extension of the Ewens Sampling Formula for a configuration of haplotypes and the number of segregating sites in the sample. The implementations include both constant and variable population size models. The methods are illustrated by two human Y chromosome data sets.
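The rejection-sampling side of such a scheme has a simple generic shape, sketched below in Python; the coalescent simulator here is a toy stand-in (Watterson-style segregating sites plus arbitrary haplotype frequencies), not the paper's implementation.

    import numpy as np

    rng = np.random.default_rng(1)

    def simulate_summaries(n, theta):
        """Toy stand-in for a coalescent simulator: return
        (haplotype counts, number of segregating sites) for a sample of size n."""
        a_n = sum(1.0 / k for k in range(1, n))   # E[S] = theta * a_n (Watterson)
        s = rng.poisson(theta * a_n)
        counts = np.sort(rng.multinomial(n, rng.dirichlet(np.ones(4))))[::-1]
        return tuple(int(c) for c in counts if c > 0), int(s)

    def rejection_sample(n, obs_counts, obs_s, prior_draws=100_000):
        """Draws from the posterior of theta given the observed summaries."""
        accepted = []
        for _ in range(prior_draws):
            theta = rng.uniform(0.0, 10.0)        # flat prior on the mutation rate
            if simulate_summaries(n, theta) == (obs_counts, obs_s):
                accepted.append(theta)
        return accepted

Exact matching on the summaries makes acceptances rare as samples grow; an importance sampling scheme, like the one the paper builds on an extension of the Ewens Sampling Formula, instead weights proposals rather than discarding them.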