This paper deals with measuring the Bayesian robustness of classes of contaminated priors. Two different classes of priors in the neighborhood of the elicited prior are considered. The first is the well-known $\epsilon$-contaminated class, while the second is the geometric mixing class. The proposed measure of robustness is based on computing the curvature of the Rényi divergence between posterior distributions. Examples based on simulated and real data sets are used to illustrate the results.
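For reference, and not spelled out in the abstract itself, these two neighbourhoods are commonly written as follows, with $\pi_0$ the elicited prior, $Q$ a class of contaminating densities and $0 \le \epsilon \le 1$:
$$ \Gamma_\epsilon = \{\pi : \pi = (1-\epsilon)\,\pi_0 + \epsilon\, q,\ q \in Q\}, \qquad \Gamma_g = \{\pi : \pi \propto \pi_0^{1-\epsilon}\, q^{\epsilon},\ q \in Q\}. $$
The first is the usual arithmetic ($\epsilon$-contaminated) mixture and the second the normalized geometric mixture; the robustness measure above would then track how the Rényi divergence between the base posterior and a contaminated posterior curves as $\epsilon$ moves away from 0, although the abstract does not fix this parameterisation.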
Bayesian nonparametric statistics is an area of considerable research interest. While there has recently been an extensive focus on developing Bayesian nonparametric procedures for model checking, the use of the Dirichlet process, in its simplest form, together with the Kullback-Leibler divergence has remained an open problem. This is mainly attributed to the discreteness of the Dirichlet process and the fact that the Kullback-Leibler divergence between any discrete distribution and any continuous distribution is infinite. The approach proposed in this paper, which combines the Dirichlet process, the Kullback-Leibler divergence and the relative belief ratio, is considered the first concrete solution to this issue. Applying the approach is simple and does not require obtaining a closed form of the relative belief ratio. A Monte Carlo study and real data examples show that the developed approach exhibits excellent performance.
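For context, the relative belief ratio referred to here is the standard posterior-to-prior density ratio: for a quantity of interest $\psi$ with prior density $\pi_\Psi$ and posterior density $\pi_\Psi(\cdot \mid x)$,
$$ RB_\Psi(\psi \mid x) = \frac{\pi_\Psi(\psi \mid x)}{\pi_\Psi(\psi)}, $$
with values greater than 1 indicating evidence in favor of $\psi$ and values less than 1 indicating evidence against. This is consistent with the remark above that the approach can proceed by estimating this ratio rather than deriving it in closed form.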
Under the Bayesian brain hypothesis, behavioural variations can be attributed to different priors over generative model parameters. This provides a formal explanation for why individuals exhibit inconsistent behavioural preferences when confronted with similar choices. For example, greedy preferences are a consequence of confident (or precise) beliefs over certain outcomes. Here, we offer an alternative account of behavioural variability using Rényi divergences and their associated variational bounds. Rényi bounds are analogous to the variational free energy (or evidence lower bound) and can be derived under the same assumptions. Importantly, these bounds provide a formal way to establish behavioural differences through an $\alpha$ parameter, given fixed priors. This rests on changes in $\alpha$ that alter the bound (on a continuous scale), inducing different posterior estimates and consequent variations in behaviour. Thus, it looks as if individuals have different priors and have reached different conclusions. More specifically, optimisation with $\alpha \to 0^{+}$ leads to mass-covering variational estimates and increased variability in choice behaviour, whereas optimisation with $\alpha \to +\infty$ leads to mass-seeking variational posteriors and greedy preferences. We exemplify this formulation through simulations of the multi-armed bandit task. We note that these $\alpha$ parameterisations may be especially relevant, i.e., shape preferences, when the true posterior is not in the same family of distributions as the assumed (simpler) approximate density, which may be the case in many real-world scenarios. The ensuing departure from vanilla variational inference provides a potentially useful explanation for differences in behavioural preferences of biological (or artificial) agents, under the assumption that the brain performs variational Bayesian inference.
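A minimal sketch of the bound in question, following the usual variational Rényi bound (the abstract itself does not fix this notation): for observations $x$, latent states $z$, generative model $p(x, z)$ and approximate posterior $q(z)$,
$$ \mathcal{L}_\alpha(q) = \frac{1}{1-\alpha} \log \mathbb{E}_{q(z)}\!\left[\left(\frac{p(x, z)}{q(z)}\right)^{1-\alpha}\right] = \log p(x) - D_\alpha\big(q(z)\,\|\,p(z \mid x)\big), $$
which recovers the evidence lower bound (the negative variational free energy) in the limit $\alpha \to 1$. Maximising the bound with $\alpha \to 0^{+}$ penalises excluding regions of posterior mass, yielding the mass-covering estimates described above, whereas $\alpha \to +\infty$ penalises placing approximate mass where the posterior is small, yielding mass-seeking, greedy behaviour.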
A common concern with Bayesian methodology in scientific contexts is that inferences can be heavily influenced by subjective biases. As presented here, there are two types of bias for some quantity of interest: bias against and bias in favor. Based upon the principle of evidence, it is shown how to measure and control these biases for both hypothesis assessment and estimation problems. Optimality results are established for the principle of evidence as the basis of the approach to these problems. A close relationship is established between measuring bias in Bayesian inferences and frequentist properties that hold for any proper prior. This leads to a possible resolution to an apparent conflict between these approaches to statistical reasoning. Frequentism is seen as establishing a figure of merit for a statistical study, while Bayesianism plays the key role in determining inferences based upon statistical evidence.
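As an illustration only (one common formalisation in the relative belief framework; the abstract does not commit to this notation): writing $RB(\psi_0 \mid x)$ for the relative belief ratio (posterior to prior density) at a hypothesised value $\psi_0$, and $M(\cdot \mid \psi)$ for the conditional prior predictive distribution of the data, the two biases can be quantified as
$$ \text{bias against } \psi_0 = M\big(RB(\psi_0 \mid x) \le 1 \,\big|\, \psi_0\big), \qquad \text{bias in favor of } \psi_0 = M\big(RB(\psi_0 \mid x) \ge 1 \,\big|\, \psi_*\big), $$
where $\psi_*$ is a value judged meaningfully different from $\psi_0$. Both quantities are available before the data are observed and can be controlled, for example, through the choice of sample size.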
The density power divergence (DPD) and related measures have produced many useful statistical procedures which provide a good balance between model efficiency on one hand, and outlier stability or robustness on the other. The large number of citations received by the original DPD paper (Basu et al., 1998) and its many demonstrated applications indicate the popularity of these divergences and the related methods of inference. The estimators that are derived from this family of divergences are all M-estimators where the defining $\psi$ function is based explicitly on the form of the model density. The success of the minimum divergence estimators based on the density power divergence makes it imperative and meaningful to look for other, similar divergences in the same spirit. The logarithmic density power divergence (Jones et al., 2001), a logarithmic transform of the density power divergence, has also been very successful in producing inference procedures with a high degree of efficiency simultaneously with a high degree of robustness. This further strengthens the motivation to look for statistical divergences that are transforms of the density power divergence, or, alternatively, members of the functional density power divergence class. This note characterizes the functional density power divergence class, and thus identifies the available divergence measures within this construct that may possibly be explored for robust and efficient statistical inference.
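For reference, the density power divergence of Basu et al. (1998) between a data density $g$ and a model density $f$ is
$$ d_\alpha(g, f) = \int \left\{ f^{1+\alpha}(x) - \left(1 + \tfrac{1}{\alpha}\right) f^{\alpha}(x)\, g(x) + \tfrac{1}{\alpha}\, g^{1+\alpha}(x) \right\} dx, \qquad \alpha > 0, $$
with the Kullback-Leibler divergence recovered in the limit $\alpha \to 0$; the logarithmic density power divergence of Jones et al. (2001) is a logarithmic transform of this quantity, as noted above.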
Rényi divergence is related to Rényi entropy much as Kullback-Leibler divergence is related to Shannon's entropy, and it comes up in many settings. It was introduced by Rényi as a measure of information that satisfies almost the same axioms as Kullback-Leibler divergence, and it depends on a parameter called its order. In particular, the Rényi divergence of order 1 equals the Kullback-Leibler divergence. We review and extend the most important properties of Rényi divergence and Kullback-Leibler divergence, including convexity, continuity, limits of $\sigma$-algebras and the relation of the special order 0 to the Gaussian dichotomy and contiguity. We also show how to generalize the Pythagorean inequality to orders different from 1, and we extend the known equivalence between channel capacity and minimax redundancy to continuous channel inputs (for all orders) and present several other minimax results.
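Concretely, in one standard parameterisation (notation not fixed by the abstract), for distributions $P$ and $Q$ with densities $p$ and $q$ relative to a dominating measure $\mu$, the Rényi divergence of order $\alpha \in (0, 1) \cup (1, \infty)$ is
$$ D_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1} \log \int p^{\alpha}\, q^{1-\alpha} \, d\mu, $$
extended to orders 0, 1 and $\infty$ by taking limits; in particular, the limit as $\alpha \to 1$ is $\int p \log(p/q)\, d\mu$, which is the Kullback-Leibler divergence, consistent with the order-1 statement above.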