BIC extensions for order-constrained model selection

374 0 0.0 ( 0 )

Download Cite

Added by Joris Mulder

Publication date 2018

fields Mathematical Statistics

and research's language is English

Authors Joris Mulder - Adrian E. Raftery

Methodology

visit our facebook page

‎Shamra Academia - شمرا أكاديميا‎

Ask ChatGPT about the research

Abstract in Arabic Abstract in English

The Schwarz or Bayesian information criterion (BIC) is one of the most widely used tools for model comparison in social science research. The BIC however is not suitable for evaluating models with order constraints on the parameters of interest. This paper explores two extensions of the BIC for evaluating order constrained models, one where a truncated unit information prior is used under the order-constrained model, and the other where a truncated local unit information prior is used. The first prior is centered around the maximum likelihood estimate and the latter prior is centered around a null value. Several analyses show that the order-constrained BIC based on the local unit information prior works better as an Occams razor for evaluating order-constrained models and results in lower error probabilities. The methodology based on the local unit information prior is implemented in the R package `BICpack which allows researchers to easily apply the method for order-constrained model selection. The usefulness of the methodology is illustrated using data from the European Values Study.

rate research

Model Selection for Mixture Models - Perspectives and Strategies

96 - Gilles Celeux , 2018

Determining the number G of components in a finite mixture distribution is an important and difficult inference issue. This is a most important question, because statistical inference about the resulting model is highly sensitive to the value of G. Selecting an erroneous value of G may produce a poor density estimate. This is also a most difficult question from a theoretical perspective as it relates to unidentifiability issues of the mixture model. This is further a most relevant question from a practical viewpoint since the meaning of the number of components G is strongly related to the modelling purpose of a mixture distribution. We distinguish in this chapter between selecting G as a density estimation problem in Section 2 and selecting G in a model-based clustering framework in Section 3. Both sections discuss frequentist as well as Bayesian approaches. We present here some of the Bayesian solutions to the different interpretations of picking the right number of components in a mixture, before concluding on the ill-posed nature of the question.

Methodology Computation

Bayesian Variable Selection for Single Index Logistic Model

211 - Yinrui Sun , Hangjin Jiang 2020

In the era of big data, variable selection is a key technology for handling high-dimensional problems with a small sample size but a large number of covariables. Different variable selection methods were proposed for different models, such as linear model, logistic model and generalized linear model. However, fewer works focused on variable selection for single index models, especially, for single index logistic model, due to the difficulty arose from the unknown link function and the slow mixing rate of MCMC algorithm for traditional logistic model. In this paper, we proposed a Bayesian variable selection procedure for single index logistic model by taking the advantage of Gaussian process and data augmentation. Numerical results from simulations and real data analysis show the advantage of our method over the state of arts.

Methodology

Bayesian Evidence and Model Selection

308 - Kevin H. Knuth , Michael Habeck , Nabin K. Malakar 2014

In this paper we review the concepts of Bayesian evidence and Bayes factors, also known as log odds ratios, and their application to model selection. The theory is presented along with a discussion of analytic, approximate and numerical techniques. Specific attention is paid to the Laplace approximation, variational Bayes, importance sampling, thermodynamic integration, and nested sampling and its recent variants. Analogies to statistical physics, from which many of these techniques originate, are discussed in order to provide readers with deeper insights that may lead to new techniques. The utility of Bayesian model testing in the domain sciences is demonstrated by presenting four specific practical examples considered within the context of signal processing in the areas of signal detection, sensor characterization, scientific model selection and molecular force characterization.

Methodology Instrumentation and Methods for Astrophysics Applications

On model selection criteria for climate change impact studies

88 - Xiaomeng Cui , Dalia Ghanem , Todd Kuffner 2018

Climate change impact studies inform policymakers on the estimated damages of future climate change on economic, health and other outcomes. In most studies, an annual outcome variable is observed, e.g. agricultural yield, annual mortality or gross domestic product, along with a higher-frequency regressor, e.g. daily temperature. While applied researchers tend to consider multiple models to characterize the relationship between the outcome and the high-frequency regressor, to inform policy a choice between the damage functions implied by the different models has to be made. This paper formalizes the model selection problem in this empirical setting and provides conditions for the consistency of Monte Carlo Cross-validation and generalized information criteria. A simulation study illustrates the theoretical results and points to the relevance of the signal-to-noise ratio for the finite-sample behavior of the model selection criteria. Two empirical applications with starkly different signal-to-noise ratios illustrate the practical implications of the formal analysis on model selection criteria provided in this paper.

Methodology Applications

A generalized EMS algorithm for model selection with incomplete data

327 - Ping-Feng Xu , Lai-Xu Shang , Man-Lai Tang 2021

Recently, a so-called E-MS algorithm was developed for model selection in the presence of missing data. Specifically, it performs the Expectation step (E step) and Model Selection step (MS step) alternately to find the minimum point of the observed generalized information criteria (GIC). In practice, it could be numerically infeasible to perform the MS-step for high dimensional settings. In this paper, we propose a more simple and feasible generalized EMS (GEMS) algorithm which simply requires a decrease in the observed GIC in the MS-step and includes the original EMS algorithm as a special case. We obtain several numerical convergence results of the GEMS algorithm under mild conditions. We apply the proposed GEMS algorithm to Gaussian graphical model selection and variable selection in generalized linear models and compare it with existing competitors via numerical experiments. We illustrate its application with three real data sets.

Methodology