
Model selection for sequential designs in discrete finite systems using Bernstein kernels

Added by Madhurima Nath
Publication date: 2018
Language: English





We view sequential design as a model selection problem to determine which new observation is expected to be the most informative, given the existing set of observations. For estimating a probability distribution on a bounded interval, we use bounds constructed from kernel density estimators along with the estimated density itself to estimate the information gain expected from each observation. We choose Bernstein polynomials for the kernel functions because they provide a complete set of basis functions for polynomials of finite degree and thus have useful convergence properties. We illustrate the method with applications to estimating network reliability polynomials, which give the probability of certain sets of configurations in finite, discrete stochastic systems.
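As a concrete illustration of the density-estimation building block described above, the following is a minimal sketch (not the authors' implementation) of a Bernstein-polynomial density estimator on [0, 1]: the weights are empirical-CDF increments over m bins, and the degree m and the Beta(2, 5) test sample are arbitrary choices for illustration.

```python
import numpy as np
from scipy.stats import binom

def bernstein_density(x, data, m=30):
    """Vitale-style Bernstein polynomial density estimate on [0, 1].

    Weights are empirical-CDF increments over m equal-width bins;
    each weight multiplies a Bernstein basis polynomial of degree m-1.
    """
    data = np.asarray(data)
    n = len(data)
    # empirical CDF increments F_n((k+1)/m) - F_n(k/m), k = 0..m-1
    edges = np.arange(m + 1) / m
    counts, _ = np.histogram(data, bins=edges)
    weights = counts / n                      # nonnegative, sum to 1
    k = np.arange(m)
    x = np.atleast_1d(x)[:, None]
    # Bernstein basis of degree m-1: C(m-1, k) x^k (1-x)^(m-1-k)
    basis = binom.pmf(k, m - 1, x)            # shape (len(x), m)
    return m * (basis * weights).sum(axis=1)

# usage: estimate a density on [0, 1] from Beta(2, 5) samples
rng = np.random.default_rng(0)
sample = rng.beta(2, 5, size=500)
grid = np.linspace(0, 1, 101)
fhat = bernstein_density(grid, sample, m=30)
```

Because each basis polynomial integrates to 1/m and the weights sum to one, the estimate is itself a proper density, which is what makes it usable for the bound-based information-gain estimates mentioned in the abstract.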



Related research

Luc Pronzato, HaiYing Wang (2020)
We consider a design problem where experimental conditions (design points $X_i$) are presented in the form of a sequence of i.i.d. random variables, generated with an unknown probability measure $\mu$, and only a given proportion $\alpha\in(0,1)$ can be selected. The objective is to select good candidates $X_i$ on the fly and maximize a concave function $\Phi$ of the corresponding information matrix. The optimal solution corresponds to the construction of an optimal bounded design measure $\xi_\alpha^*\leq \mu/\alpha$, with the difficulty that $\mu$ is unknown and $\xi_\alpha^*$ must be constructed online. The construction proposed relies on the definition of a threshold $\tau$ on the directional derivative of $\Phi$ at the current information matrix, the value of $\tau$ being fixed by a certain quantile of the distribution of this directional derivative. Combination with recursive quantile estimation yields a nonlinear two-time-scale stochastic approximation method. It can be applied to very long design sequences since only the current information matrix and estimated quantile need to be stored. Convergence to an optimum design is proved. Various illustrative examples are presented.
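To make the selection rule concrete, here is a heavily simplified sketch of threshold-based online selection for the D-optimality criterion $\Phi(M)=\log\det M$: a candidate is kept when its directional-derivative statistic $x^\top M^{-1}x$ exceeds a recursively estimated quantile. The proportion alpha, step size gamma0, and ridge regularization are illustrative assumptions, and this is not the two-time-scale algorithm analyzed in the paper.

```python
import numpy as np

def online_dopt_select(candidates, alpha=0.1, gamma0=1.0):
    """Keep roughly a fraction alpha of streaming candidates, favoring
    points that most increase log det of the information matrix."""
    d = candidates.shape[1]
    M = np.eye(d) * 1e-3            # accumulated information, with a small ridge
    q = 0.0                         # running quantile estimate (threshold tau)
    n_kept = 0
    selected = []
    for i, x in enumerate(candidates, start=1):
        M_bar = M / max(n_kept, 1)               # average information per kept point
        stat = x @ np.linalg.solve(M_bar, x)     # directional-derivative statistic
        if stat > q:
            M += np.outer(x, x)                  # accumulate information
            n_kept += 1
            selected.append(x)
        # recursive quantile estimation: drive P(stat > q) toward alpha
        q += (gamma0 / i) * ((stat > q) - alpha)
    return np.array(selected), M

# usage with synthetic 3-dimensional candidates
rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 3))
sel, M = online_dopt_select(X, alpha=0.1)
```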
We discuss Bayesian model uncertainty analysis and forecasting in sequential dynamic modeling of multivariate time series. The perspective is that of a decision-maker with a specific forecasting objective that guides thinking about relevant models. Based on formal Bayesian decision-theoretic reasoning, we develop a time-adaptive approach to exploring, weighting, combining and selecting models that differ in terms of predictive variables included. The adaptivity allows for changes in the sets of favored models over time, and is guided by the specific forecasting goals. A synthetic example illustrates how decision-guided variable selection differs from traditional Bayesian model uncertainty analysis and standard model averaging. An applied study in one motivating application of long-term macroeconomic forecasting highlights the utility of the new approach in terms of improving predictions as well as its ability to identify and interpret different sets of relevant models over time with respect to specific, defined forecasting goals.
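For intuition about time-adaptive model weights, the sketch below shows a generic discounted Bayesian model averaging update (a standard device, not the decision-guided procedure of the paper); the forgetting factor and the predictive densities are placeholders.

```python
import numpy as np

def update_model_weights(weights, log_pred_densities, forget=0.95):
    """Discount old model probabilities toward uniform, then reweight by
    each model's one-step predictive density for the new observation.

    weights            : current model probabilities (sum to 1)
    log_pred_densities : log p(y_t | model_k) for the new observation
    forget             : factor in (0, 1]; smaller values adapt faster
    """
    log_w = forget * np.log(weights) + np.asarray(log_pred_densities)
    log_w -= log_w.max()                 # numerical stabilization
    w = np.exp(log_w)
    return w / w.sum()

# usage: three candidate models, new observation favors model 2
w = np.array([1 / 3, 1 / 3, 1 / 3])
w = update_model_weights(w, log_pred_densities=[-2.1, -0.7, -1.9])
```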
Modeling correlated or highly stratified multiple-response data has become a common data analysis task due to modern data monitoring facilities and methods. Generalized estimating equations (GEE) are among the most popular statistical methods for analyzing this kind of data. In this paper, we present a sequential estimation procedure for obtaining GEE-based estimates. In addition to conventional random sampling, the proposed method features adaptive subject recruiting and variable selection. Moreover, we equip our method with an adaptive shrinkage property so that it can identify the effective variables during the estimation procedure and build a confidence set with a pre-specified precision for the corresponding parameters. In addition to establishing the statistical properties of the proposed procedure, we assess our method using both simulated data and real data sets.
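As a point of reference, the following sketch fits a plain, non-sequential GEE model for clustered binary responses with statsmodels; the simulated data and the exchangeable working correlation are illustrative assumptions, and the sequential recruiting and shrinkage steps of the paper are not shown.

```python
import numpy as np
import statsmodels.api as sm

# simulate clustered binary responses with a cluster-level random effect
rng = np.random.default_rng(2)
n_clusters, cluster_size = 100, 5
groups = np.repeat(np.arange(n_clusters), cluster_size)
x = rng.normal(size=n_clusters * cluster_size)
cluster_effect = np.repeat(rng.normal(scale=0.5, size=n_clusters), cluster_size)
p = 1 / (1 + np.exp(-(0.5 + 1.0 * x + cluster_effect)))
y = rng.binomial(1, p)

X = sm.add_constant(x)                       # intercept + one covariate
model = sm.GEE(y, X, groups=groups,
               family=sm.families.Binomial(),
               cov_struct=sm.cov_struct.Exchangeable())
result = model.fit()
print(result.summary())
```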
The Schwarz or Bayesian information criterion (BIC) is one of the most widely used tools for model comparison in social science research. The BIC, however, is not suitable for evaluating models with order constraints on the parameters of interest. This paper explores two extensions of the BIC for evaluating order-constrained models: one where a truncated unit information prior is used under the order-constrained model, and another where a truncated local unit information prior is used. The first prior is centered around the maximum likelihood estimate, and the latter prior is centered around a null value. Several analyses show that the order-constrained BIC based on the local unit information prior works better as an Occam's razor for evaluating order-constrained models and results in lower error probabilities. The methodology based on the local unit information prior is implemented in the R package BICpack, which allows researchers to easily apply the method for order-constrained model selection. The usefulness of the methodology is illustrated using data from the European Values Study.
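For readers unfamiliar with the baseline criterion, here is a small sketch of the standard BIC, k ln(n) - 2 ln(L-hat), comparing two Gaussian-mean models on simulated data; the truncated-prior adjustments for order constraints described in the paper are not reproduced here.

```python
import numpy as np

def bic(log_likelihood, n_params, n_obs):
    """Standard Schwarz/Bayesian information criterion; lower is better."""
    return n_params * np.log(n_obs) - 2.0 * log_likelihood

def gauss_loglik(y, mu, sigma2):
    # log-likelihood of i.i.d. Normal(mu, sigma2) observations
    n = len(y)
    return -0.5 * n * np.log(2 * np.pi * sigma2) - ((y - mu) ** 2).sum() / (2 * sigma2)

# toy comparison: mean fixed at 0 (1 free parameter) vs. estimated mean (2)
rng = np.random.default_rng(3)
y = rng.normal(loc=0.3, scale=1.0, size=200)
n = len(y)
sigma2_0 = np.mean(y ** 2)          # MLE of the variance when mu = 0
sigma2_1 = y.var()                  # MLE of the variance when mu is estimated
bic0 = bic(gauss_loglik(y, 0.0, sigma2_0), n_params=1, n_obs=n)
bic1 = bic(gauss_loglik(y, y.mean(), sigma2_1), n_params=2, n_obs=n)
print(f"BIC (mu = 0): {bic0:.1f}   BIC (mu free): {bic1:.1f}")
```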
In this paper, we aim to solve a class of multiple testing problems under the Bayesian sequential decision framework. Our motivating application comes from binary labeling tasks in crowdsourcing, where the requestor must simultaneously decide which worker to choose to provide the label and when to stop collecting labels under a certain budget constraint. We start with the binary hypothesis testing problem of determining the true label of a single object, and provide an optimal solution by casting it in the adaptive sequential probability ratio test (Ada-SPRT) framework. We characterize the structure of the optimal solution, i.e., the optimal adaptive sequential design, which minimizes the Bayes risk through the log-likelihood ratio statistic. We also develop a dynamic programming algorithm that can efficiently approximate the optimal solution. For the multiple testing problem, we further propose to adopt an empirical Bayes approach for estimating class priors and show that our method has an average loss that converges to the minimal Bayes risk under the true model. Experiments on both simulated and real data show the robustness of our method and its superiority in labeling accuracy compared to several other recently proposed approaches.
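To illustrate the sequential probability ratio test underlying this approach, here is a plain, non-adaptive SPRT sketch for deciding a single binary label from noisy worker votes; the common known worker accuracy and the error targets are illustrative assumptions, and the adaptive worker selection and budget handling of Ada-SPRT are not modeled.

```python
import numpy as np

def sprt_label(votes, worker_accuracy=0.7, alpha=0.05, beta=0.05):
    """Decide a binary label (+1 / -1) from a stream of worker votes by
    accumulating the log-likelihood ratio until it crosses Wald's bounds."""
    upper = np.log((1 - beta) / alpha)       # decide +1 when LLR >= upper
    lower = np.log(beta / (1 - alpha))       # decide -1 when LLR <= lower
    # per-vote log-likelihood ratio of H1 (label = +1) vs H0 (label = -1)
    step = np.log(worker_accuracy / (1 - worker_accuracy))
    llr = 0.0
    for t, vote in enumerate(votes, start=1):      # vote in {+1, -1}
        llr += step if vote == +1 else -step
        if llr >= upper:
            return +1, t                           # positive label after t votes
        if llr <= lower:
            return -1, t                           # negative label after t votes
    return (1 if llr >= 0 else -1), len(votes)     # budget exhausted: pick the sign

# usage: ten votes from workers who are right 70% of the time
decision, n_used = sprt_label([+1, -1, +1, +1, +1, -1, +1, +1, -1, +1])
```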